
EC6422, Linear and Nonlinear Optimization

Dr. Sudhish N. George

National Institute of Technology Calicut

January 8, 2018

Contents

1 Basic Theory of Sets and Functions

2 Fundamentals of Optimization

Basic Theory of Sets and Functions

Sets and Functions: Topics

Sets
Sequences
Mappings & Functions
Continuous Functions
Infimum and Supremum of Functions
Maxima and Minima of Functions
Differentiable Functions

Sets

A set A is a collection of objects of any kind, which are called the elements or points of A.
The set of real numbers is denoted by R.
If a is an element of the set A, we write a ∈ A; if a is not an element of A, we write a ∉ A.
E.g., the set of all non-negative real numbers,
A = {x | x ∈ R, x ≥ 0}.
If A and B are two sets and all elements of A are elements of B, we write A ⊂ B, that is, A is contained in B; equivalently, B contains A, written B ⊃ A.
Then A is called a subset of B, and B is called a super-set of A.
If A ⊂ B and B ⊂ A, then A = B.


Basic Operations on Sets

If A and B are two sets, their union is given by,
A ∪ B = {x | x ∈ A or x ∈ B}.
The intersection of A and B is given by,
A ∩ B = {x | x ∈ A and x ∈ B}.
If A ∩ B = ∅, then A and B are said to be disjoint.
The difference of the sets A and B is given by,
A − B = {x | x ∈ A, x ∉ B}.
The set of all elements that do not belong to the set A is called the complement of A and is denoted by Ac.
The product of two sets A and B is defined as the set of ordered pairs,
A × B = {(x, y) | x ∈ A, y ∈ B}.
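
These operations map directly onto Python's built-in set type; a small sketch with arbitrary example sets:

    # Basic set operations illustrated with Python's built-in sets.
    A = {0, 1, 2, 3}
    B = {2, 3, 4}

    print(A | B)   # union A ∪ B        -> {0, 1, 2, 3, 4}
    print(A & B)   # intersection A ∩ B -> {2, 3}
    print(A - B)   # difference A − B   -> {0, 1}

    # Cartesian product A × B as a set of ordered pairs
    product = {(x, y) for x in A for y in B}
    print(len(product))  # 4 * 3 = 12 pairs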


Closed and Open Intervals

[a, b] = {x | a ≤ x ≤ b} ; Closed Interval
(a, b) = {x | a < x < b} ; Open Interval
[a, b) = {x | a ≤ x < b} ; Left-half Closed Interval
(a, b] = {x | a < x ≤ b} ; Right-half Closed Interval

Lower and Upper Bounds

Let S be a non-empty set of real numbers. If there is a number α such that x ≥ α, ∀x ∈ S, then S is said to be bounded below and α is called a lower bound of S.
If there is a number β such that x ≤ β, ∀x ∈ S, then S is said to be bounded above
and β is called an upper bound of S.
If S is bounded above and below, then S is said to be bounded.


Greatest Lower Bound and Least Upper Bound

A lower bound ᾱ is the greatest lower bound or infimum of S (i.e., inf{x | x ∈ S}) if no number greater than ᾱ is a lower bound of S.
An upper bound β̄ is the least upper bound or supremum of S (i.e., sup{x | x ∈ S}) if no number smaller than β̄ is an upper bound of S.


Topological Properties of R^n

A topology on a set Ω is a family of open sets in Ω that is closed under arbitrary unions and finite intersections and contains ∅ and Ω.
Neighbourhoods: Given a point X0 ∈ R^n and a real number ε > 0, the set

Nε(X0) = {X | X ∈ R^n, ‖X − X0‖ < ε}

is called an ε-neighbourhood of X0.
Nε(X0) is often called an open ball Bε(X0) with center X0 and radius ε.


Interior Points and Open Sets

A point X is said to be an interior point of the set S ⊂ R^n if there exists an ε > 0 such that Nε(X) ⊂ S.
The set of all interior points of S is called the interior of S and is denoted by int S.
Obviously, int S ⊂ S.
S is called open if S = int S, that is, if every point of S is an interior point.

Points of Closure and Closed Sets

The closure of a set S in a topological space consists of all points in S together with the limit points of S.
A limit point of a set S in a topological space R^n is a point X (which is in R^n, but not necessarily in S) that can be "approximated" by points of S, in the sense that every neighbourhood of X with respect to the topology on R^n also contains a point of S other than X itself.
Thus, the closure of S is a superset of S, and S is closed if it equals its closure.


Boundary Points

A point X ∈ R^n is a boundary point of S if it is not an interior point of S, that is, if for each ε > 0, Nε(X) contains at least one point in S and one point not in S.
The set of all boundary points of S is called the boundary of S.

Bounded Sets

A set S ⊂ R^n is said to be bounded if there exists a real number α > 0 such that ‖X‖ < α for each X ∈ S.


Sequences

A sequence can be thought of as a list of elements with a particular order.
A sequence in a set S is a function f from the set I of all positive integers into the set S. If f(n) = xn ∈ S for n ∈ I, we denote the sequence by {xn} or by x1, x2, x3, ...; the elements x1, x2, x3, ... need not be distinct.
In sequences, repetitions are allowed.
Examples:
Prime numbers: (2, 3, 5, 7, 11, 13, 17, ...)
Fibonacci numbers: (0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...) etc.
Sequences are useful in a number of mathematical disciplines for studying functions, spaces, and other mathematical structures using the convergence properties of sequences.


Limit Point of Sequences

Let {Xn} be a sequence in R^n. A point X̄ ∈ R^n is said to be a limit point of the sequence if for any given ε > 0, there is a positive integer N such that,

‖Xn − X̄‖ < ε, for some n ≥ N

A limit point is also known as an accumulation point or a cluster point.


Limit of Sequence

Let X1, X2, X3, ... be a sequence of points in R^n. A point X̄ ∈ R^n is said to be the limit of the sequence if for any given ε > 0, there is a positive integer N such that,

‖Xn − X̄‖ < ε, for all n ≥ N

i.e., we say that {Xn} converges to X̄, or X̄ is the limit of {Xn}:

Xn → X̄ as n → ∞, written lim_{n→∞} Xn = X̄

The limit of a convergent sequence is unique.


Cauchy Sequence

A sequence {Xn} in R^n is said to be a Cauchy sequence if for any given ε > 0, there is a positive integer N such that for all m, n ≥ N,

‖Xm − Xn‖ < ε.


Mappings

Let X and Y be two sets. A mapping Γ from X into Y is a correspondence which associates with every element x of X a subset Γ(x) of Y. The set Γ(x) is called the image of x under the mapping Γ.
The set X∗ = {x | x ∈ X, Γ(x) ≠ ∅} is called the domain of Γ.
The set

Y∗ = ∪_{x∈X} Γ(x)

is called the range of Γ.


Single Valued Functions

If the mapping Γ from a set X into a set Y is such that the image set Γ(x) always consists of a single element, then Γ is called a single-valued mapping or single-valued function.

Multi-valued Functions

If the image set Γ(x) may contain more than one element, then Γ is called a multi-valued mapping or multi-valued function.

Continuous Functions

Intuitively, a continuous function is one for which sufficiently small changes in the input result in arbitrarily small changes in the output.
Let S be a non-empty set in R^n. A function f : S → R is said to be continuous at X̄ ∈ S if for any given ε > 0, there exists a δ > 0 such that,

X ∈ S, ‖X − X̄‖ < δ ⇒ |f(X) − f(X̄)| < ε.

Equivalently, for every sequence {Xn} in S with Xn → X̄,

lim_{n→∞} f(Xn) = f(X̄)

A vector-valued function is said to be continuous at X̄ if each of its components is continuous at X̄.


Discontinuous Functions

Many functions, however, have isolated points where they fail to be continuous. These problem points are called discontinuities.


Bounded Functions

Let S ⊂ R^n. A function f : S → R is said to be bounded from below on S if there exists a number α such that,

f(X) ≥ α, ∀X ∈ S

The number α is a lower bound of f on S.
The function f is said to be bounded from above on S if there exists a number β such that,

f(X) ≤ β, ∀X ∈ S

The number β is an upper bound of f on S.
The function f is said to be bounded on S if it is bounded from below and above.


Minima of Functions

Let f be a real single-valued function defined on the set S. If there is a point X0 ∈ S such that f(X0) ≤ f(X), ∀X ∈ S, then f(X0) is called the minimum of f on S and is written as,

f(X0) = min_{X∈S} f(X)

and X0 is called a minimum point of f on S.
Thus, if the minimum of f on S exists, it equals the infimum of f on S; f attains its infimum.

Maxima of Functions

Let f be a real single-valued function defined on the set S. If there is a point X∗ ∈ S such that f(X∗) ≥ f(X), ∀X ∈ S, then f(X∗) is called the maximum of f on S and is written as,

f(X∗) = max_{X∈S} f(X)

and X∗ is called a maximum point of f on S.
Thus, if the maximum of f on S exists, it equals the supremum of f on S; f attains its supremum.

Figure: Global maxima and minima for cos(3πx)/x, 0.1 ≤ x ≤ 1.1


Differentiable Functions

A differentiable function of one real variable is a function whose derivative exists at each point in its domain.
As a result, the graph of a differentiable function must have a (non-vertical) tangent line at each point in its domain, be relatively smooth, and cannot contain any breaks.
More generally, if X0 is a point in the domain of a function f, then f is said to be differentiable at X0 if the derivative f'(X0) exists.
A function f is said to be continuously differentiable if the derivative f'(X) exists at every point and is itself a continuous function.

Matrices and Determinants

Follow Lecture Notes

Matrices & Determinants
Linear Transformation & Rank
Quadratic Forms & Eigenvalue Problems

Please follow the lecture notes.

Fundamentals of Optimization

Topics

Feasibility and Optimality
Convexity
The General Optimization Algorithm
Rates of Convergence
Taylor Series
Newton's Method for Nonlinear Equations

Ref: Igor Griva, Stephen G. Nash, Ariela Sofer, "Linear and Nonlinear Optimization", SIAM, 2nd ed., 2009.


Feasibility

Consider a set of constraints of the form,

gi(x) = 0, i ∈ E     (1)
gi(x) ≥ 0, i ∈ I     (2)

where,
{gi} are given functions that define the constraints in the model,
E is an index set for the equality constraints,
I is an index set for the inequality constraints.
Any set of equations and inequalities can be rearranged in this form.
Examples:
g1(x) = 3x1² + 2x2 − 3x3 + 9 = 0
g2(x) = −sin x1 + cos x2 ≥ 0


Feasibility

A point that satisfies all the constraints is said to be feasible.
The set of all feasible points is termed the feasible region or feasible set. We shall denote it by S.
At a feasible point x̄, an inequality constraint gi(x) ≥ 0 is said to be binding or active if gi(x̄) = 0, and non-binding or inactive if gi(x̄) > 0.
The point x̄ is said to be on the boundary of the constraint in the former case, and in the interior of the constraint in the latter.
All equality constraints are regarded as active at any feasible point.
The active set at a feasible point is defined as the set of all constraints that are active at that point.
The set of feasible points for which at least one inequality is binding is called the boundary of the feasible region.
All other feasible points are interior points.


Feasibility

Example: consider the feasible region defined by the constraints,

g1(x) = x1 + 2x2 + 3x3 − 6 = 0
g2(x) = x1 ≥ 0
g3(x) = x2 ≥ 0
g4(x) = x3 ≥ 0

Feasibility

At the feasible point xa = (0, 0, 2)^T, the first two inequality constraints x1 ≥ 0 and x2 ≥ 0 are active, while the third is inactive.
At the point xb = (3, 0, 1)^T, only the second inequality is active.
At the interior point xc = (1, 1, 1)^T, none of the inequalities is active.
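
A small sketch of how feasibility and the active set could be checked numerically for this example (the function names and the tolerance tol are illustrative choices, not from the lecture):

    import numpy as np

    def constraints(x):
        # Returns (equality values, inequality values) for the example above.
        g_eq = np.array([x[0] + 2*x[1] + 3*x[2] - 6.0])   # g1(x) = 0
        g_ineq = np.array([x[0], x[1], x[2]])             # g2, g3, g4 >= 0
        return g_eq, g_ineq

    def check(x, tol=1e-8):
        # Feasibility test plus the list of active (binding) inequalities.
        g_eq, g_ineq = constraints(x)
        feasible = bool(np.all(np.abs(g_eq) <= tol) and np.all(g_ineq >= -tol))
        active = [i for i, g in enumerate(g_ineq, start=2) if abs(g) <= tol]
        return feasible, active

    print(check(np.array([0.0, 0.0, 2.0])))  # (True, [2, 3]): x1 >= 0, x2 >= 0 active
    print(check(np.array([3.0, 0.0, 1.0])))  # (True, [3]):    only x2 >= 0 active
    print(check(np.array([1.0, 1.0, 1.0])))  # (True, []):     interior point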


Optimality

It may seem surprising that there is any question about what is meant by a "solution" to an optimization problem.
The confusion arises because there are a variety of conditions associated with an optimal point, and each of these conditions gives rise to a slightly different notion of a "solution."
Let us consider the n-dimensional problem,

min_{x∈S} f(x)

There is no fundamental difference between minimization and maximization problems. We can maximize f by solving

min_{x∈S} −f(x)

and then multiplying the optimal objective value by −1.


Optimality

The set S of feasible points is usually defined by a set of constraints.
The most basic definition of a solution is that x∗ minimizes f if,

f(x∗) ≤ f(x), ∀x ∈ S

The point x∗ is referred to as a global minimizer of f in S. If in addition x∗ satisfies,

f(x∗) < f(x), ∀x ∈ S such that x ≠ x∗

then x∗ is a strict global minimizer.


Optimality

If we cannot find the global solution, then at the least we would like to find a point that is better than its surrounding points.
More precisely, we would like to find a local minimizer of f in S, a point satisfying,

f(x∗) ≤ f(x), ∀x ∈ S such that ‖x − x∗‖ < ε

Here ε is some small positive number that may depend on x∗.
The point x∗ is a strict local minimizer if,

f(x∗) < f(x), ∀x ∈ S such that x ≠ x∗ and ‖x − x∗‖ < ε


Convexity

There is one important case where global solutions can be found: the case where the objective function is a convex function and the feasible region is a convex set.
A set S is convex if, for any elements x and y of S,

αx + (1 − α)y ∈ S, ∀ 0 ≤ α ≤ 1

In other words, if x and y are in S, then the line segment connecting x and y is also in S.
More generally, every set defined by a system of linear constraints is a convex set.


Convexity

A function f is convex on a convex set S if it satisfies

f(αx + (1 − α)y) ≤ αf(x) + (1 − α)f(y)

∀ 0 ≤ α ≤ 1 and ∀ x, y ∈ S.
This definition says that the line segment connecting the points (x, f(x)) and (y, f(y)) lies on or above the graph of the function.
Intuitively, the graph of the function is bowl shaped.
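
The defining inequality can be spot-checked numerically along a segment; a minimal sketch (the test function and sampling scheme are arbitrary choices):

    import numpy as np

    def convex_on_segment(f, x, y, num=101):
        # Check f(αx + (1-α)y) <= αf(x) + (1-α)f(y) at sample points of [0, 1].
        for alpha in np.linspace(0.0, 1.0, num):
            lhs = f(alpha * x + (1 - alpha) * y)
            rhs = alpha * f(x) + (1 - alpha) * f(y)
            if lhs > rhs + 1e-12:      # small tolerance for round-off
                return False
        return True

    f = lambda x: float(np.dot(x, x))  # f(x) = ||x||^2, a convex function
    rng = np.random.default_rng(0)
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    print(convex_on_segment(f, x, y))  # True (and True for every segment)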


Convexity

Analogously, a function is concave on S if it satisfies

f(αx + (1 − α)y) ≥ αf(x) + (1 − α)f(y)

∀ 0 ≤ α ≤ 1 and ∀ x, y ∈ S.
We say that a function is strictly convex if,

f(αx + (1 − α)y) < αf(x) + (1 − α)f(y)

∀ 0 < α < 1 and ∀ x, y ∈ S with x ≠ y.
We define a convex optimization problem to be a problem of the form,

min_{x∈S} f(x)

where S is a convex set and f is a convex function on S.
A problem,

min f(x)
subject to gi(x) ≥ 0, i = 1, ..., m,

is a convex optimization problem if f is convex and the functions {gi} are concave.

Convexity

Theorem (Global Solutions of Convex Optimization Problems): Let x∗ be a local minimizer of a convex optimization problem. Then x∗ is also a global minimizer. If the objective function is strictly convex, then x∗ is the unique global minimizer.

The proof is left as self-study.


Convexity

A convex combination is a linear combination whose coefficients are non-negative and sum to one.
Algebraically, the point y is a convex combination of the points {xi}, i = 1, ..., k, if,

y = Σ_{i=1}^{k} αi xi

where,

Σ_{i=1}^{k} αi = 1

and αi ≥ 0, i = 1, ..., k.
There will normally be many ways in which y can be expressed as a convex combination of {xi}.
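
A quick numeric sketch of this non-uniqueness (points and weights chosen arbitrarily):

    import numpy as np

    # Four points in R^2; the same y admits different convex combinations.
    pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    w1 = np.array([0.25, 0.25, 0.25, 0.25])   # average of all four corners
    w2 = np.array([0.0, 0.5, 0.5, 0.0])       # midpoint of two corners

    print(w1 @ pts)  # [0.5 0.5]
    print(w2 @ pts)  # [0.5 0.5] -- the same y, a different representation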


Derivatives and Convexity

A twice-differentiable one-dimensional function f is convex if and only if,

f''(x) ≥ 0, ∀x ∈ S

Examples:
f(x) = x⁴ → convex function
f(x) = sin x → neither convex nor concave
In the multidimensional case, the Hessian matrix of second derivatives must be positive semidefinite; that is, at every point x ∈ S,

y^T ∇²f(x) y ≥ 0, ∀y

In the multidimensional case, if the Hessian matrix ∇²f(x) is positive definite for all x ∈ S, then the function is strictly convex on S.
Alternatively, it would have been possible to show that the eigenvalues of the Hessian matrix are all greater than or equal to zero.
E.g.: f(x1, x2) = 4x1² + 12x1x2 + 9x2²
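
For this quadratic the Hessian is constant, so convexity can be checked once from its eigenvalues; a small sketch:

    import numpy as np

    # Hessian of f(x1, x2) = 4x1^2 + 12x1x2 + 9x2^2:
    # d²f/dx1² = 8, d²f/dx1dx2 = 12, d²f/dx2² = 18.
    H = np.array([[8.0, 12.0],
                  [12.0, 18.0]])

    print(np.linalg.eigvalsh(H))  # [ 0. 26.] -> positive semidefinite
    # All eigenvalues are >= 0, so f is convex, but not strictly convex,
    # since one eigenvalue is exactly zero (f = (2x1 + 3x2)^2 is flat
    # along one direction).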

Derivatives and Convexity

A function f with one continuous derivative is convex over a convex set S if and only if it satisfies,

f(y) ≥ f(x) + ∇f(x)^T (y − x), ∀x, y ∈ S

The proof is left as self-study.
This property states that the function lies on or above any of its tangents.

The General Optimization Algorithm

The general algorithm is simply: start from an initial guess x0 and, for k = 0, 1, 2, ..., test whether xk is optimal; if not, determine a new estimate xk+1 of the solution.
When we refer to computing a "solution" we almost always mean an approximate solution, an element of this sequence that has sufficient accuracy.
This algorithm is so simple that it conveys almost no information at all.
If we are trying to solve the one-dimensional problem without constraints,

min f(x)

then the optimality test will often be based on the condition f'(x) = 0.
If f'(xk) ≠ 0, then xk is not optimal, and the sign and value of f'(xk) indicate whether f is increasing or decreasing at the point xk, as well as how rapidly f is changing.


The General Optimization Algorithm

In this algorithm, pk is a search direction that we hope points in the general direction of the solution, or that "improves" our solution in some sense.
The scalar αk is a step length that determines the new point xk+1 = xk + αk pk.
Once the search direction pk has been computed, the step length αk is found by solving some auxiliary one-dimensional problem.


The General Optimization Algorithm

For an unconstrained problem of the form min f(x), we will typically require that the search direction pk be a descent direction for the function f at the point xk.
This means that for "small" steps taken along pk the function value is guaranteed to decrease: for some ε > 0,

f(xk + αpk) < f(xk), ∀ 0 < α ≤ ε

For a linear function f(x) = c^T x, pk is a descent direction if,

c^T (xk + pk) = c^T xk + c^T pk < c^T xk

or in other words if c^T pk < 0.


The General Optimization Algorithm

With pk available, we would ideally like to determine the step length αk so as to minimize the function in that direction:

min_{α≥0} f(xk + αpk)

The calculation of αk is called a line search because it corresponds to a search along the line xk + αpk defined by α.
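
Putting the pieces together, a minimal sketch of the general algorithm with the steepest-descent direction pk = −∇f(xk) and a backtracking line search (the test function and all parameter values are illustrative assumptions, not from the lecture):

    import numpy as np

    def backtracking(f, x, p, g, alpha=1.0, rho=0.5, c=1e-4):
        # Shrink alpha until a sufficient decrease in f is obtained.
        while f(x + alpha * p) > f(x) + c * alpha * (g @ p):
            alpha *= rho
        return alpha

    def minimize(f, grad, x0, tol=1e-8, max_iter=200):
        x = x0
        for _ in range(max_iter):
            g = grad(x)
            if np.linalg.norm(g) < tol:       # optimality test
                break
            p = -g                            # descent direction: g @ p < 0
            alpha = backtracking(f, x, p, g)  # line search
            x = x + alpha * p                 # next estimate x_{k+1}
        return x

    f = lambda x: 4*x[0]**2 + 2*x[1]**2       # a simple convex quadratic
    grad = lambda x: np.array([8.0*x[0], 4.0*x[1]])
    print(minimize(f, grad, np.array([1.0, 1.0])))  # -> approximately [0. 0.]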


The General Optimization Algorithm

Algorithm II, with its three major steps,

the optimality test,
computation of pk,
computation of αk,

has been the basis for a great many of the most successful optimization algorithms ever developed.
Using the concept of descent directions, we can establish an important condition for optimality for the constrained problem

min_{x∈S} f(x)

We define p to be a feasible descent direction at a point xk ∈ S if, for some ε > 0,

xk + αp ∈ S and f(xk + αp) < f(xk), ∀ 0 < α ≤ ε

If a feasible descent direction exists at a point xk, then it is possible to move a short distance along this direction to a feasible point with a better objective value.

Rates of Convergence

Many of the algorithms do not find a solution in a finite number of steps. Instead, these algorithms compute a sequence of approximate solutions that we hope get closer and closer to a solution.
When discussing such an algorithm, the following two questions are often asked:
Does it converge?
How fast does it converge?
If an algorithm converges in a finite number of steps, the cost of that algorithm is often measured by counting the number of steps required, or by counting the number of arithmetic operations required.
This cost is referred to as the computational complexity of the algorithm.


Rates of Convergence

For many optimization methods, the number of operations or steps required to find an exact solution will be infinite. The rate of convergence describes how quickly the estimates of the solution approach the exact solution.
Let us assume that we have a sequence of points xk converging to a solution x∗. We define the sequence of errors to be,

ek = xk − x∗, so that lim_{k→∞} ek = 0

We say that the sequence {xk} converges to x∗ with rate r and rate constant C if

lim_{k→∞} ‖e_{k+1}‖ / ‖ek‖^r = C

where C < ∞.
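
A brief numeric sketch of what r means (error sequences chosen for illustration): with linear convergence (r = 1) the error shrinks by a constant factor per step, while with quadratic convergence (r = 2) the number of correct digits roughly doubles per step.

    # Linearly convergent errors (r = 1, C = 0.5) versus
    # quadratically convergent errors (r = 2, C = 1).
    e_lin, e_quad = 0.5, 0.5
    for k in range(5):
        print(f"k={k}:  linear {e_lin:.1e}   quadratic {e_quad:.1e}")
        e_lin *= 0.5          # ||e_{k+1}|| = 0.5 ||e_k||
        e_quad = e_quad ** 2  # ||e_{k+1}|| = ||e_k||^2
    # Quadratic errors: 5.0e-01, 2.5e-01, 6.2e-02, 3.9e-03, 1.5e-05.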


Taylor Series

The Taylor series is a tool for approximating a function f near a specified point x0. It has many uses:
It allows you to estimate the value of the function near the given point (when the function is difficult to evaluate directly).
The derivatives and integral of the approximation can be used to estimate the derivatives and integral of the original function.
It is used to derive many algorithms for finding zeroes of functions, for minimizing functions, etc.
Let x0 be a specified point. Then the nth-order Taylor series approximation is,

f(x0 + p) ≈ f(x0) + p f'(x0) + (1/2) p² f''(x0) + ... + (pⁿ/n!) f⁽ⁿ⁾(x0)
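
A quick sketch comparing low-order Taylor approximations of f(x) = e^x about x0 = 0 (an arbitrary illustrative choice):

    import math

    # Taylor approximations of exp(p) about x0 = 0:
    # 1st order: 1 + p;  2nd order: 1 + p + p^2/2.
    for p in (0.1, 0.5, 1.0):
        exact = math.exp(p)
        first = 1 + p
        second = 1 + p + p**2 / 2
        print(f"p={p}:  exact={exact:.4f}  1st={first:.4f}  2nd={second:.4f}")
    # The error grows as p moves away from x0, and the
    # higher-order approximation is the more accurate one.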

Taylor Series

The first two terms of the Taylor series give us the formula for the tangent line for the function f at the point x0:

y = f(x0) + (x − x0) f'(x0)

where p = x − x0.
The first three terms of the Taylor series give a quadratic approximation to the function f at the point x0.


Taylor Series

For n variables,

f(x0 + p) ≈ f(x0) + p^T ∇f(x0) + (1/2) p^T ∇²f(x0) p + ...


Newton’s Method for Nonlinear Equations

Let us now consider methods for solving

f(x) = 0

Given an estimate of the solution xk, the non-linear function f is approximated by the linear function consisting of the first two terms of the Taylor series for the function f at the point xk.
The resulting linear equation is then solved to obtain a new estimate of the solution xk+1.
To derive the formulas for Newton's method, we first write out the Taylor series for the function f at the point xk:

f(xk + p) ≈ f(xk) + p f'(xk)

If f'(xk) ≠ 0, then we can solve the equation

f(x∗) ≈ f(xk) + p f'(xk) = 0

for p to obtain

p = −f(xk) / f'(xk)


Newton’s Method for Nonlinear Equations

The new estimate of the solution is then xk+1 = xk + p, that is, xk+1 = xk − f(xk)/f'(xk). This is the formula for Newton's method.
Newton's method corresponds to approximating the function f by its tangent line at the point xk.
The point where the tangent line crosses the x-axis (i.e., the zero of the tangent line) is taken as the new estimate of the solution.
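
A minimal sketch of the one-dimensional iteration (the test equation x² − 2 = 0 and the stopping tolerance are illustrative choices):

    def newton(f, fprime, x0, tol=1e-12, max_iter=50):
        # Newton's method for a single nonlinear equation f(x) = 0.
        x = x0
        for _ in range(max_iter):
            fx = f(x)
            if abs(fx) < tol:
                break
            x = x - fx / fprime(x)   # x_{k+1} = x_k - f(x_k) / f'(x_k)
        return x

    # Solve x^2 - 2 = 0 starting from x0 = 1; the iterates converge to sqrt(2).
    root = newton(lambda x: x*x - 2.0, lambda x: 2.0*x, 1.0)
    print(root)  # 1.4142135623730951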


Newton’s Method for Nonlinear Equations

Theorem (Convergence of Newton's Method): Assume that the function f(x) has two continuous derivatives. Let x∗ be a zero of f with f'(x∗) ≠ 0. If |x0 − x∗| is sufficiently small, then the sequence defined by

xk+1 = xk − f(xk)/f'(xk)

converges quadratically to x∗ with rate constant

C = |f''(x∗) / (2 f'(x∗))|


Systems of Nonlinear Equations

Much of the discussion in the one-dimensional case can be transferred with only minor changes to the n-dimensional case.
Suppose that we are solving f(x) = 0, which represents the system,

f1(x1, ..., xn) = 0
f2(x1, ..., xn) = 0
...
fn(x1, ..., xn) = 0

The new estimate of the solution is then,

xk+1 = xk + p = xk − ∇f(xk)^{-T} f(xk)

where ∇f(x) is the transpose of the Jacobian of f at the point x, so p solves the linear system ∇f(xk)^T p = −f(xk).
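
A sketch of the n-dimensional iteration, computing the Newton step by solving a linear system rather than forming an inverse (the 2×2 test system is an illustrative assumption):

    import numpy as np

    def newton_system(F, J, x0, tol=1e-12, max_iter=50):
        # Newton's method for a system F(x) = 0 with Jacobian J(x).
        x = x0
        for _ in range(max_iter):
            Fx = F(x)
            if np.linalg.norm(Fx) < tol:
                break
            p = np.linalg.solve(J(x), -Fx)   # solve J(x_k) p = -F(x_k)
            x = x + p
        return x

    # Example: intersect the unit circle x1^2 + x2^2 = 1 with the line x1 = x2.
    F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
    J = lambda x: np.array([[2.0*x[0], 2.0*x[1]], [1.0, -1.0]])

    print(newton_system(F, J, np.array([1.0, 0.5])))  # -> [0.70710678 0.70710678]

Near the solution, this iteration inherits the quadratic convergence described by the theorem above, provided the Jacobian at the solution is nonsingular.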
