
EC6422, Linear and Nonlinear Optimization

Dr. Sudhish N. George

National Institute of Technology Calicut

January 8, 2018

Contents

1 Basic Theory of Sets and Functions

2 Fundamentals of Optimization

Basic Theory of Sets and Functions

Sets and Functions: Topics

Sets
Sequences
Mappings & Functions
Continuous Functions
Infimum and Supremum of Functions
Maxima and Minima of Functions
Differentiable Functions

Sets

A set A is a collection of objects of any kind, which are called the elements or points of A.
The set of real numbers is denoted by R.
If a is an element of the set A, we write a ∈ A; if a is not an element of A, we write a ∉ A.
E.g., the set of all non-negative real numbers,
A = {x | x ∈ R, x ≥ 0}.
If A and B are two sets and all elements of A are elements of B, we write A ⊂ B, that is, A is contained in B; equivalently, B contains A, written B ⊃ A.
Then A is called a subset of B, and B is called a super-set of A.
If A ⊂ B and B ⊂ A, then A = B.


Basic Operations on Sets

If A and B are two sets, their union is given by,
A ∪ B = {x | x ∈ A or x ∈ B}.
The intersection of A and B is given by,
A ∩ B = {x | x ∈ A and x ∈ B}.
If A ∩ B = ∅, then A and B are said to be disjoint.
The difference of the sets A and B is given by,
A − B = {x | x ∈ A, x ∉ B}.
The set of all elements that do not belong to the set A is called the complement of A and is denoted by Ac.
The product of two sets A and B is defined as the set of ordered pairs,
A × B = {(x, y) | x ∈ A, y ∈ B}.
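
These operations map directly onto Python's built-in set type; a small sketch with arbitrary example sets:

    # Basic set operations illustrated with Python's built-in sets.
    A = {0, 1, 2, 3}
    B = {2, 3, 4}

    print(A | B)   # union A ∪ B        -> {0, 1, 2, 3, 4}
    print(A & B)   # intersection A ∩ B -> {2, 3}
    print(A - B)   # difference A − B   -> {0, 1}

    # Cartesian product A × B as a set of ordered pairs
    product = {(x, y) for x in A for y in B}
    print(len(product))  # 4 * 3 = 12 pairs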


Closed and Open Intervals

[a, b] = {x | a ≤ x ≤ b} ; Closed Interval
(a, b) = {x | a < x < b} ; Open Interval
[a, b) = {x | a ≤ x < b} ; Left-half Closed Interval
(a, b] = {x | a < x ≤ b} ; Right-half Closed Interval

Lower and Upper Bounds

Let S be a non-empty set of real numbers. If there is a number α such that x ≥ α, ∀x ∈ S, then S is said to be bounded below and α is called a lower bound of S.
If there is a number β such that x ≤ β, ∀x ∈ S, then S is said to be bounded above
and β is called an upper bound of S.
If S is bounded above and below, then S is said to be bounded.


Greatest Lower Bound and Least Upper Bound

A lower bound ᾱ is the greatest lower bound or infimum of S (i.e., inf{x | x ∈ S}) if no number greater than ᾱ is a lower bound of S.
An upper bound β̄ is the least upper bound or supremum of S (i.e., sup{x | x ∈ S}) if no number smaller than β̄ is an upper bound of S.


Topological Properties of R^n

A topology on a set Ω is a family of open sets in Ω that is closed under arbitrary unions and finite intersections and contains ∅ and Ω.
Neighbourhoods: Given a point X0 ∈ R^n and a real number ε > 0, the set

Nε(X0) = {X | X ∈ R^n, ‖X − X0‖ < ε}

is called an ε-neighbourhood of X0.
Nε(X0) is often called an open ball Bε(X0) with center X0 and radius ε.


Interior Points and Open Sets

A point X is said to be an interior point of the set S ⊂ R^n if there exists an ε > 0 such that Nε(X) ⊂ S.
The set of all interior points of S is called the interior of S and is denoted by int S.
Obviously, int S ⊂ S.
S is called open if S = int S, that is, if every point of S is an interior point.

Points of Closure and Closed Sets

The closure of a set S in a topological space consists of all points in S together with the limit points of S.
A limit point of a set S in a topological space R^n is a point X (which is in R^n, but not necessarily in S) that can be "approximated" by points of S, in the sense that every neighbourhood of X with respect to the topology on R^n also contains a point of S other than X itself.
Thus, the closure of S is a superset of S, and S is closed if it equals its closure.


Boundary Points

A point X ∈ R^n is a boundary point of S if it is not an interior point of S, that is, if for each ε > 0, Nε(X) contains at least one point in S and one point not in S.
The set of all boundary points of S is called the boundary of S.

Bounded Sets

A set S ⊂ R^n is said to be bounded if there exists a real number α > 0 such that ‖X‖ < α for each X ∈ S.


Sequences

A sequence can be thought of as a list of elements with a particular order.
A sequence in a set S is a function f from the set I of all positive integers into the set S. If f(n) = xn ∈ S for n ∈ I, we denote the sequence by {xn} or by x1, x2, x3, ...; the elements x1, x2, x3, ... need not be distinct.
In sequences, repetitions are allowed.
Examples:
Prime numbers: (2, 3, 5, 7, 11, 13, 17, ...)
Fibonacci numbers: (0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...) etc.
Sequences are useful in a number of mathematical disciplines for studying functions, spaces, and other mathematical structures using the convergence properties of sequences.


Limit Point of Sequences

Let {Xn} be a sequence in R^n. A point X̄ ∈ R^n is said to be a limit point of the sequence if for any given ε > 0, there is a positive integer N such that,

‖Xn − X̄‖ < ε, for some n ≥ N

A limit point is also known as an accumulation point or a cluster point.


Limit of Sequence

Let X1, X2, X3, ... be a sequence of points in R^n. A point X̄ ∈ R^n is said to be the limit of the sequence if for any given ε > 0, there is a positive integer N such that,

‖Xn − X̄‖ < ε, for all n ≥ N

i.e., we say that {Xn} converges to X̄, or X̄ is the limit of {Xn}:

Xn → X̄ as n → ∞, written lim_{n→∞} Xn = X̄

The limit of a convergent sequence is unique.


Cauchy Sequence

A sequence {Xn} in R^n is said to be a Cauchy sequence if for any given ε > 0, there is a positive integer N such that for all m, n ≥ N,

‖Xm − Xn‖ < ε.


Mappings

Let X and Y be two sets. A mapping Γ from X into Y is a correspondence which associates with every element x of X a subset Γ(x) of Y. The set Γ(x) is called the image of x under the mapping Γ.
The set X∗ = {x | x ∈ X, Γ(x) ≠ ∅} is called the domain of Γ.
The set

Y∗ = ∪_{x∈X} Γ(x)

is called the range of Γ.


Single Valued Functions

If the mapping Γ from a set X into a set Y is such that the image set Γ(x) always consists of a single element, then Γ is called a single-valued mapping or single-valued function.

Multi-valued Functions

If the image set Γ(x) may contain more than one element, then Γ is called a multi-valued mapping or multi-valued function.

Continuous Functions

Intuitively, a continuous function is one for which sufficiently small changes in the input result in arbitrarily small changes in the output.
Let S be a non-empty set in R^n. A function f : S → R is said to be continuous at X̄ ∈ S if for any given ε > 0, there exists a δ > 0 such that,

X ∈ S, ‖X − X̄‖ < δ ⇒ |f(X) − f(X̄)| < ε.

Equivalently, for every sequence {Xn} in S with Xn → X̄,

lim_{n→∞} f(Xn) = f(X̄)

A vector-valued function is said to be continuous at X̄ if each of its components is continuous at X̄.


Discontinuous Functions

Many functions, however, have isolated points where they fail to be continuous. These problem points are called discontinuities.


Bounded Functions

Let S ⊂ R^n. A function f : S → R is said to be bounded from below on S if there exists a number α such that,

f(X) ≥ α, ∀X ∈ S

The number α is a lower bound of f on S.
The function f is said to be bounded from above on S if there exists a number β such that,

f(X) ≤ β, ∀X ∈ S

The number β is an upper bound of f on S.
The function f is said to be bounded on S if it is bounded from below and above.


Minima of Functions

Let f be a real single-valued function defined on the set S. If there is a point X0 ∈ S such that f(X0) ≤ f(X), ∀X ∈ S, then f(X0) is called the minimum of f on S and is written as,

f(X0) = min_{X∈S} f(X)

and X0 is called a minimum point of f on S.
Thus, if the minimum of f on S exists, it equals the infimum of f on S; f attains its infimum.

Maxima of Functions

Let f be a real single-valued function defined on the set S. If there is a point X∗ ∈ S such that f(X∗) ≥ f(X), ∀X ∈ S, then f(X∗) is called the maximum of f on S and is written as,

f(X∗) = max_{X∈S} f(X)

and X∗ is called a maximum point of f on S.
Thus, if the maximum of f on S exists, it equals the supremum of f on S; f attains its supremum.

Figure: Global maxima and minima for cos(3πx)/x, 0.1 ≤ x ≤ 1.1


Differentiable Functions

A differentiable function of one real variable is a function whose derivative exists at each point in its domain.
As a result, the graph of a differentiable function must have a (non-vertical) tangent line at each point in its domain, be relatively smooth, and cannot contain any breaks.
More generally, if X0 is a point in the domain of a function f, then f is said to be differentiable at X0 if the derivative f'(X0) exists.
A function f is said to be continuously differentiable if the derivative f'(X) exists at every point and is itself a continuous function.

Matrices and Determinants

Follow Lecture Notes

Matrices & Determinants
Linear Transformation & Rank
Quadratic Forms & Eigenvalue Problems

Please follow the lecture notes.

Fundamentals of Optimization

Topics

Feasibility and Optimality
Convexity
The General Optimization Algorithm
Rates of Convergence
Taylor Series
Newton's Method for Nonlinear Equations

Ref: Igor Griva, Stephen G. Nash, Ariela Sofer, "Linear and Nonlinear Optimization", SIAM, 2nd ed., 2009.


Feasibility

Consider a set of constraints of the form,

gi(x) = 0, i ∈ E     (1)
gi(x) ≥ 0, i ∈ I     (2)

where,
{gi} are given functions that define the constraints in the model,
E is an index set for the equality constraints,
I is an index set for the inequality constraints.
Any set of equations and inequalities can be rearranged in this form.
Examples:
g1(x) = 3x1² + 2x2 − 3x3 + 9 = 0
g2(x) = −sin x1 + cos x2 ≥ 0


Feasibility

A point that satisfies all the constraints is said to be feasible.
The set of all feasible points is termed the feasible region or feasible set. We shall denote it by S.
At a feasible point x̄, an inequality constraint gi(x) ≥ 0 is said to be binding or active if gi(x̄) = 0, and non-binding or inactive if gi(x̄) > 0.
The point x̄ is said to be on the boundary of the constraint in the former case, and in the interior of the constraint in the latter.
All equality constraints are regarded as active at any feasible point.
The active set at a feasible point is defined as the set of all constraints that are active at that point.
The set of feasible points for which at least one inequality is binding is called the boundary of the feasible region.
All other feasible points are interior points.


Feasibility

Example: consider the feasible region defined by the constraints,

g1(x) = x1 + 2x2 + 3x3 − 6 = 0
g2(x) = x1 ≥ 0
g3(x) = x2 ≥ 0
g4(x) = x3 ≥ 0

Feasibility

At the feasible point xa = (0, 0, 2)^T, the first two inequality constraints x1 ≥ 0 and x2 ≥ 0 are active, while the third is inactive.
At the point xb = (3, 0, 1)^T, only the second inequality is active.
At the interior point xc = (1, 1, 1)^T, none of the inequalities is active.
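
A small sketch of how feasibility and the active set could be checked numerically for this example (the function names and the tolerance tol are illustrative choices, not from the lecture):

    import numpy as np

    def constraints(x):
        # Returns (equality values, inequality values) for the example above.
        g_eq = np.array([x[0] + 2*x[1] + 3*x[2] - 6.0])   # g1(x) = 0
        g_ineq = np.array([x[0], x[1], x[2]])             # g2, g3, g4 >= 0
        return g_eq, g_ineq

    def check(x, tol=1e-8):
        # Feasibility test plus the list of active (binding) inequalities.
        g_eq, g_ineq = constraints(x)
        feasible = bool(np.all(np.abs(g_eq) <= tol) and np.all(g_ineq >= -tol))
        active = [i for i, g in enumerate(g_ineq, start=2) if abs(g) <= tol]
        return feasible, active

    print(check(np.array([0.0, 0.0, 2.0])))  # (True, [2, 3]): x1 >= 0, x2 >= 0 active
    print(check(np.array([3.0, 0.0, 1.0])))  # (True, [3]):    only x2 >= 0 active
    print(check(np.array([1.0, 1.0, 1.0])))  # (True, []):     interior point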


Optimality

It may seem surprising that there is any question about what is meant by a "solution" to an optimization problem.
The confusion arises because there are a variety of conditions associated with an optimal point, and each of these conditions gives rise to a slightly different notion of a "solution."
Let us consider the n-dimensional problem,

min_{x∈S} f(x)

There is no fundamental difference between minimization and maximization problems. We can maximize f by solving

min_{x∈S} −f(x)

and then multiplying the optimal objective value by −1.


Optimality

The set S of feasible points is usually defined by a set of constraints.
The most basic definition of a solution is that x∗ minimizes f if,

f(x∗) ≤ f(x), ∀x ∈ S

The point x∗ is referred to as a global minimizer of f in S. If in addition x∗ satisfies,

f(x∗) < f(x), ∀x ∈ S such that x ≠ x∗

then x∗ is a strict global minimizer.


Optimality

If we cannot find the global solution, then at the least we would like to find a point that is better than its surrounding points.
More precisely, we would like to find a local minimizer of f in S, a point satisfying,

f(x∗) ≤ f(x), ∀x ∈ S such that ‖x − x∗‖ < ε

Here ε is some small positive number that may depend on x∗.
The point x∗ is a strict local minimizer if,

f(x∗) < f(x), ∀x ∈ S such that x ≠ x∗ and ‖x − x∗‖ < ε


Convexity

There is one important case where global solutions can be found: the case where the objective function is a convex function and the feasible region is a convex set.
A set S is convex if, for any elements x and y of S,

αx + (1 − α)y ∈ S, ∀ 0 ≤ α ≤ 1

In other words, if x and y are in S, then the line segment connecting x and y is also in S.
More generally, every set defined by a system of linear constraints is a convex set.


Convexity

A function f is convex on a convex set S if it satisfies

f(αx + (1 − α)y) ≤ αf(x) + (1 − α)f(y)

∀ 0 ≤ α ≤ 1 and ∀ x, y ∈ S.
This definition says that the line segment connecting the points (x, f(x)) and (y, f(y)) lies on or above the graph of the function.
Intuitively, the graph of the function is bowl shaped.
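
The defining inequality can be spot-checked numerically along a segment; a minimal sketch (the test function and sampling scheme are arbitrary choices):

    import numpy as np

    def convex_on_segment(f, x, y, num=101):
        # Check f(αx + (1-α)y) <= αf(x) + (1-α)f(y) at sample points of [0, 1].
        for alpha in np.linspace(0.0, 1.0, num):
            lhs = f(alpha * x + (1 - alpha) * y)
            rhs = alpha * f(x) + (1 - alpha) * f(y)
            if lhs > rhs + 1e-12:      # small tolerance for round-off
                return False
        return True

    f = lambda x: float(np.dot(x, x))  # f(x) = ||x||^2, a convex function
    rng = np.random.default_rng(0)
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    print(convex_on_segment(f, x, y))  # True (and True for every segment)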


Convexity

Analogously, a function is concave on S if it satisfies

f(αx + (1 − α)y) ≥ αf(x) + (1 − α)f(y)

∀ 0 ≤ α ≤ 1 and ∀ x, y ∈ S.
We say that a function is strictly convex if,

f(αx + (1 − α)y) < αf(x) + (1 − α)f(y)

∀ 0 < α < 1 and ∀ x, y ∈ S with x ≠ y.
We define a convex optimization problem to be a problem of the form,

min_{x∈S} f(x)

where S is a convex set and f is a convex function on S.
A problem,

min f(x)
subject to gi(x) ≥ 0, i = 1, ..., m,

is a convex optimization problem if f is convex and the functions {gi} are concave.

Convexity

Theorem (Global Solutions of Convex Optimization Problems): Let x∗ be a local minimizer of a convex optimization problem. Then x∗ is also a global minimizer. If the objective function is strictly convex, then x∗ is the unique global minimizer.

The proof is left as self-study.


Convexity

A convex combination is a linear combination whose coefficients are non-negative and sum to one.
Algebraically, the point y is a convex combination of the points {xi}, i = 1, ..., k, if,

y = Σ_{i=1}^{k} αi xi

where,

Σ_{i=1}^{k} αi = 1

and αi ≥ 0, i = 1, ..., k.
There will normally be many ways in which y can be expressed as a convex combination of {xi}.
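
A quick numeric sketch of this non-uniqueness (points and weights chosen arbitrarily):

    import numpy as np

    # Four points in R^2; the same y admits different convex combinations.
    pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    w1 = np.array([0.25, 0.25, 0.25, 0.25])   # average of all four corners
    w2 = np.array([0.0, 0.5, 0.5, 0.0])       # midpoint of two corners

    print(w1 @ pts)  # [0.5 0.5]
    print(w2 @ pts)  # [0.5 0.5] -- the same y, a different representation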


Derivatives and Convexity

A twice-differentiable one-dimensional function f is convex if and only if,

f''(x) ≥ 0, ∀x ∈ S

Examples:
f(x) = x⁴ → convex function
f(x) = sin x → neither convex nor concave
In the multidimensional case, the Hessian matrix of second derivatives must be positive semidefinite; that is, at every point x ∈ S,

y^T ∇²f(x) y ≥ 0, ∀y

In the multidimensional case, if the Hessian matrix ∇²f(x) is positive definite for all x ∈ S, then the function is strictly convex on S.
Alternatively, it would have been possible to show that the eigenvalues of the Hessian matrix are all greater than or equal to zero.
E.g.: f(x1, x2) = 4x1² + 12x1x2 + 9x2²
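
For this quadratic the Hessian is constant, so convexity can be checked once from its eigenvalues; a small sketch:

    import numpy as np

    # Hessian of f(x1, x2) = 4x1^2 + 12x1x2 + 9x2^2:
    # d²f/dx1² = 8, d²f/dx1dx2 = 12, d²f/dx2² = 18.
    H = np.array([[8.0, 12.0],
                  [12.0, 18.0]])

    print(np.linalg.eigvalsh(H))  # [ 0. 26.] -> positive semidefinite
    # All eigenvalues are >= 0, so f is convex, but not strictly convex,
    # since one eigenvalue is exactly zero (f = (2x1 + 3x2)^2 is flat
    # along one direction).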

Derivatives and Convexity

A function f with one continuous derivative is convex over a convex set S if and only if it satisfies,

f(y) ≥ f(x) + ∇f(x)^T (y − x), ∀x, y ∈ S

The proof is left as self-study.
This property states that the function lies on or above any of its tangents.

The General Optimization Algorithm

The general algorithm is simply: start from an initial guess x0 and, for k = 0, 1, 2, ..., test whether xk is optimal; if not, determine a new estimate xk+1 of the solution.
When we refer to computing a "solution" we almost always mean an approximate solution, an element of this sequence that has sufficient accuracy.
This algorithm is so simple that it conveys almost no information at all.
If we are trying to solve the one-dimensional problem without constraints,

min f(x)

then the optimality test will often be based on the condition f'(x) = 0.
If f'(xk) ≠ 0, then xk is not optimal, and the sign and value of f'(xk) indicate whether f is increasing or decreasing at the point xk, as well as how rapidly f is changing.


The General Optimization Algorithm

In this algorithm, pk is a search direction that we hope points in the general direction of the solution, or that "improves" our solution in some sense.
The scalar αk is a step length that determines the new point xk+1 = xk + αk pk.
Once the search direction pk has been computed, the step length αk is found by solving some auxiliary one-dimensional problem.


The General Optimization Algorithm

For an unconstrained problem of the form min f(x), we will typically require that the search direction pk be a descent direction for the function f at the point xk.
This means that for "small" steps taken along pk the function value is guaranteed to decrease: for some ε > 0,

f(xk + αpk) < f(xk), ∀ 0 < α ≤ ε

For a linear function f(x) = c^T x, pk is a descent direction if,

c^T (xk + pk) = c^T xk + c^T pk < c^T xk

or in other words if c^T pk < 0.


The General Optimization Algorithm

With pk available, we would ideally like to determine the step length αk so as to minimize the function in that direction:

min_{α≥0} f(xk + αpk)

The calculation of αk is called a line search because it corresponds to a search along the line xk + αpk defined by α.
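
Putting the pieces together, a minimal sketch of the general algorithm with the steepest-descent direction pk = −∇f(xk) and a backtracking line search (the test function and all parameter values are illustrative assumptions, not from the lecture):

    import numpy as np

    def backtracking(f, x, p, g, alpha=1.0, rho=0.5, c=1e-4):
        # Shrink alpha until a sufficient decrease in f is obtained.
        while f(x + alpha * p) > f(x) + c * alpha * (g @ p):
            alpha *= rho
        return alpha

    def minimize(f, grad, x0, tol=1e-8, max_iter=200):
        x = x0
        for _ in range(max_iter):
            g = grad(x)
            if np.linalg.norm(g) < tol:       # optimality test
                break
            p = -g                            # descent direction: g @ p < 0
            alpha = backtracking(f, x, p, g)  # line search
            x = x + alpha * p                 # next estimate x_{k+1}
        return x

    f = lambda x: 4*x[0]**2 + 2*x[1]**2       # a simple convex quadratic
    grad = lambda x: np.array([8.0*x[0], 4.0*x[1]])
    print(minimize(f, grad, np.array([1.0, 1.0])))  # -> approximately [0. 0.]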


The General Optimization Algorithm

Algorithm II, with its three major steps,

the optimality test,
computation of pk,
computation of αk,

has been the basis for a great many of the most successful optimization algorithms ever developed.
Using the concept of descent directions, we can establish an important condition for optimality for the constrained problem

min_{x∈S} f(x)

We define p to be a feasible descent direction at a point xk ∈ S if, for some ε > 0,

xk + αp ∈ S and f(xk + αp) < f(xk), ∀ 0 < α ≤ ε

If a feasible descent direction exists at a point xk, then it is possible to move a short distance along this direction to a feasible point with a better objective value.

Rates of Convergence

Many of the algorithms do not find a solution in a finite number of steps. Instead, these algorithms compute a sequence of approximate solutions that we hope get closer and closer to a solution.
When discussing such an algorithm, the following two questions are often asked:
Does it converge?
How fast does it converge?
If an algorithm converges in a finite number of steps, the cost of that algorithm is often measured by counting the number of steps required, or by counting the number of arithmetic operations required.
This cost is referred to as the computational complexity of the algorithm.


Rates of Convergence

For many optimization methods, the number of operations or steps required to find an exact solution will be infinite. The rate of convergence describes how quickly the estimates of the solution approach the exact solution.
Let us assume that we have a sequence of points xk converging to a solution x∗. We define the sequence of errors to be,

ek = xk − x∗, so that lim_{k→∞} ek = 0

We say that the sequence {xk} converges to x∗ with rate r and rate constant C if

lim_{k→∞} ‖e_{k+1}‖ / ‖ek‖^r = C

where C < ∞.
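
A brief numeric sketch of what r means (error sequences chosen for illustration): with linear convergence (r = 1) the error shrinks by a constant factor per step, while with quadratic convergence (r = 2) the number of correct digits roughly doubles per step.

    # Linearly convergent errors (r = 1, C = 0.5) versus
    # quadratically convergent errors (r = 2, C = 1).
    e_lin, e_quad = 0.5, 0.5
    for k in range(5):
        print(f"k={k}:  linear {e_lin:.1e}   quadratic {e_quad:.1e}")
        e_lin *= 0.5          # ||e_{k+1}|| = 0.5 ||e_k||
        e_quad = e_quad ** 2  # ||e_{k+1}|| = ||e_k||^2
    # Quadratic errors: 5.0e-01, 2.5e-01, 6.2e-02, 3.9e-03, 1.5e-05.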


Taylor Series

The Taylor series is a tool for approximating a function f near a specified point x0. It has many uses:
It allows you to estimate the value of the function near the given point (when the function is difficult to evaluate directly).
The derivatives and integral of the approximation can be used to estimate the derivatives and integral of the original function.
It is used to derive many algorithms for finding zeroes of functions, for minimizing functions, etc.
Let x0 be a specified point. Then the nth-order Taylor series approximation is,

f(x0 + p) ≈ f(x0) + p f'(x0) + (1/2) p² f''(x0) + ... + (pⁿ/n!) f⁽ⁿ⁾(x0)
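
A quick sketch comparing low-order Taylor approximations of f(x) = e^x about x0 = 0 (an arbitrary illustrative choice):

    import math

    # Taylor approximations of exp(p) about x0 = 0:
    # 1st order: 1 + p;  2nd order: 1 + p + p^2/2.
    for p in (0.1, 0.5, 1.0):
        exact = math.exp(p)
        first = 1 + p
        second = 1 + p + p**2 / 2
        print(f"p={p}:  exact={exact:.4f}  1st={first:.4f}  2nd={second:.4f}")
    # The error grows as p moves away from x0, and the
    # higher-order approximation is the more accurate one.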

Taylor Series

The first two terms of the Taylor series give us the formula for the tangent line for the function f at the point x0:

y = f(x0) + (x − x0) f'(x0)

where p = x − x0.
The first three terms of the Taylor series give a quadratic approximation to the function f at the point x0.


Taylor Series

For n variables,

f(x0 + p) ≈ f(x0) + p^T ∇f(x0) + (1/2) p^T ∇²f(x0) p + ...


Newton’s Method for Nonlinear Equations

Let us now consider methods for solving

f(x) = 0

Given an estimate of the solution xk, the non-linear function f is approximated by the linear function consisting of the first two terms of the Taylor series for the function f at the point xk.
The resulting linear equation is then solved to obtain a new estimate of the solution xk+1.
To derive the formulas for Newton's method, we first write out the Taylor series for the function f at the point xk:

f(xk + p) ≈ f(xk) + p f'(xk)

If f'(xk) ≠ 0, then we can solve the equation

f(x∗) ≈ f(xk) + p f'(xk) = 0

for p to obtain

p = −f(xk) / f'(xk)


Newton’s Method for Nonlinear Equations

The new estimate of the solution is then xk+1 = xk + p, that is, xk+1 = xk − f(xk)/f'(xk). This is the formula for Newton's method.
Newton's method corresponds to approximating the function f by its tangent line at the point xk.
The point where the tangent line crosses the x-axis (i.e., the zero of the tangent line) is taken as the new estimate of the solution.
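
A minimal sketch of the one-dimensional iteration (the test equation x² − 2 = 0 and the stopping tolerance are illustrative choices):

    def newton(f, fprime, x0, tol=1e-12, max_iter=50):
        # Newton's method for a single nonlinear equation f(x) = 0.
        x = x0
        for _ in range(max_iter):
            fx = f(x)
            if abs(fx) < tol:
                break
            x = x - fx / fprime(x)   # x_{k+1} = x_k - f(x_k) / f'(x_k)
        return x

    # Solve x^2 - 2 = 0 starting from x0 = 1; the iterates converge to sqrt(2).
    root = newton(lambda x: x*x - 2.0, lambda x: 2.0*x, 1.0)
    print(root)  # 1.4142135623730951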


Newton’s Method for Nonlinear Equations

Theorem (Convergence of Newton's Method): Assume that the function f(x) has two continuous derivatives. Let x∗ be a zero of f with f'(x∗) ≠ 0. If |x0 − x∗| is sufficiently small, then the sequence defined by

xk+1 = xk − f(xk)/f'(xk)

converges quadratically to x∗ with rate constant

C = |f''(x∗) / (2 f'(x∗))|


Systems of Nonlinear Equations

Much of the discussion in the one-dimensional case can be transferred with only minor changes to the n-dimensional case.
Suppose that we are solving f(x) = 0, which represents the system,

f1(x1, ..., xn) = 0
f2(x1, ..., xn) = 0
...
fn(x1, ..., xn) = 0

The new estimate of the solution is then,

xk+1 = xk + p = xk − ∇f(xk)^{-T} f(xk)

where ∇f(x) is the transpose of the Jacobian of f at the point x, so p solves the linear system ∇f(xk)^T p = −f(xk).
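
A sketch of the n-dimensional iteration, computing the Newton step by solving a linear system rather than forming an inverse (the 2×2 test system is an illustrative assumption):

    import numpy as np

    def newton_system(F, J, x0, tol=1e-12, max_iter=50):
        # Newton's method for a system F(x) = 0 with Jacobian J(x).
        x = x0
        for _ in range(max_iter):
            Fx = F(x)
            if np.linalg.norm(Fx) < tol:
                break
            p = np.linalg.solve(J(x), -Fx)   # solve J(x_k) p = -F(x_k)
            x = x + p
        return x

    # Example: intersect the unit circle x1^2 + x2^2 = 1 with the line x1 = x2.
    F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
    J = lambda x: np.array([[2.0*x[0], 2.0*x[1]], [1.0, -1.0]])

    print(newton_system(F, J, np.array([1.0, 0.5])))  # -> [0.70710678 0.70710678]

Near the solution, this iteration inherits the quadratic convergence described by the theorem above, provided the Jacobian at the solution is nonsingular.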
