
Numerical Optimization

Scientific Programming with Matlab WS 2015/16


apl. Prof. Dr. rer. nat. Frank Hoffmann
Univ.-Prof. Dr.-Ing. Prof. h.c. Torsten Bertram
Lehrstuhl für Regelungssystemtechnik

Numerical Optimization
Find the best solution for a given problem: compute or approximate the parameter vector x from the set of alternative solutions X that minimizes (or maximizes) the objective function F(x).

Numerical Optimization
Typical problem types and their cost functions:

- classification: $\min_\theta J(\theta) = \sum_i \ell\big(c_i, \hat c(x_i, \theta)\big)$
- regression: $\min_\theta J(\theta) = \sum_i \big(y_i - \hat y(x_i, \theta)\big)^2$
- system identification: $\min_\theta J(\theta) = \int \big(y(t) - \hat y(t \mid \theta)\big)^2 \, dt$
- optimal control: $\min_{u(t)} J(u(t)) = \int x'(t)\,Q\,x(t) + u'(t)\,R\,u(t) \, dt$ subject to $\dot x = f(x, u)$

Optimization Methods for Problem Types


The choice of method depends on whether the problem is
- nonlinear or linear
- local or global
- and on the order of the derivatives that are available.

Linear problems: the simplex method finds exact solutions.

Nonlinear local methods, by order of the derivatives used:
- 0: line search
- 1: gradient search
- 2: Newton method

Nonlinear global problems: heuristic methods
- evolutionary algorithms
- ant colony optimization
- simulated annealing
- hill climbing

Problem Classes in Optimization

- linear vs. nonlinear optimization
- nonlinear local vs. global optimization
- scalar or multi-objective problems
- unconstrained or constrained optimization
- continuous or integer programming
- known or unknown derivatives

Linear least squares regression


Data set: $\{(y_1, u_1), \dots, (y_p, u_p)\}$

Model linear in the parameters x and in the regressors u:

$y_i = u_{i1} x_1 + u_{i2} x_2 + \dots + u_{iq} x_q$

Model linear in the parameters x and nonlinear in the regressors u:

$y_i = f_1(u_{i1})\, x_1 + f_2(u_{i2})\, x_2 + \dots + f_q(u_{iq})\, x_q$

Polynomial Approximation

$y_i = x_0 + u_i x_1 + u_i^2 x_2 + \dots + u_i^{q-1} x_q$
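As a sketch of how such a regressor matrix is assembled and solved in Matlab (the data and the quadratic model below are purely illustrative, not from the slides):

% Minimal sketch (illustrative data): quadratic model y = x0 + x1*u + x2*u^2
u = linspace(0, 2, 50)';                    % regressor samples
y = 1 + 2*u - 0.5*u.^2 + 0.1*randn(50, 1);  % noisy observations
U = [ones(size(u)), u, u.^2];               % regressor matrix, linear in the parameters
x = U \ y;                                  % least squares estimate of [x0; x1; x2]
plot(u, y, 'ko', u, U*x, 'b-');             % data and fitted polynomial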

Least Squares Solution

System of p linear equations in q unknowns:

$u_{11} x_1 + u_{12} x_2 + \dots + u_{1q} x_q = y_1$
$u_{21} x_1 + u_{22} x_2 + \dots + u_{2q} x_q = y_2$
$\;\;\vdots$
$u_{p1} x_1 + u_{p2} x_2 + \dots + u_{pq} x_q = y_p$

with

$U = \begin{pmatrix} u_{11} & u_{12} & \dots & u_{1q} \\ u_{21} & u_{22} & \dots & u_{2q} \\ \vdots & & & \vdots \\ u_{p1} & u_{p2} & \dots & u_{pq} \end{pmatrix}, \quad x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_q \end{pmatrix} \quad \text{and} \quad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_p \end{pmatrix},$

i.e. $Ux = y$.

Quadratic Cost Function

Linear Least Squares

For p < q the solutions form a (q − p)-dimensional subspace of $\mathbb{R}^q$.
For p = q there is (in general) a unique solution.
For p > q the system is overconstrained and has no exact solution.
In the overconstrained case p > q, find the solution vector x that minimizes the squared equation errors

$E \overset{\text{def}}{=} \sum_{i=1}^{p} \big(u_{i1} x_1 + \dots + u_{iq} x_q - y_i\big)^2 = \lVert Ux - y \rVert^2 .$

Vector representation:

$E \overset{\text{def}}{=} e^{\mathsf T} e, \qquad e = Ux - y .$

Least Squares Solution


Overconstrained system of p>q linear equations in q unknowns x

$u_{11} x_1 + u_{12} x_2 + \dots + u_{1q} x_q = y_1$
$u_{21} x_1 + u_{22} x_2 + \dots + u_{2q} x_q = y_2$
$\;\;\vdots$
$u_{p1} x_1 + u_{p2} x_2 + \dots + u_{pq} x_q = y_p$

$Ux = y$

Least squares solution:

$x^* = U^{+} y = \arg\min_x \lVert Ux - y \rVert^2$

Pseudoinverse:

$U^{+} \overset{\text{def}}{=} (U^{\mathsf T} U)^{-1}\, U^{\mathsf T}$

There is no need to compute $U^{+}$ explicitly; in practice the singular value decomposition is used instead.
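A minimal Matlab sketch of this on random stand-in data: the backslash operator, pinv, and an explicit thin SVD all return the same least squares solution.

% Minimal sketch (random stand-in data): three ways to solve U*x = y for p > q
p = 100; q = 4;                 % overconstrained: more equations than unknowns
U = randn(p, q);  y = randn(p, 1);
x_bs   = U \ y;                 % QR-based solve, the usual choice in practice
x_pinv = pinv(U) * y;           % explicit pseudoinverse (computed via the SVD)
[Us, S, Vs] = svd(U, 'econ');   % thin SVD: U = Us*S*Vs'
x_svd  = Vs * (S \ (Us' * y));  % x = Vs * S^-1 * Us' * y
norm(x_bs - x_pinv), norm(x_bs - x_svd)   % all three agree up to round-off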

Regression or Curve Fitting


Curve fitting is the process of constructing a curve, or
mathematical function, that has the best fit to a series of data points

http://en.wikipedia.org/wiki/Curve_fitting#mediaviewer/File:Regression_pic_assymetrique.gif

Regression
Regression analysis is a statistical method of data analysis.
Objective: describe the relationship between a dependent variable y and one or several independent variables x,

$y = f(x) + e, \qquad y = f(x_1, \dots, x_n) + e,$

where e denotes the error or residual of the model f(x).
- quantitative description of the relationship
- prediction of values of the dependent variable y on the basis of known values of x
- analysis of the significance of the relationship


Example Linear Regression

http://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Linear_regression.svg/1000px-Linear_regression.svg.png


Regression

$y = w_2 x^2 + w_1 x + w_0$

$y = w_1 x + w_0$

http://de.wikipedia.org/wiki/Ausgleichungsrechnung#mediaviewer/File:Liniendiagramm_Ausgleich.svg

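As an illustrative sketch of the two regression models in the figure, using polyfit on synthetic data (the data and coefficients are assumptions, not the figure's values):

% Minimal sketch (synthetic data): compare a linear and a quadratic fit
x = linspace(-1, 3, 40)';
y = 2*x.^2 + x + 1 + 0.5*randn(size(x));   % data generated by a parabola
p1 = polyfit(x, y, 1);                     % coefficients of y = w1*x + w0
p2 = polyfit(x, y, 2);                     % coefficients of y = w2*x^2 + w1*x + w0
xi = linspace(-1, 3, 200)';
plot(x, y, 'ko', xi, polyval(p1, xi), 'r--', xi, polyval(p2, xi), 'b-');
legend('data', 'linear fit', 'quadratic fit');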

Nonlinear Optimization of F(x)


Goal: minimize scalar function F(x) over parameter vector x

$x^* = \arg\min_x F(x)$


Nonlinear Local Optimization Methods


Derivative-free methods
- line search
- secant method
- downhill simplex method

Methods based on the first derivative
- gradient descent and conjugate gradients
- quasi-Newton methods (BFGS, Gauss-Newton, Levenberg-Marquardt)

Methods based on the second derivative
- Newton method, Newton-Raphson method
- folded spectrum method

Simplex Search (Nelder-Mead)

- Simplex: a polytope of N + 1 vertices in N dimensions. Examples of simplices are a line segment on a line, a triangle in a plane, a tetrahedron in three-dimensional space, and so forth.
- Generate a new test position by extrapolating the behavior of the objective function measured at the test points arranged as a simplex.
- Replace the worst test point with the new one, typically the worst point reflected through the centroid of the remaining N points.

In Matlab this strategy is implemented by fminsearch (see the sketch below).
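A minimal sketch of the simplex search via fminsearch, applied to the Rosenbrock function that reappears later in these slides; the option values are illustrative:

% Minimal sketch: Nelder-Mead simplex search, no gradients are used
fun  = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;    % Rosenbrock function
x0   = [-1, 2];                                      % initial simplex is built around x0
opts = optimset('Display', 'final', 'MaxFunEvals', 2000, 'MaxIter', 2000);
[x, fval] = fminsearch(fun, x0, opts)                % x approaches [1, 1]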

Nelder Mead Algorithm

http://capsis.cirad.fr/capsis/_media/documentation/neldermeadsteps.gif

Nelder Mead Algorithm

http://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method#mediaviewer/File:Nelder_Mead1.gif

Nelder Mead Algorithm

http://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method#mediaviewer/File:Nelder_Mead2.gif

Nonlinear Optimization
Necessary and sufficient conditions for a minimum.
Taylor approximation:

$F(x + \Delta x) \approx F(x) + \Delta x^{\mathsf T} g(x) + \tfrac{1}{2}\, \Delta x^{\mathsf T} G(x)\, \Delta x + \dots$

Necessary condition: $g(x^*) = 0$

Sufficient condition: at $x^*$ the expansion reduces to

$F(x^* + \Delta x) \approx F(x^*) + \tfrac{1}{2}\, \Delta x^{\mathsf T} G(x^*)\, \Delta x + \dots$

so $\Delta x^{\mathsf T} G(x^*)\, \Delta x > 0$ for all $\Delta x \neq 0$, i.e. $G(x^*) > 0$ (positive definite).
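A quick numerical check of both conditions for the quadratic F(x) = x1^2 + x1*x2 + x2^2 (the same function used in the line search example later) at the candidate x* = (0, 0):

% Minimal sketch: check the optimality conditions at x* = [0; 0]
g = @(x) [2*x(1) + x(2); x(1) + 2*x(2)];   % gradient g(x)
G = [2 1; 1 2];                            % constant Hessian G
xstar = [0; 0];
disp(g(xstar))    % necessary condition: gradient vanishes at x*
disp(eig(G))      % sufficient condition: all eigenvalues positive, G > 0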

Nonlinear Optimization
Iterative algorithm
- initial parameter $x_0$, current iterate $x_k$
- search direction $p_k$
- determine $x_{k+1} = x_k + \alpha_k p_k$

Open issues
- How to determine $p_k$?
- How to determine the step size $\alpha_k$?
- How to choose the initial parameter $x_0$ (the final solution depends on it)?

Search direction
- Taylor expansion of F(x) at the current solution $x_k$:

$F_{k+1} = F(x_k + \alpha_k p_k) \approx F(x_k) + \frac{\partial F}{\partial x}^{\mathsf T} (x_{k+1} - x_k) = F_k + g_k^{\mathsf T} (\alpha_k p_k)$

- A descent direction requires $g_k^{\mathsf T} p_k < 0$; the choice $p_k = -g_k$ gives gradient descent.
- First-order gradient descent: $x_{k+1} = x_k - \alpha_k g_k$

Gradient Descent
F(x_k) decreases fastest if one moves from $x_k$ in the direction of the negative gradient $-\nabla F(x_k)$ of F at $x_k$.
If the step size $\gamma$ is small enough, then

$x_{k+1} = x_k - \gamma \nabla F(x_k)$

implies $F(x_{k+1}) \le F(x_k)$.
Start with a guess $x_0$ for a local minimum of F(x) and consider the sequence $x_0, x_1, x_2, \dots$ with

$x_{k+1} = x_k - \gamma_k \nabla F(x_k).$

Hopefully the sequence converges to the desired local minimum. The value of the step size $\gamma_k$ is allowed to change at every iteration.
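A minimal sketch of this iteration with a constant step size, again on F(x) = x1^2 + x1*x2 + x2^2; the step size and tolerance are illustrative choices:

% Minimal sketch: gradient descent with constant step size gamma
g     = @(x) [2*x(1) + x(2); x(1) + 2*x(2)];   % gradient of F
x     = [1; 1];                                 % initial guess x0
gamma = 0.1;                                    % constant step size
for k = 1:1000
    x = x - gamma * g(x);                       % x_{k+1} = x_k - gamma * grad F(x_k)
    if norm(g(x)) < 1e-8, break; end            % stop when the gradient (nearly) vanishes
end
x, k                                            % converges towards the minimum [0; 0]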

Gradient Descent
Gradient descent with constant step size:

$x_{k+1} = x_k - \gamma \nabla F(x_k)$

http://en.wikipedia.org/wiki/Gradient_descent#mediaviewer/File:Gradient_descent.svg

Gradient Descent
Gradient descent with constant step size on the Rosenbrock function

$f(x_1, x_2) = (1 - x_1)^2 + 100\,(x_2 - x_1^2)^2$

$x_{k+1} = x_k - \gamma \nabla F(x_k)$

http://en.wikipedia.org/wiki/Gradient_descent#mediaviewer/File:Banana-SteepDesc.gif

Gradient Descent
Gradient descent with constant step size on

$f(x_1, x_2) = \sin\!\big(\tfrac{1}{2} x_1^2 - \tfrac{1}{4} x_2^2 + 3\big)\, \cos\!\big(2 x_1 + 1 - e^{x_2}\big)$

$x_{k+1} = x_k - \gamma \nabla F(x_k)$

http://en.wikipedia.org/wiki/Gradient_descent#mediaviewer/File:Gradient_ascent_%28contour%29.png

Nonlinear Optimization
Line search: how is the step size $\alpha_k$ determined?

$x_{k+1} = x_k + \alpha_k p_k$

Select $\alpha_k$ to minimize $F_{k+1} = F(x_k + \alpha_k p_k)$.

Example with $F(x_1, x_2) = x_1^2 + x_1 x_2 + x_2^2$:

$x_0 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad p_0 = \begin{pmatrix} 0 \\ 2 \end{pmatrix}, \quad x_1 = x_0 + \alpha\, p_0 = \begin{pmatrix} 1 \\ 1 + 2\alpha \end{pmatrix}$

$F(\alpha) = 1 + (1 + 2\alpha) + (1 + 2\alpha)^2$

$\frac{dF}{d\alpha} = 2 + 2\,(1 + 2\alpha) \cdot 2 = 0$

$\alpha^* = -\tfrac{3}{4}, \qquad x_1 = \begin{pmatrix} 1 \\ -\tfrac{1}{2} \end{pmatrix}$
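The exact step computed above can be checked numerically by minimizing along the search line; fminbnd and the bracket [-2, 2] are illustrative choices.

% Minimal sketch: verify alpha* = -3/4 for the line search example
F   = @(x) x(1)^2 + x(1)*x(2) + x(2)^2;
x0  = [1; 1];  p0 = [0; 2];
phi = @(a) F(x0 + a*p0);          % objective restricted to the search line
astar = fminbnd(phi, -2, 2)       % approximately -0.75 = -3/4
x1 = x0 + astar*p0                % approximately [1; -0.5]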

Line Search
- Search along a line until the local minimum is bracketed by search points.
- Tighten the bracket by
  - the golden section rule (sketched below)
  - interval halving
  - polynomial approximation
- Polynomial approximation:
  - approximate f(x) by a quadratic or cubic function
  - take its minimum as the next point
  - might diverge
  - efficient close to the minimum
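A minimal sketch of the golden section rule, applied to the line search function F(alpha) from the previous example; the initial bracket [-2, 2] is assumed to contain the minimum.

% Minimal sketch: golden section search on a bracket [a, b]
f = @(a) 1 + (1 + 2*a) + (1 + 2*a)^2;    % line search function from the example
a = -2;  b = 2;                          % initial bracket
r = (sqrt(5) - 1) / 2;                   % golden ratio factor, about 0.618
c = b - r*(b - a);  d = a + r*(b - a);
while (b - a) > 1e-6
    if f(c) < f(d)
        b = d;  d = c;  c = b - r*(b - a);   % minimum lies in [a, d]
    else
        a = c;  c = d;  d = a + r*(b - a);   % minimum lies in [c, b]
    end
end
alpha = (a + b) / 2                      % approximately -0.75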

Bisection Method
- Identification of zeros of a function.
- For optimization: find the zeros of the first derivative.
- Bisecting the interval yields the next candidate solution (see the sketch below).
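A minimal sketch of bisection applied to the first derivative, using the derivative of the line search example; it assumes the derivative changes sign on the initial interval.

% Minimal sketch: bisection on F'(alpha) to locate the stationary point
dF = @(a) 2 + 4*(1 + 2*a);      % derivative of F(alpha) = 1 + (1+2a) + (1+2a)^2
a = -2;  b = 2;                 % dF(a) < 0, dF(b) > 0
while (b - a) > 1e-8
    m = (a + b) / 2;            % midpoint is the next candidate solution
    if dF(a) * dF(m) <= 0
        b = m;                  % zero of F' lies in [a, m]
    else
        a = m;                  % zero of F' lies in [m, b]
    end
end
alpha = (a + b) / 2             % approximately -0.75, the exact line search step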

Secant Method (Line Search)

Second Order Methods


Notation:

$x_k = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \qquad g_k = \frac{\partial F}{\partial x} = \begin{pmatrix} \dfrac{\partial F}{\partial x_1} \\ \vdots \\ \dfrac{\partial F}{\partial x_n} \end{pmatrix}$

Faster convergence
- Assumption: F is quadratic; Taylor expansion of the gradient $g_{k+1}$ at the point $x_{k+1}$:

$g_{k+1} = g(x_k + p_k) = g_k + G_k (x_{k+1} - x_k) = g_k + G_k p_k$

- For $x_{k+1}$ to become a minimum, require $g_{k+1} = 0$, hence $p_k = -G_k^{-1} g_k$ and

$x_{k+1} = x_k + p_k = x_k - G_k^{-1} g(x_k)$

Check the numerical condition of the Hessian

$G_k = \begin{pmatrix} \dfrac{\partial^2 F}{\partial x_1^2} & \cdots & \dfrac{\partial^2 F}{\partial x_1 \partial x_n} \\ \vdots & & \vdots \\ \dfrac{\partial^2 F}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 F}{\partial x_n^2} \end{pmatrix}$
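A minimal sketch of the resulting Newton iteration on the Rosenbrock function with analytic gradient and Hessian; the starting point and tolerance are illustrative.

% Minimal sketch: Newton's method on f(x) = 100*(x2 - x1^2)^2 + (1 - x1)^2
g = @(x) [-400*(x(2) - x(1)^2)*x(1) - 2*(1 - x(1)); 200*(x(2) - x(1)^2)];  % gradient
G = @(x) [1200*x(1)^2 - 400*x(2) + 2, -400*x(1); -400*x(1), 200];          % Hessian
x = [-1; 2];                           % initial guess
for k = 1:50
    p = -(G(x) \ g(x));                % Newton step p_k = -G_k^{-1} g_k
    x = x + p;
    if norm(g(x)) < 1e-10, break; end
end
x, k                                   % typically reaches [1; 1] after a few iterations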

Gradient Descent vs. Newton Method

- Gradient descent blindly follows the direction of steepest descent.
- The Newton method takes the curvature into account via a local second-order approximation of F(x) (the Hessian).

Quasi-Newton Methods (DFP, BFGS)

- Indirect estimation of the Hessian.
- Levenberg-Marquardt: a combination of the Newton method and gradient descent, depending on the numerical condition of the Hessian (see the damped-step sketch below).
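A sketch of the core Levenberg-Marquardt idea (not the toolbox implementation): damp the Hessian with lambda*I so the step interpolates between a Newton step and a short gradient step. The numbers reuse the first Newton iterate of the sketch above.

% Minimal sketch of a damped (Levenberg-Marquardt style) step
g = [396; 200];                    % gradient at the current iterate (example values)
G = [402 400; 400 200];            % Hessian at the current iterate (indefinite here)
lambda = 1e3;                      % damping parameter, adapted during the iteration
p = -((G + lambda*eye(2)) \ g);    % small lambda -> Newton step, large lambda -> -g/lambda

In practice lambda is decreased after successful steps and increased after failed ones.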

Nonlinear Optimization in Matlab


lsqlin: least squares method for (constrained) linear problems

$\min_x \lVert Cx - d \rVert^2 \quad \text{s.t.} \quad A x \le b, \;\; A_{eq} x = b_{eq}, \;\; x_{\min} \le x \le x_{\max}$

quadprog: quadratic programming for (constrained) quadratic programs

$\min_x \tfrac{1}{2}\, x' H x + f' x \quad \text{s.t.} \quad A x \le b, \;\; A_{eq} x = b_{eq}, \;\; x_{\min} \le x \le x_{\max}$

lsqnonlin: least squares method for nonlinear problems

$\min_x \sum_i f_i(x)^2$

lsqcurvefit: least squares method for regression problems (xdata, ydata)

$\min_x \sum_i \big(f(x, xdata_i) - ydata_i\big)^2$

Nonlinear Optimization in Matlab


fminunc: unconstrained nonlinear optimization

$\min_x f(x)$

fminsearch: simplex method (Nelder-Mead), no gradient information

$\min_x f(x)$

fmincon: constrained nonlinear optimization

$\min_x f(x) \quad \text{s.t.} \quad c(x) \le 0, \;\; c_{eq}(x) = 0, \;\; A x \le b, \;\; A_{eq} x = b_{eq}, \;\; x_{\min} \le x \le x_{\max}$

optimoptions: selection of the optimization method and its parameters

optimtool: graphical user interface

OPTIMTOOL

LSQLIN
>> C = [0.9501 0.7620 0.6153 0.4057
0.2311 0.4564 0.7919 0.9354
0.6068 0.0185 0.9218 0.9169
0.4859 0.8214 0.7382 0.4102
0.8912 0.4447 0.1762 0.8936];
>> d = [0.0578; 0.3528; 0.8131; 0.0098; 0.1388];
>> A = [0.2027 0.2721 0.7467 0.4659
0.1987 0.1988 0.4450 0.4186
0.6037 0.0152 0.9318 0.8462];
>> b = [0.5251; 0.2026; 0.6721];
>> Aeq = [3 5 7 9];
>> beq = 4;
>> lb = -0.1*ones(4,1);
>> ub = 2*ones(4,1);
>> x = lsqlin(C,d,A,b,Aeq,beq,lb,ub)
x = -0.1000 -0.1000 0.1599 0.4090

$\min_x \lVert Cx - d \rVert^2 \quad \text{s.t.} \quad A x \le b, \;\; A_{eq} x = b_{eq}, \;\; lb \le x \le ub$

QUADPROG
>> H = [1 -1; -1 2];
>> f = [-2; -6];
>> A = [1 1; -1 2; 2 1];
>> b = [2; 2; 3];
>> lb = zeros(2,1);
>> options = optimoptions('quadprog',...
'Algorithm','interior-point-convex','Display','off');

$\min_x \tfrac{1}{2}\, x' H x + f' x \quad \text{s.t.} \quad A x \le b, \;\; A_{eq} x = b_{eq}, \;\; lb \le x \le ub$

>> [x,fval,exitflag,output,lambda] = quadprog(H,f,A,b,[],[],lb,[],[],options);


>> x,fval,exitflag
x = 0.6667 1.3333
fval = -8.2222
exitflag = 1

LSQNONLIN
>> d = linspace(0,3);
>> y = exp(-1.3*d) + 0.05*randn(size(d));
>> fun = @(r)exp(-d*r)-y;
>> x0 = 4;
>> x = lsqnonlin(fun,x0)
Local minimum possible.
lsqnonlin stopped because the final
change in the sum of squares relative to
its initial value is less than the default
value of the function tolerance.
x = 1.2645
>> plot(d,y,'ko',d,exp(-x*d),'b-');

$\min_x \sum_i f_i(x)^2$

LSQCURVEFIT
>> xdata = [0.9 1.5 13.8 19.8 24.1 28.2 35.2 60.3 74.6 81.3];
>> ydata = [455.2 428.6 124.1 67.3 43.2 28.1 13.1 -0.4 -1.3 -1.5];
>> fun = @(x,xdata)x(1)*exp(x(2)*xdata);
>> x0 = [100,-1];
>> x = lsqcurvefit(fun,x0,xdata,ydata)
Local minimum possible.
lsqcurvefit stopped ...
x = 498.8309 -0.1013
>> times = linspace(xdata(1),xdata(end));
>> plot(xdata,ydata,'ko',times,fun(x,times),'b-')

$\min_x \sum_i \big(f(x, xdata_i) - ydata_i\big)^2$

FMINUNC
$\min_x f(x), \qquad f(x_1, x_2) = x_1\, e^{-(x_1^2 + x_2^2)} + \frac{x_1^2 + x_2^2}{20}$

>> fun = @(x)x(1)*exp(-(x(1)^2 + x(2)^2)) + (x(1)^2 + x(2)^2)/20;
>> x0 = [1,2];
>> [x,fval] = fminunc(fun,x0)
x = -0.6691 0.0000
fval = -0.4052
>> options = optimoptions(@fminunc,'Display','iter','Algorithm','quasi-newton');
>> [x,fval,exitflag,output] = fminunc(fun,x0,options)

Iteration  Func-count     f(x)       Step-size   First-order optimality
    0           3        0.256738                      0.173
    1           6        0.222149     1                0.131
    2           9        0.15717      1                0.158
    3          18       -0.227902     0.438133         0.386
    4          21       -0.299271     1                0.46
    5          30       -0.404028     0.102071         0.0458
    6          33       -0.404868     1                0.0296
    7          36       -0.405236     1                0.00119
    8          39       -0.405237     1                0.000252
    9          42       -0.405237     1                7.97e-07

FMINUNC
function [f,g] = rosenbrockwithgrad(x)
% Calculate objective f
f = 100*(x(2) - x(1)^2)^2 + (1-x(1))^2;
if nargout > 1 % gradient required
g = [-400*(x(2)-x(1)^2)*x(1)-2*(1-x(1));
200*(x(2)-x(1)^2)];
end
>> options = optimoptions('fminunc','Algorithm','trust-region','GradObj','on');
>> x0 = [-1,2];
>> fun = @rosenbrockwithgrad;
>> x = fminunc(fun,x0,options)

$f(x_1, x_2) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2, \qquad x_{opt} = [1, 1]$

FMINCON
$\min_x f(x) \quad \text{s.t.} \quad A x \le b, \;\; A_{eq} x = b_{eq}, \;\; lb \le x \le ub$

$f(x_1, x_2) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2, \qquad x_1 + 2 x_2 \le 1, \qquad 2 x_1 + x_2 = 1$

>> fun = @(x)100*(x(2)-x(1)^2)^2 + (1-x(1))^2;
>> x0 = [0.5,0];
>> A = [1,2];
>> b = 1;
>> Aeq = [2,1];
>> beq = 1;
>> x = fmincon(fun,x0,A,b,Aeq,beq)
x = 0.4149 0.1701

Nonlinear Optimization in Matlab


The final solution depends on the initial solution x0:
- convergence to local minima
- multiple restarts to obtain consistent solutions (see the sketch below)
- global heuristic methods such as evolutionary algorithms
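A minimal sketch of such a multiple restart loop around fminunc, using the objective from the FMINUNC example above; the number of restarts and the sampling box are illustrative.

% Minimal sketch: multiple restarts of a local solver from random initial points
fun  = @(x) x(1)*exp(-(x(1)^2 + x(2)^2)) + (x(1)^2 + x(2)^2)/20;
opts = optimoptions(@fminunc, 'Display', 'off', 'Algorithm', 'quasi-newton');
best = inf;  xbest = [];
for k = 1:20
    x0 = 4*rand(1, 2) - 2;                  % random initial solution in [-2, 2]^2
    [x, fval] = fminunc(fun, x0, opts);
    if fval < best
        best = fval;  xbest = x;            % keep the best local minimum found so far
    end
end
xbest, best                                 % close to [-0.6691, 0] with fval -0.4052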

Optimization Toolbox Demos


datdemo.m

y = c(1)*exp(-lam(1)*t) + c(2)*exp(-lam(2)*t)
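A sketch of fitting this two-exponential model with lsqcurvefit on synthetic data (the demo itself may use a different solver; the parameter packing p = [c(1) c(2) lam(1) lam(2)] and the data are assumptions):

% Minimal sketch (synthetic data): fit y = c1*exp(-lam1*t) + c2*exp(-lam2*t)
t = linspace(0, 3, 50);
y = 3*exp(-1.3*t) + 2*exp(-4*t) + 0.02*randn(size(t));     % synthetic measurements
model = @(p, t) p(1)*exp(-p(3)*t) + p(2)*exp(-p(4)*t);     % two-exponential model
p0 = [1 1 1 2];                                            % initial guess for [c, lam]
p  = lsqcurvefit(model, p0, t, y);
plot(t, y, 'ko', t, model(p, t), 'b-');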

Optimization Toolbox Demos


bandem.m

Next: Global Optimization


Scientific Programming with Matlab WS 2015/16
apl. Prof. Dr. rer. nat. Frank Hoffmann
Univ.-Prof. Dr.-Ing. Prof. h.c. Torsten Bertram
Institute of Control Theory and Systems Engineering
