Numerical Optimization
Find the best solution for a given problem: compute or approximate the solution parameter x from the set of alternative solutions X that minimizes (or maximizes) the objective function F(x).
Numerical Optimization
Typical applications and their objective functions:
- Classification: $\min_{\theta} J(\theta) = \sum_i \ell(c_i, c(x_i, \theta))$
- Regression: $\min_{\theta} J(\theta) = \sum_i (y_i - y(x_i, \theta))^2$
- System identification: $\min_{\theta} J(\theta) = \int (y(t) - \hat{y}(t \mid \theta))^2 \, dt$
- Optimal control: $\min_{u(t)} J(u(t)) = \int x'(t) Q x(t) + u'(t) R u(t) \, dt$ subject to $\dot{x} = f(x, u)$
Two families of optimization methods:
- Heuristic methods: evolutionary algorithms, ant colony optimization, simulated annealing, hill climbing
- Derivative-based methods, classified by the order of derivatives used:
  - Order 0: line search
  - Order 1: gradient search
  - Order 2: Newton method
Given the data set $\{(y_1, u_1), \ldots, (y_p, u_p)\}$, consider the model
$$y_i = u_{i1} x_1 + u_{i2} x_2 + \cdots + u_{iq} x_q$$
The model is linear in the parameters $x$ and may be nonlinear in the regressors $u$:
$$y_i = f_1(u_{i1}) x_1 + f_2(u_{i2}) x_2 + \cdots + f_q(u_{iq}) x_q$$
Polynomial Approximation
$$y_i = x_0 + u_i x_1 + u_i^2 x_2 + \cdots + u_i^{q-1} x_{q-1}$$
Writing the model for all $p$ observations:
$$\begin{aligned}
u_{11} x_1 + u_{12} x_2 + \cdots + u_{1q} x_q &= y_1 \\
u_{21} x_1 + u_{22} x_2 + \cdots + u_{2q} x_q &= y_2 \\
&\;\;\vdots \\
u_{p1} x_1 + u_{p2} x_2 + \cdots + u_{pq} x_q &= y_p
\end{aligned}$$
with
$$U = \begin{pmatrix} u_{11} & u_{12} & \cdots & u_{1q} \\ u_{21} & u_{22} & \cdots & u_{2q} \\ \vdots & & & \vdots \\ u_{p1} & u_{p2} & \cdots & u_{pq} \end{pmatrix}, \qquad
x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_q \end{pmatrix} \qquad \text{and} \qquad
y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_p \end{pmatrix}.$$
The sum-of-squares error is
$$E = \sum_{i=1}^{p} (u_{i1} x_1 + \cdots + u_{iq} x_q - y_i)^2.$$
Vector representation:
$$E \stackrel{\mathrm{def}}{=} e^\top e, \qquad e = Ux - y.$$
Setting $e = 0$ yields the (in general overdetermined) linear system $Ux = y$. Its least-squares solution is given by the pseudo-inverse:
$$U^{*} \stackrel{\mathrm{def}}{=} (U^\top U)^{-1} U^\top, \qquad x = U^{*} y.$$
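A minimal MATLAB sketch of the pseudo-inverse solution; the data and the polynomial degree are made-up assumptions for illustration:

u = linspace(0,1,50)';                 % regressor samples, p = 50
y = sin(2*pi*u) + 0.1*randn(size(u));  % illustrative noisy observations
q = 4;                                 % number of parameters
U = u.^(0:q-1);                        % p-by-q design matrix [1 u u.^2 u.^3]
x = (U'*U) \ (U'*y);                   % x = (U'U)^(-1) U'y, the pseudo-inverse solution
% equivalent and numerically preferable in practice: x = U \ y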
http://en.wikipedia.org/wiki/Curve_fitting#mediaviewer/File:Regression_pic_assymetrique.gif
Regression
Regression analysis is a statistical method of data analysis.
Objective: describe the relationship between a dependent variable $y$ and one or more independent variables $x$:
$$y = f(x) + e, \qquad y = f(x_1, \ldots, x_n) + e$$
where $e$ denotes the error (residual) of the model $f(x)$.
- Quantitative description of relationships
- Prediction of values of the dependent variable $y$ on the basis of known values of $x$
- Analysis of the significance of the relationship
http://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Linear_regression.svg/1000px-Linear_regression.svg.png
Regression
Linear model: $y = w_1 x + w_0$
Quadratic model: $y = w_2 x^2 + w_1 x + w_0$
http://de.wikipedia.org/wiki/Ausgleichungsrechnung#mediaviewer/File:Liniendiagramm_Ausgleich.svg
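A minimal MATLAB sketch of the linear case $y = w_1 x + w_0$, using made-up data and the built-in polyfit:

x = (0:0.5:10)';
y = 2*x + 1 + randn(size(x));  % noisy samples of an assumed true line
w = polyfit(x,y,1);            % w(1) -> slope w1, w(2) -> intercept w0
plot(x,y,'ko',x,polyval(w,x),'b-');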
$$x^* = \arg\min_x F(x)$$
Nelder-Mead simplex search (animated illustrations):
http://capsis.cirad.fr/capsis/_media/documentation/neldermeadsteps.gif
http://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method#mediaviewer/File:Nelder_Mead1.gif
http://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method#mediaviewer/File:Nelder_Mead2.gif
Nonlinear Optimization
Necessary and sufficient conditions for a minimum follow from the Taylor approximation
$$F(x + \Delta x) \approx F(x) + \Delta x^\top g(x) + \frac{1}{2} \Delta x^\top G(x) \Delta x + \ldots$$
Necessary condition: $g(x^*) = 0$.
Sufficient condition: with $g(x^*) = 0$,
$$F(x^* + \Delta x) \approx F(x^*) + \frac{1}{2} \Delta x^\top G(x^*) \Delta x + \ldots$$
so a minimum requires $\Delta x^\top G(x^*) \Delta x > 0$ for all $\Delta x \neq 0$, i.e. $G(x^*) > 0$.
Nonlinear Optimization
Iterative algorithm:
- Start from an initial parameter $x_0$
- At the current iterate $x_k$, choose a search direction $p_k$
- Determine $x_{k+1} = x_k + \alpha_k p_k$
Open issues:
- How to determine $p_k$?
- How to determine $\alpha_k$?
- How to determine the initial parameter $x_0$ (the final solution may depend on it)?
Search direction:
- Taylor expansion of $F(x)$ at the current solution $x_k$:
$$F_{k+1} = F(x_k + \alpha_k p_k) \approx F(x_k) + \frac{\partial F}{\partial x}^{\top} (x_{k+1} - x_k) = F_k + g_k^\top (\alpha_k p_k)$$
- A descent step requires $g_k^\top p_k < 0$; the choice $p_k = -g_k$ gives gradient descent:
$$x_{k+1} = x_k - \alpha_k g_k$$
Gradient Descent
$F(x_k)$ decreases fastest if one moves from $x_k$ in the direction of the negative gradient $-\nabla F(x_k)$ of $F$ at $x_k$. If the step size $\gamma$ is small enough, then
$$x_{k+1} = x_k - \gamma \nabla F(x_k)$$
implies $F(x_{k+1}) \le F(x_k)$.
The method starts with a guess $x_0$ for a local minimum of $F(x)$ and generates the sequence $x_0, x_1, x_2, \ldots$ via
$$x_{k+1} = x_k - \gamma_k \nabla F(x_k)$$
Hopefully the sequence converges to the desired local minimum. The value of the step size $\gamma_k$ is allowed to change at every iteration, as in the sketch below.
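A minimal MATLAB sketch with a constant step size; the quadratic test function, the step size and the stopping tolerance are assumptions for illustration:

F     = @(x) 0.5*(x(1)^2 + 10*x(2)^2);    % illustrative objective
gradF = @(x) [x(1); 10*x(2)];             % its gradient
x     = [2; 1];                           % initial guess x0
gamma = 0.05;                             % constant step size
for k = 1:200
    x = x - gamma*gradF(x);               % x_{k+1} = x_k - gamma*grad F(x_k)
    if norm(gradF(x)) < 1e-8, break; end  % stop when the gradient vanishes
end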
Gradient Descent
Gradient descent with constant step size:
$$x_{k+1} = x_k - \gamma \nabla F(x_k)$$
http://en.wikipedia.org/wiki/Gradient_descent#mediaviewer/File:Gradient_descent.svg
Gradient Descent
Gradient descent with constant step size, $x_{k+1} = x_k - \gamma \nabla F(x_k)$, on the Rosenbrock function
$$f(x_1, x_2) = (1 - x_1)^2 + 100 (x_2 - x_1^2)^2$$
http://en.wikipedia.org/wiki/Gradient_descent#mediaviewer/File:Banana-SteepDesc.gif
Gradient Descent
Gradient descent with constant step size, $x_{k+1} = x_k - \gamma \nabla F(x_k)$, on
$$f(x_1, x_2) = \sin\!\left(\tfrac{1}{2} x_1^2 - \tfrac{1}{4} x_2^2 + 3\right) \cos\!\left(2 x_1 + 1 - e^{x_2}\right)$$
http://en.wikipedia.org/wiki/Gradient_descent#mediaviewer/File:Gradient_ascent_%28contour%29.png
Nonlinear Optimization
Line search: how is the step width $\alpha_k$ in $x_{k+1} = x_k + \alpha_k p_k$ determined?
Select $\alpha_k$ to minimize $F_{k+1} = F(x_k + \alpha_k p_k)$.
Example: $F(x_1, x_2) = x_1^2 + x_1 x_2 + x_2^2$ with
$$x_0 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad p_0 = \begin{pmatrix} 0 \\ 2 \end{pmatrix}, \qquad x_1 = x_0 + \alpha p_0 = \begin{pmatrix} 1 \\ 1 + 2\alpha \end{pmatrix}$$
$$F = 1 + (1 + 2\alpha) + (1 + 2\alpha)^2$$
$$\frac{\partial F}{\partial \alpha} = 2 + 2 (1 + 2\alpha) \cdot 2 = 0 \quad\Rightarrow\quad \alpha^* = -\frac{3}{4}, \qquad x_1 = \begin{pmatrix} 1 \\ -\tfrac{1}{2} \end{pmatrix}$$
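The optimal step can be verified numerically with MATLAB's one-dimensional minimizer fminbnd; the search bracket [-2, 0] is an assumption:

F     = @(x) x(1)^2 + x(1)*x(2) + x(2)^2;
x0    = [1; 1];  p0 = [0; 2];
phi   = @(alpha) F(x0 + alpha*p0);  % objective along the search line
astar = fminbnd(phi, -2, 0)         % returns alpha* = -0.75, so x1 = [1; -0.5]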
Line Search
Search along a line until the local minimum is bracketed by search points.
Tighten the bracket by:
- golden cut (see the sketch after this list)
- halving the interval
- polynomial approximation
Polynomial approximation:
- Approximate $f(x)$ by a quadratic or cubic function
- Take its minimum as the next point
- Might diverge
- Efficient close to the minimum
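A minimal golden-cut sketch in MATLAB; the test function and bracket are assumptions, and a production version would reuse one function evaluation per step:

f = @(x) (x - 2).^2 + 1;    % illustrative unimodal function
a = 0;  b = 5;              % initial bracket around the minimum
r = (sqrt(5) - 1)/2;        % golden section ratio, ~0.618
while b - a > 1e-6
    c = b - r*(b - a);  d = a + r*(b - a);   % interior points, c < d
    if f(c) < f(d), b = d; else, a = c; end  % keep the part containing the minimum
end
xmin = (a + b)/2            % ~2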
Bisection Method
- Identification of zeros; for optimization: zeros of the first derivative $g = \partial F / \partial x$
- Bisection of the bracketing interval constitutes the next candidate solution,
$$x_k = \frac{x_{\mathrm{lo}} + x_{\mathrm{hi}}}{2}, \qquad g_k = \left. \frac{\partial F}{\partial x} \right|_{x_k},$$
keeping the half-interval on which $g$ changes sign (see the sketch below).
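A minimal MATLAB bisection sketch on the first derivative; the example function and bracket are assumptions:

g  = @(x) 2*(x - 2);    % F(x) = (x-2)^2, so g = F' = 2(x-2)
lo = 0;  hi = 5;        % bracket with g(lo) < 0 < g(hi)
while hi - lo > 1e-8
    mid = (lo + hi)/2;  % midpoint is the next candidate solution
    if g(mid) > 0, hi = mid; else, lo = mid; end
end
xstar = (lo + hi)/2     % ~2, the zero of g and the minimum of F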
Newton Method
Faster convergence:
- Assumption: $F$ is quadratic; Taylor expansion of the gradient $g_{k+1}$ at the point $x_{k+1}$:
$$g_{k+1} = g(x_k + p_k) = g_k + G_k (x_{k+1} - x_k) = g_k + G_k p_k$$
- For $x_{k+1}$ to become a minimum:
$$g_{k+1} = 0 \quad\Rightarrow\quad p_k = -G_k^{-1} g_k$$
$$x_{k+1} = x_k + p_k = x_k - G_k^{-1} g(x_k)$$
Check the numerical condition of the Hessian
$$G_k = \begin{pmatrix} \dfrac{\partial^2 F}{\partial x_1^2} & \cdots & \dfrac{\partial^2 F}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 F}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 F}{\partial x_n^2} \end{pmatrix}$$
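A minimal Newton sketch in MATLAB on the Rosenbrock function, with gradient and Hessian coded by hand; in practice a damped or trust-region variant is used when $G_k$ is badly conditioned:

g = @(x) [-400*(x(2)-x(1)^2)*x(1) - 2*(1-x(1)); 200*(x(2)-x(1)^2)];  % gradient
G = @(x) [1200*x(1)^2 - 400*x(2) + 2, -400*x(1); -400*x(1), 200];    % Hessian
x = [-1; 2];                           % initial guess
for k = 1:50
    p = -G(x) \ g(x);                  % Newton step p_k = -G_k^(-1) g_k
    x = x + p;                         % x_{k+1} = x_k + p_k
    if norm(g(x)) < 1e-10, break; end
end
x                                      % converges to [1; 1]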
MATLAB Optimization Toolbox solvers:
- lsqlin: constrained linear least squares,
$$\min_x \|Cx - d\|^2 \quad \text{s.t.} \quad Ax \le b, \; A_{eq} x = b_{eq}, \; x_{min} \le x \le x_{max}$$
- quadprog: quadratic programming,
$$\min_x \tfrac{1}{2} x' H x + f' x \quad \text{s.t.} \quad Ax \le b, \; A_{eq} x = b_{eq}, \; x_{min} \le x \le x_{max}$$
- lsqnonlin: least squares method for nonlinear problems,
$$\min_x \sum_i f_i(x)^2$$
- fminunc, fminsearch: unconstrained minimization, $\min_x f(x)$
- fmincon: constrained nonlinear minimization,
$$\min_x f(x) \quad \text{s.t.} \quad c(x) \le 0, \; c_{eq}(x) = 0, \; Ax \le b, \; A_{eq} x = b_{eq}, \; x_{min} \le x \le x_{max}$$
- OPTIMTOOL: graphical front end to the toolbox solvers
LSQLIN
>> C = [0.9501 0.7620 0.6153 0.4057
        0.2311 0.4564 0.7919 0.9354
        0.6068 0.0185 0.9218 0.9169
        0.4859 0.8214 0.7382 0.4102
        0.8912 0.4447 0.1762 0.8936];
>> d = [0.0578; 0.3528; 0.8131; 0.0098; 0.1388];
>> A = [0.2027 0.2721 0.7467 0.4659
        0.1987 0.1988 0.4450 0.4186
        0.6037 0.0152 0.9318 0.8462];
>> b = [0.5251; 0.2026; 0.6721];
>> Aeq = [3 5 7 9];
>> beq = 4;
>> lb = -0.1*ones(4,1);
>> ub = 2*ones(4,1);
>> x = lsqlin(C,d,A,b,Aeq,beq,lb,ub)
x = -0.1000 -0.1000 0.1599 0.4090
Solved problem:
$$\min_x \|Cx - d\|^2 \quad \text{s.t.} \quad Ax \le b, \; A_{eq} x = b_{eq}, \; lb \le x \le ub$$
QUADPROG
>> H = [1 -1; -1 2];
>> f = [-2; -6];
>> A = [1 1; -1 2; 2 1];
>> b = [2; 2; 3];
>> lb = zeros(2,1);
>> options = optimoptions('quadprog',...
       'Algorithm','interior-point-convex','Display','off');
>> x = quadprog(H,f,A,b,[],[],lb,[],[],options)
Solved problem:
$$\min_x \tfrac{1}{2} x' H x + f' x \quad \text{s.t.} \quad Ax \le b, \; A_{eq} x = b_{eq}, \; lb \le x \le ub$$
LSQNONLIN
>> d = linspace(0,3);
>> y = exp(-1.3*d) + 0.05*randn(size(d));
>> fun = @(r)exp(-d*r)-y;
>> x0 = 4;
>> x = lsqnonlin(fun,x0)
Local minimum possible.
lsqnonlin stopped because the final change in the sum of squares relative to its initial value is less than the default value of the function tolerance.
x = 1.2645
>> plot(d,y,'ko',d,exp(-x*d),'b-');
Solved problem: $\min_x \sum_i f_i(x)^2$
LSQCURVEFIT
>> xdata = [0.9 1.5 13.8 19.8 24.1 28.2 35.2 ...
            60.3 74.6 81.3];
>> ydata = [455.2 428.6 124.1 67.3 43.2 28.1 ...
            13.1 -0.4 -1.3 -1.5];
>> fun = @(x,xdata)x(1)*exp(x(2)*xdata);
>> x0 = [100,-1];
>> x = lsqcurvefit(fun,x0,xdata,ydata)
Local minimum possible.
lsqcurvefit stopped because the final change in the sum of squares relative to its initial value is less than the default value of the function tolerance.
x = 498.8309 -0.1013
>> times = linspace(xdata(1),xdata(end));
>> plot(xdata,ydata,'ko',times,fun(x,times),'b-')
FMINUNC
>> fun = @(x)x(1)*exp(-(x(1)^2 + x(2)^2)) + (x(1)^2 + x(2)^2)/20;
>> x0 = [1,2];
>> [x,fval] = fminunc(fun,x0)
x = -0.6691 0.0000
fval = -0.4052
Solved problem: $\min_x f(x)$ with
$$f(x_1, x_2) = x_1 e^{-(x_1^2 + x_2^2)} + \frac{x_1^2 + x_2^2}{20}$$
>> options = optimoptions(@fminunc,'Display','iter','Algorithm','quasi-newton');
>> [x,fval,exitflag,output] = fminunc(fun,x0,options)
 Iteration   Func-count        f(x)       Step-size
     0            3          0.256738
     1            6          0.222149        1
     2            9          0.15717         1
     3           18         -0.227902        0.438133
     4           21         -0.299271        1
     5           30         -0.404028        0.102071
     6           33         -0.404868        1
     7           36         -0.405236        1
     8           39         -0.405237        1
     9           42         -0.405237        1
FMINUNC
function [f,g] = rosenbrockwithgrad(x)
% Calculate objective f
f = 100*(x(2) - x(1)^2)^2 + (1-x(1))^2;
if nargout > 1 % gradient required
    g = [-400*(x(2)-x(1)^2)*x(1)-2*(1-x(1));
         200*(x(2)-x(1)^2)];
end

>> options = optimoptions('fminunc','Algorithm','trust-region','GradObj','on');
>> x0 = [-1,2];
>> fun = @rosenbrockwithgrad;
>> x = fminunc(fun,x0,options)
$$f(x_1, x_2) = 100 (x_2 - x_1^2)^2 + (1 - x_1)^2, \qquad x_{\mathrm{opt}} = [1, 1]$$
FMINCON
$$\min_x f(x) \quad \text{s.t.} \quad Ax \le b, \; A_{eq} x = b_{eq}, \; lb \le x \le ub$$
Example:
$$f(x_1, x_2) = 100 (x_2 - x_1^2)^2 + (1 - x_1)^2$$
$$x_1 + 2 x_2 \le 1, \qquad 2 x_1 + x_2 = 1$$
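A sketch of the corresponding fmincon call; the initial guess x0 is an assumption:

>> fun = @(x)100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
>> A = [1 2]; b = 1;      % linear inequality x1 + 2*x2 <= 1
>> Aeq = [2 1]; beq = 1;  % linear equality 2*x1 + x2 = 1
>> x0 = [0.5, 0];         % assumed initial guess
>> x = fmincon(fun,x0,A,b,Aeq,beq)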
y = c(1)*exp(-lam(1)*t) + c(2)*exp(-lam(2)*t)