Outline:
• Approximate TV de-noising

    minimize ‖x − x^cor‖₂² + µ ∑_{i=1}^{n−1} |x_{i+1} − x_i|
• Approximate TV denoising: minimize ‖x − x^cor‖₂² + µ φatv(x), where

    φatv(x) = ∑_{i=1}^{n−1} ( √(ε² + (x_{i+1} − x_i)²) − ε )

• φatv is separable in the differences: with u_i = x_{i+1} − x_i and
  f(u) = √(ε² + u²) − ε, it has the form

    F(u_1, . . . , u_{n−1}) = ∑_{i=1}^{n−1} f(u_i)
• Therefore the Hessian of the smoothed objective is tridiagonal
% Newton's method
ALPHA = 0.01;
BETA = 0.5;
MAXITERS = 100;
NTTOL = 1e-10;
D = diff(speye(n));            % (n-1) x n forward difference matrix
x = zeros(n,1);
newt_dec = [];
for iter = 1:MAXITERS
  d = D*x;
  val = (x-xcor)'*(x-xcor) + MU*sum(sqrt(EPSILON^2+d.^2)-EPSILON*ones(n-1,1));
  grad = 2*(x-xcor) + MU*D'*(d./sqrt(EPSILON^2+d.^2));
  hess = 2*speye(n) + MU*D'*spdiags(EPSILON^2./(EPSILON^2+d.^2).^(3/2),0,n-1,n-1)*D;
  v = -hess\grad;
  lambdasqr = -grad'*v;
  newt_dec = [newt_dec sqrt(lambdasqr)];
  if (lambdasqr/2 < NTTOL), break; end;
  % backtracking line search
  t = 1;
  while ((x+t*v-xcor)'*(x+t*v-xcor) + ...
         MU*sum(sqrt(EPSILON^2+(D*(x+t*v)).^2)-...
         EPSILON*ones(n-1,1)) > val - ALPHA*t*lambdasqr)
    t = BETA*t;
  end;
  x = x+t*v;
end;
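For reference, the same iteration can be sketched in Python/NumPy; this is a translation of the Matlab above, not the original code, and the function name, test signal, and parameter values are invented for illustration (mu and eps play the roles of µ and ε):

```python
import numpy as np

def tv_denoise_newton(xcor, mu=50.0, eps=0.001,
                      alpha=0.01, beta=0.5, maxiters=100, nttol=1e-10):
    """Newton's method for: minimize ||x - xcor||^2 + mu * phi_atv(x)."""
    n = len(xcor)
    # forward-difference matrix: (D x)_i = x_{i+1} - x_i
    D = np.diff(np.eye(n), axis=0)
    f = lambda z: (z - xcor) @ (z - xcor) + mu * np.sum(
        np.sqrt(eps**2 + (D @ z)**2) - eps)
    x = np.zeros(n)
    for _ in range(maxiters):
        d = D @ x
        s = np.sqrt(eps**2 + d**2)
        grad = 2 * (x - xcor) + mu * (D.T @ (d / s))
        hess = 2 * np.eye(n) + mu * (D.T @ ((eps**2 / s**3)[:, None] * D))
        v = np.linalg.solve(hess, -grad)
        lambdasqr = -grad @ v                  # Newton decrement squared
        if lambdasqr / 2 < nttol:
            break
        t = 1.0                                # backtracking line search
        while f(x + t * v) > f(x) - alpha * t * lambdasqr:
            t *= beta
        x = x + t * v
    return x
```

The objective is strongly convex (the 2I term in the Hessian), so the damped Newton iteration converges from any starting point.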
[Figure: Newton decrement λ(x) versus iteration number (semilog scale), dropping from about 10 to below 10⁻⁵ in roughly 12 iterations.]
[Figures: the corrupted signal x^cor and the reconstruction x^tikh, plotted against t for 0 ≤ t ≤ 5000, with signal values between −3 and 3.]
• Since AR = U SV T ,
V T RT (P + AT A)RV = V T RT P RV + S T S = I
minimize f0(x)
subject to fi(x) ≤ 0, i = 1, . . . , m
Ax = b
• Approximation improves as t → ∞
• Logarithmic barrier function: φ(x) = − ∑_{i=1}^m log(−fi(x)), with
  dom φ = {x | f1(x) < 0, . . . , fm(x) < 0}
• Log-barrier function: I(x) = − log(x − 2) − log(4 − x)

[Figure 1: f0 + (1/t)I for t = 10⁻¹, 10⁻⁰·⁸, 10⁻⁰·⁶, . . . , 10.]
    ∇φ(x) = ∑_{i=1}^m (1/(−fi(x))) ∇fi(x),

    ∇²φ(x) = ∑_{i=1}^m (1/fi(x)²) ∇fi(x)∇fi(x)ᵀ + ∑_{i=1}^m (1/(−fi(x))) ∇²fi(x)
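These formulas can be sanity-checked against finite differences. A small sketch with affine constraints fi(x) = aiᵀx − bi, so that ∇fi = ai and the ∇²fi term vanishes; the data here are made up:

```python
import numpy as np

def barrier(A, b, x):
    """phi(x) = -sum_i log(b_i - a_i^T x) for constraints a_i^T x - b_i <= 0."""
    return -np.sum(np.log(b - A @ x))

def barrier_grad(A, b, x):
    # grad phi = sum_i (1/(-f_i(x))) a_i = A^T (1 / (b - A x))
    return A.T @ (1.0 / (b - A @ x))

def barrier_hess(A, b, x):
    # hess phi = sum_i (1/f_i(x)^2) a_i a_i^T  (the grad^2 f_i term is zero)
    d = 1.0 / (b - A @ x)
    return A.T @ (d[:, None]**2 * A)
```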
Administrative info:
• office hours: Tue 5:30 - 7:30pm, Wed 4:00 - 8:00pm, Packard 106
• newsgroup: su.class.ee364
Important sets
• subspace
• affine set
• convex set
• cones
y = θ1x1 + · · · + θk xk is a linear combination of x1, . . . , xk; with 1ᵀθ = 1 it is an affine combination, with 1ᵀθ = 1 and θ ⪰ 0 a convex combination, and with θ ⪰ 0 a conic combination
• intersection: the intersection of subspaces is a subspace, the
  intersection of affine sets is affine, and the intersection of convex
  cones is a convex cone;

    Sα convex for α ∈ A  =⇒  ∩_{α∈A} Sα is convex

• dual cone: K* = { y | xᵀy ≥ 0 for all x ∈ K }

[Figure: a cone K and its dual cone K*; the boundary rays of K* make 90° angles with the boundary rays of K.]
C = {x ∈ Rn | xT Ax + bT x + c ≤ 0},
Operations that preserve convexity
• Convex sets and convex functions are related via the epigraph.
1. Show that f(x) = ∑_{i=1}^r αi x[i] is a convex function of x, where
   α1 ≥ α2 ≥ · · · ≥ αr ≥ 0, and x[i] denotes the ith largest component of
   x. (You can use the fact that f(x) = ∑_{i=1}^k x[i] is convex on Rⁿ.)

Solution:
1. We can express f as

    f(x) = ∑_{k=1}^r (αk − α_{k+1})(x[1] + · · · + x[k]),   with α_{r+1} = 0,

   a nonnegative weighted sum of the convex functions ∑_{i=1}^k x[i], which
   is convex in x.
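A quick numeric check of this decomposition (the function names below are invented for illustration):

```python
import numpy as np

def f_direct(alpha, x):
    # sum_i alpha_i * (i-th largest component of x), alpha nonincreasing >= 0
    xs = np.sort(x)[::-1]
    return alpha @ xs[:len(alpha)]

def f_as_sum(alpha, x):
    # the same value, written as a nonnegative combination of the convex
    # functions S_k(x) = sum of the k largest components of x
    xs = np.sort(x)[::-1]
    S = np.cumsum(xs)                       # S[k-1] = x[1] + ... + x[k]
    w = alpha - np.append(alpha[1:], 0.0)   # alpha_k - alpha_{k+1} >= 0
    return w @ S[:len(alpha)]
```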
1. f(x) = − log(− log ∑_{i=1}^m e^{aiᵀx+bi}) on
   dom f = {x | ∑_{i=1}^m e^{aiᵀx+bi} < 1}. You can use the fact that
   log(∑_{i=1}^n e^{yi}) is convex.

2. f(x, u, v) = −√(uv − xᵀx) on
   dom f = {(x, u, v) | uv > xᵀx, u, v > 0}. Use the fact that xᵀx/u is
   convex in (x, u) for u > 0, and that −√(x1x2) is convex on R²₊₊.

3. f(x, u, v) = − log(uv − xᵀx) on
   dom f = {(x, u, v) | uv > xᵀx, u, v > 0}.
1. g(x) = log(∑_{i=1}^m e^{aiᵀx+bi}) is convex (composition of the
   log-sum-exp function and an affine mapping), so −g is concave. The
   function h(y) = − log y is convex and decreasing. Therefore
   f(x) = h(−g(x)) is convex.

2. We can express f as f(x, u, v) = −√(u(v − xᵀx/u)). The function
   h(x1, x2) = −√(x1x2) is convex on R²₊₊, and decreasing in each
   argument. The functions g1(u, v, x) = u and g2(u, v, x) = v − xᵀx/u
   are concave. Therefore f(u, v, x) = h(g(u, v, x)) is convex.
3. We can express f as f(x, u, v) = − log u − log(v − xᵀx/u). The first
   term is convex; the second is the composition of the convex decreasing
   function − log with the concave function v − xᵀx/u, hence convex.

Solution:

    = tr( Z⁻¹ Q (I + tΛ)⁻¹ Qᵀ )
    = tr( Qᵀ Z⁻¹ Q (I + tΛ)⁻¹ )
    = ∑_{i=1}^n (Qᵀ Z⁻¹ Q)ii (1 + tλi)⁻¹,
i=1
• Can solve any problem that obeys the disciplined convex programming
ruleset
http://www.stanford.edu/~boyd/cvx/
cvx_begin
constraints
cvx_end
cvx_begin
variable x(n)
minimize(norm(A*x-b))
cvx_end
cvx_begin
variable x(n)
minimize(ones(1,n)*x)
subject to
A*x == b
x >= 0
cvx_end
cvx_begin
variable x(n)
minimize(max(max([A*x inv_pos(A*x)]')))
subject to
x >= 0
x <= 1
cvx_end
EE364 §3
Outline
• Generalized eigenvalues
• Hyperbolic constraints
• Homework hints
Generalized eigenvalues

    λmax(X, Y) = sup_{u≠0} (uᵀXu)/(uᵀY u),   dom f = Sⁿ × Sⁿ₊₊
Hyperbolic constraints as SOC constraints.
Problem 4.26

    xᵀx ≤ yz,  y ≥ 0,  z ≥ 0

if and only if

    ‖(2x, y − z)‖₂ ≤ y + z,  y ≥ 0,  z ≥ 0

with x ∈ Rⁿ, y, z ∈ R
Proof

    ‖(2x, y − z)‖₂ ≤ y + z, y ≥ 0, z ≥ 0
    ⇐⇒  4xᵀx + (y − z)² ≤ (y + z)², y ≥ 0, z ≥ 0   (square both sides; y + z ≥ 0)
    ⇐⇒  xᵀx ≤ yz, y ≥ 0, z ≥ 0
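A numeric spot-check of the Problem 4.26 equivalence on random sampled points (the function names are invented for illustration):

```python
import numpy as np

def hyperbolic(x, y, z):
    # x^T x <= y z, y >= 0, z >= 0
    return bool(x @ x <= y * z and y >= 0 and z >= 0)

def soc(x, y, z):
    # || (2x, y - z) ||_2 <= y + z, y >= 0, z >= 0
    lhs = np.hypot(2 * np.linalg.norm(x), y - z)
    return bool(lhs <= y + z and y >= 0 and z >= 0)
```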
Maximizing harmonic mean

    maximize ( ∑_{i=1}^m 1/(aiᵀx + bi) )⁻¹,

the problem is equivalent to

    minimize 1ᵀt
    subject to ti(aiᵀx + bi) ≥ 1, i = 1, . . . , m
               t ⪰ 0

and, writing the hyperbolic constraints as SOC constraints,

    minimize 1ᵀt
    subject to ‖(2, aiᵀx + bi − ti)‖₂ ≤ aiᵀx + bi + ti, i = 1, . . . , m
               ti ≥ 0, aiᵀx + bi ≥ 0, i = 1, . . . , m
Maximizing geometric mean

    maximize ( ∏_{i=1}^m (aiᵀx − bi) )^{1/m},

consider m = 4 as an example
the problem is equivalent to

    maximize y1y2y3y4
    subject to y = Ax − b
               y ⪰ 0,

and

    maximize t1t2
    subject to y = Ax − b
               y1y2 ≥ t1²
               y3y4 ≥ t2²
               y ⪰ 0, t1 ≥ 0, t2 ≥ 0,

and also

    maximize t
    subject to y = Ax − b
               y1y2 ≥ t1²
               y3y4 ≥ t2²
               t1t2 ≥ t²
               y ⪰ 0, t1, t2, t ≥ 0
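For fixed y ⪰ 0, tightening each hyperbolic constraint in the last formulation gives the largest feasible t; a quick check that this cascade recovers the geometric mean (helper name invented):

```python
import numpy as np

def geomean_via_tower(y):
    # tighten the hyperbolic constraints: t1 = sqrt(y1 y2), t2 = sqrt(y3 y4),
    # t = sqrt(t1 t2) is the largest t the cascade allows
    t1 = np.sqrt(y[0] * y[1])
    t2 = np.sqrt(y[2] * y[3])
    return np.sqrt(t1 * t2)
```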
expressing the three hyperbolic constraints as SOC constraints:

    minimize −t
    subject to ‖(2t1, y1 − y2)‖₂ ≤ y1 + y2, y1 ≥ 0, y2 ≥ 0
               ‖(2t2, y3 − y4)‖₂ ≤ y3 + y4, y3 ≥ 0, y4 ≥ 0
               ‖(2t, t1 − t2)‖₂ ≤ t1 + t2, t1 ≥ 0, t2 ≥ 0
               y = Ax − b
General case
Problem 3.49 log-concavity
(c) to show

    f(x) = (∏_{i=1}^n xi) / (∑_{i=1}^n xi),   dom f = Rⁿ₊₊

is log-concave, we need to show

    g(x) = log f(x) = ∑_{i=1}^n log xi − log(∑_{i=1}^n xi)

is concave on Rⁿ₊₊
Problem 3.49 (c) continued...
partial derivatives:

    ∂g(x)/∂xi = 1/xi − 1/(∑_{i=1}^n xi),

and

    ∂²g(x)/∂xi² = −1/xi² + 1/(∑_{i=1}^n xi)²,
    ∂²g(x)/∂xi∂xj = 1/(∑_{i=1}^n xi)²,   i ≠ j

therefore

    ∇²g(x) = (1/(∑_{i=1}^n xi)²) 11ᵀ − diag(1/x1², . . . , 1/xn²)
Problem 3.49 (c) continued...
to show uᵀ∇²g(x)u ≤ 0 for all u ∈ Rⁿ, i.e.,

    uᵀ( diag(1/x1², . . . , 1/xn²) − (1/(∑_{i=1}^n xi)²) 11ᵀ ) u ≥ 0,

same as

    ∑_{i=1}^n ui²/xi² ≥ (∑_{i=1}^n ui)² / (∑_{i=1}^n xi)²

which follows from the Cauchy–Schwarz inequality:

    (∑_{i=1}^n ui)² ≤ (∑_{i=1}^n ui²/xi²)(∑_{i=1}^n xi²) ≤ (∑_{i=1}^n ui²/xi²)(∑_{i=1}^n xi)²
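A numeric check of the Hessian formula and its negative semidefiniteness at a made-up test point:

```python
import numpy as np

def hess_g(x):
    # Hessian of g(x) = sum_i log x_i - log(sum_i x_i) on R^n_++:
    # (1/(sum x_i)^2) 11^T - diag(1/x_1^2, ..., 1/x_n^2)
    s = np.sum(x)
    n = len(x)
    return np.ones((n, n)) / s**2 - np.diag(1.0 / x**2)
```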
Problem 3.49 (d)
Problem 4.8
minimize cT x
subject to Ax = b
Problem 4.8 continued...

    minimize cᵀx
    subject to l ⪯ x ⪯ u,

the optimal xi* minimizes cixi subject to li ≤ xi ≤ ui
• if ci > 0, then xi* = li
• if ci < 0, then xi* = ui
• if ci = 0, then any xi in [li, ui] is optimal
the optimal value is

    p* = lᵀc⁺ − uᵀc⁻,

where ci⁺ = max{ci, 0} and ci⁻ = max{−ci, 0}
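A quick check of the closed-form optimal value (note the minus sign in front of uᵀc⁻, consistent with ci⁻ = max{−ci, 0}; the data below are made up):

```python
import numpy as np

def box_lp_opt(c, l, u):
    # closed-form optimal value of: minimize c^T x  s.t.  l <= x <= u
    cp = np.maximum(c, 0)    # c+
    cm = np.maximum(-c, 0)   # c-
    return l @ cp - u @ cm
```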
Conjugate function
for f(x) = |x|^p/p with p > 1 and y ≥ 0, the supremum in
f*(y) = sup_x (xy − f(x)) is attained where

    y − x^{p−1} = 0

and

    f*(y) = y^{p/(p−1)} − y^{p/(p−1)}/p

define q such that 1/p + 1/q = 1, then (with similar analysis for y ≤ 0),

    f*(y) = |y|^q/q
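A grid-based check of this conjugate (the grid bounds and sample values of y are arbitrary):

```python
import numpy as np

def conj_numeric(y, p, xs):
    # numerically evaluate f*(y) = sup_x (x*y - |x|^p / p) over a grid xs
    return np.max(xs * y - np.abs(xs)**p / p)
```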
Proof of Hölder's inequality

    xᵀy ≤ ‖x‖p ‖y‖q

using xᵀy ≤ f(x) + f*(y): the conjugate of

    f(x) = (1/p)‖x‖p^p

is

    f*(y) = (1/q)‖y‖q^q,

where 1/p + 1/q = 1
Proof continued ...
thus

    xᵀy ≤ (1/p)‖x‖p^p + (1/q)‖y‖q^q

apply this to x/‖x‖p, y/‖y‖q:

    xᵀy/(‖x‖p‖y‖q) ≤ (1/p)‖x/‖x‖p‖p^p + (1/q)‖y/‖y‖q‖q^q = 1/p + 1/q = 1

therefore

    xᵀy ≤ ‖x‖p ‖y‖q
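A numeric check of Hölder's inequality, including the equality case yi = |xi|^{p−1} sign(xi) (random made-up data; the function name is invented):

```python
import numpy as np

def holder_gap(x, y, p):
    # returns ||x||_p * ||y||_q - x^T y, nonnegative by Holder's inequality
    q = p / (p - 1)
    lhs = x @ y
    rhs = np.sum(np.abs(x)**p)**(1/p) * np.sum(np.abs(y)**q)**(1/q)
    return rhs - lhs
```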
EE364 Review
Outline:
• exercise 4.47
Convex optimization problems
we have seen
[Figure: thruster i, with force ui, angle θi, and position (pix, piy).]
• resulting vertical force: Fy = ∑_{i=1}^n ui sin θi
• resulting torque: T = ∑_{i=1}^n (piy ui cos θi − pix ui sin θi)
• fuel usage: u1 + · · · + un
problem: find thruster forces ui that yield given desired forces and torques
Fx^des, Fy^des, T^des, and minimize fuel usage (if feasible)

    minimize 1ᵀu
    subject to F u = f^des
               0 ≤ ui ≤ 1, i = 1, . . . , n

where

    F = [         cos θ1           · · ·          cos θn
                  sin θ1           · · ·          sin θn
          p1y cos θ1 − p1x sin θ1  · · ·  pny cos θn − pnx sin θn ]
% input data
% ----------
% thrusters x-coordinates
px = [-3 -2 -1 1.5 2 ];
% thrusters y-coordinates
py = [ 0 1 -2 1 -2.5];
% angles
thetas = [-85 30 -150 0 85]*pi/180;
F = [ cos(thetas);
sin(thetas);
py.*cos(thetas) - px.*sin(thetas)];
% different problem specified by each column of f_des
f_des = [ 0 0 1 -.5 0 0; ...
.5 -1 0 0 0 0; ...
0 0 0 0 2 -2];
thrus = [];
for i=1:6
cvx_begin
variable u(5)
minimize ( sum ( u ) )
subject to
F*u == f_des(:,i)
u >= 0
u <= 1
cvx_end
thrus = [thrus u];   % store the optimal forces for each f_des column
end
[Figures: six panels, one per column of f_des, showing the optimal thruster forces; both axes run from −4 to 4.]
can express as LP:

    minimize ‖F u − f^des‖∞
    subject to 0 ≤ ui ≤ 1, i = 1, . . . , n

can't express as LP:

    minimize (number of thrusters on)
    subject to F u = f^des
               0 ≤ ui ≤ 1, i = 1, . . . , n

(but we could check feasibility of each of the 2ⁿ subsets of thrusters)
[Figure: structure diagram with labels f1, f2, f3, f4.]
• fundamental frequency:

    ω1 = λmin(K, M)^{1/2} = λmin(M^{−1/2} K M^{−1/2})^{1/2}

• structure weight w = w0 + ∑_i xi wi
as SDP:

    minimize w0 + ∑_i xi wi
    subject to Ω² M(x) − K(x) ⪯ 0
               li ≤ xi ≤ ui
minimize f0(x)
subject to fi(x) ≤ 0, i = 1, . . . , m
Ax = b
fi convex, f0 quasiconvex
f0(x) ≤ t ⇔ φt(x) ≤ 0
We consider a matrix A ∈ Sⁿ, with some entries specified, and the others
not specified. Say

    A = [ 3.0   0.5    ?    0.25
          0.5   2.0   0.75   ?
           ?    0.75  1.0    ?
          0.25   ?     ?    5.0 ]

(the entries marked ? are free).
n = 4;
cvx_begin sdp
variable A(n,n) symmetric
maximize( det_rootn( A ) )
A >= 0;                 % in sdp mode: A positive semidefinite
% constrained matrix entries.
A(1,1) == 3;
A(2,2) == 2;
A(3,3) == 1;
A(4,4) == 5;
A(1,2) == .5;
A(1,4) == .25;
A(2,3) == .75;
cvx_end
A =
3.0000 0.5000 0.1874 0.2500
0.5000 2.0000 0.7500 0.0417
0.1874 0.7500 1.0000 0.0156
0.2500 0.0417 0.0156 5.0000
eigs =
0.5964
2.0908
3.2773
5.0355
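A quick check, in Python/NumPy, that the reported completion is symmetric and positive definite with the listed eigenvalues:

```python
import numpy as np

# the completed matrix reported by CVX above
A = np.array([[3.0,    0.5,    0.1874, 0.2500],
              [0.5,    2.0,    0.75,   0.0417],
              [0.1874, 0.75,   1.0,    0.0156],
              [0.2500, 0.0417, 0.0156, 5.0]])

eigs = np.linalg.eigvalsh(A)   # ascending eigenvalues of a symmetric matrix
```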
Outline:
• Duality examples
• Strong duality
• Farkas' lemma
• Homework hints
Duality
• Primal problem

    minimize f0(x)
    subject to fi(x) ≤ 0, i = 1, . . . , m
               hi(x) = 0, i = 1, . . . , p

• Lagrangian: L(x, λ, ν) = f0(x) + ∑_{i=1}^m λi fi(x) + ∑_{i=1}^p νi hi(x)
• For λ ⪰ 0, g(λ, ν) ≤ p*
• Dual problem

    maximize g(λ, ν)
    subject to λ ⪰ 0
can be expressed as
maximize 4y1 + 2y2
subject to y1 ≤ 3
y1 + y2 ≤ 2
2y2 ≤ 1
y1, y2 ≥ 0.
minimize f T x
subject to kAx + bk2 ≤ cT x + d,
minimize f T x
subject to kyk2 ≤ t
Ax + b = y
cT x + d = t.
maximize −uT b − νd
subject to kuk2 ≤ ν
AT u + νc = f.
    minimize e^{−x}
    subject to x²/y ≤ 0,

• Optimal value p* = 1
• Dual problem:
maximize 0
subject to λ ≥ 0,
• No strong duality only if both primal and dual are infeasible: p* = +∞,
  d* = −∞
• Example

    minimize x
    subject to [0; 1] x ⪯ [−1; 1]

• Dual LP

    maximize z1 − z2
    subject to z2 + 1 = 0
               z1, z2 ≥ 0
    Ax ⪯ 0, cᵀx < 0,    (1)
    Aᵀy + c = 0, y ⪰ 0,    (2)

• Consider the LP

    minimize cᵀx
    subject to Ax ⪯ 0.

• Dual of this LP

    maximize 0
    subject to Aᵀy + c = 0
               y ⪰ 0.
    minimize t
    subject to u ⪰ 0, 1ᵀu = 1
               Pᵀu ⪯ t1,

the dual function is

    g(λ, µ, ν) = { ν    if 1ᵀλ = 1, Pλ − ν1 = µ
                 { −∞   otherwise

so the dual is

    maximize ν
    subject to λ ⪰ 0, 1ᵀλ = 1, µ ⪰ 0
               Pλ − ν1 = µ

or, eliminating µ,

    maximize ν
    subject to λ ⪰ 0, 1ᵀλ = 1
               Pλ ⪰ ν1
Outline:
• SDP relaxation
• Homework hints
Variable bounds and dual feasibility
minimize f0(x)
subject to fi(x) ≤ 0, i = 1, . . . , m
li ≤ xi ≤ ui, i = 1, . . . , n
the Lagrangian is

    L(x, λ, µ, ν) = f0(x) + ∑_{i=1}^m λi fi(x) + µᵀ(x − u) + νᵀ(l − x)

we have

    ∇x L(x, λ, µ, ν) = ∇f0(x) + ∑_{i=1}^m λi ∇fi(x) + (µ − ν)

setting this to zero gives

    ν − µ = ∇f0(x) + ∑_{i=1}^m λi ∇fi(x)

and, taking the smallest nonnegative µ,

    µ = [ ∇f0(x) + ∑_{i=1}^m λi ∇fi(x) ]⁻
      = (1/2)( |∇f0(x) + ∑_{i=1}^m λi ∇fi(x)| − ∇f0(x) − ∑_{i=1}^m λi ∇fi(x) )

where | · | is componentwise

with x = (l + u)/2 and λ = 0 we can find a dual feasible point and a lower
bound on f*; we have

    ν = (1/2)( ∇f0((l + u)/2) + |∇f0((l + u)/2)| )
    µ = (1/2)( −∇f0((l + u)/2) + |∇f0((l + u)/2)| )

therefore, using

    inf_{l⪯x⪯u} ∇f0((u + l)/2)ᵀ(x − (u + l)/2) = −|∇f0((u + l)/2)|ᵀ(u − l)/2

we get

    f* ≥ f0((u + l)/2) − |∇f0((u + l)/2)|ᵀ(u − l)/2
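A sketch of this bound for a convex quadratic over a box, checked against a grid search (the quadratic data below are made up):

```python
import numpy as np

def box_lower_bound(grad_f0, f0, l, u):
    # f* >= f0(xmid) - |grad f0(xmid)|^T (u - l)/2, with xmid = (l + u)/2;
    # valid for convex f0 (first-order lower bound minimized over the box)
    xmid = (l + u) / 2
    return f0(xmid) - np.abs(grad_f0(xmid)) @ (u - l) / 2
```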
    minimize xᵀW x
    subject to xi² = 1, i = 1, . . . , n,

    L(x, ν) = xᵀW x + ∑_{i=1}^n νi(xi² − 1)
            = xᵀ(W + diag(ν))x − 1ᵀν

    maximize −1ᵀν
    subject to W + diag(ν) ⪰ 0

the optimal value of the dual is a lower bound on the optimal value of the
partitioning problem
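Any dual feasible ν gives such a lower bound; for example ν = −λmin(W)1 makes W + diag(ν) ⪰ 0, so −1ᵀν = n·λmin(W) lower-bounds the partitioning optimum. A brute-force check on a small made-up W:

```python
import itertools
import numpy as np

rng = np.random.default_rng(8)
n = 6
W = rng.standard_normal((n, n))
W = (W + W.T) / 2                       # symmetric

# dual feasible point: nu = -lambda_min(W) * 1  =>  W + diag(nu) >= 0
lam_min = np.linalg.eigvalsh(W)[0]
bound = n * lam_min                     # -1^T nu

# brute force over all x in {-1, 1}^n
best = min(np.array(x) @ W @ np.array(x)
           for x in itertools.product([-1, 1], repeat=n))
```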
since

    xᵀW x = tr(xᵀW x) = tr(W xxᵀ)

and

    (xxᵀ)ii = xi²

we can write the original problem as

    minimize tr(W X)
    subject to X ⪰ 0, rank X = 1
               Xii = 1, i = 1, . . . , n,

and, dropping the rank constraint,

    minimize tr(W X)
    subject to X ⪰ 0,
               Xii = 1, i = 1, . . . , n,

• this problem is convex (SDP) and gives a lower bound on the original
  problem

    minimize 1ᵀν
    subject to W + diag(ν) ⪰ 0

    maximize −tr(W X)
    subject to X ⪰ 0
               Xii = 1, i = 1, . . . , n
    minimize f0(x)
    subject to fi(x) ≤ 0, i = 1, . . . , m

has Lagrangian

    f0(x) + ∑_{i=1}^m λi fi(x)

but

    (∂/∂x)[ exp f0(x) + ∑_{i=1}^m λ̃i fi(x) ] |_{x=x̄}
        = exp f0(x̄) ∇f0(x̄) + ∑_{i=1}^m λ̃i ∇fi(x̄)
        = exp f0(x̄) [ ∇f0(x̄) + ∑_{i=1}^m λ̃i e^{−f0(x̄)} ∇fi(x̄) ]

so x̄ minimizes both Lagrangians when λi = λ̃i e^{−f0(x̄)}

we have the bound

    p* ≥ g(λ),

where g is the dual function of the first problem, and

    exp p* ≥ g̃(λ̃),   i.e.,   p* ≥ log g̃(λ̃)

with λi = λ̃i e^{−f0(x̄)} as above,

    log g̃(λ̃) = log( e^{f0(x̄)} + ∑_{i=1}^m e^{f0(x̄)} λi fi(x̄) )
              = f0(x̄) + log( 1 + ∑_{i=1}^m λi fi(x̄) )

    log g̃(λ̃) − g(λ) = log( 1 + ∑_{i=1}^m λi fi(x̄) ) − ∑_{i=1}^m λi fi(x̄)
variable t
square( x+y ) <= t;
square( t ) <= x - y
Outline:
• homework hints
Numerical linear algebra: factor and solve
• computation cost: f + s (factorization flops plus solve flops)
• use LU factorization, A = LU
examples setup:
we give naive but correct algorithms (in Matlab notation)
flop count:
ans: (8/3)n³ + 2n² + 2n ≈ (8/3)n³
more efficient method:
ans: val = c'*(A\b), with flop count about (2/3)n³
flop count:
ans: (8/3)n³ + 2n²m + 2nm ≈ (8/3)n³ + 2n²m
naive algorithm:
x = [A, zeros(n,n); zeros(n,n), B] \ [b; c]
flop count:
ans: (2/3)(2n)³ = (16/3)n³
flop count:
ans: (2/3)(11n)³ = (2662/3)n³
more efficient method:
ans: we use elimination of variables to get equations
where B ∈ R^{m×n}, C ∈ R^{n×m}, and n > m; also assume that the whole
matrix is nonsingular
flop count:
ans: (2/3)(n + m)³
more efficient method:
ans: we use elimination of variables to get equations

    Ax + By = c
    Dx + Ey + Fz = g
    Hy + Jz = k

The system now looks like an "arrow" system, which we can efficiently
solve by block elimination.
We know that

    [D F] [x; z] + Ey = g

then, using the expressions derived before (x = A⁻¹c − A⁻¹B y and
z = J⁻¹k − J⁻¹H y),

    [D F] ( [A⁻¹c; J⁻¹k] − [A⁻¹B; J⁻¹H] y ) + Ey = g

and therefore
• Form

    M = A⁻¹B,  n = A⁻¹c,  P = J⁻¹H,  q = J⁻¹k.

• Compute r = g − Dn − Fq.
• Compute S = E − DM − FP.
• Find

    y = S⁻¹r,  x = n − My,  z = q − Py.
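The steps above can be sketched in Python/NumPy and checked against the original three equations (block sizes and data are made up; the vector called n on the slide is named n_ in the code to avoid clashing with the dimension):

```python
import numpy as np

def solve_arrow(A, B, D, E, F, H, J, c, g, k):
    """Solve Ax + By = c, Dx + Ey + Fz = g, Hy + Jz = k by block elimination."""
    M = np.linalg.solve(A, B)
    n_ = np.linalg.solve(A, c)
    P = np.linalg.solve(J, H)
    q = np.linalg.solve(J, k)
    r = g - D @ n_ - F @ q
    S = E - D @ M - F @ P          # Schur complement in y
    y = np.linalg.solve(S, r)
    x = n_ - M @ y
    z = q - P @ y
    return x, y, z
```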
    ∇f(x) = Df(x)ᵀ,   ∇f(x)i = ∂f(x)/∂xi,  i = 1, . . . , n.

• For f(x) = (1/2)xᵀPx + qᵀx + r with P ∈ Sⁿ, the gradient is ∇f(x) = Px + q.

• Consider g : Rᵐ → R, g(x) = log ∑_{i=1}^m exp xi. Its gradient is

    ∇g(x) = (1/∑_{i=1}^m exp xi) (exp x1, . . . , exp xm)    (1)
Let f : Sn → R.
One (tedious) way to find the gradient of f is to introduce a basis for Sn,
find the gradient of the associated function, and finally translate the result
back to Sn.
Now we use the fact that ∆X is small, which implies λi are small, so to
first order we have log(1 + λi) ≈ λi.
∇f (X) = X −1.
This result should not be surprising, since the derivative of log x, on R++,
is 1/x.
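A finite-difference check of this gradient, i.e. that the directional derivative of log det at a positive definite X in a symmetric direction V equals tr(X⁻¹V) (the test data are made up):

```python
import numpy as np

rng = np.random.default_rng(10)
n = 4
Q = rng.standard_normal((n, n))
X = Q @ Q.T + n * np.eye(n)               # a positive definite point
V = rng.standard_normal((n, n))
V = (V + V.T) / 2                         # symmetric direction

f = lambda X: np.log(np.linalg.det(X))
t = 1e-6
# central-difference directional derivative of log det at X along V
num = (f(X + t * V) - f(X - t * V)) / (2 * t)
# analytic value implied by grad f(X) = X^{-1}: <X^{-1}, V> = tr(X^{-1} V)
analytic = np.trace(np.linalg.solve(X, V))
```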
    f(x) = log ∑_{i=1}^m exp(aiᵀx + bi),

    ∇f(x) = (1/(1ᵀz)) Aᵀz,   where zi = exp(aiᵀx + bi)

    ∂h(x)/∂xi = tr( Fi ∇ log det(F) ) = tr( F⁻¹ Fi ),
D = diag(-ones(N,1)) + diag(ones(N-1,1),-1);
or