Introduction

Chapter content:
- Probabilistic PCA
- Kernel PCA
- Nonlinear Latent Variable Models
Notations

We will denote by:
- D the dimensionality of the data space
- N the number of observations

Sample mean:

    x̄ = (1/N) Σ_{n=1}^{N} x_n                                        (1)

Sample covariance matrix:

    S = (1/N) Σ_{n=1}^{N} (x_n − x̄)(x_n − x̄)^T                       (2)

Idea of PCA: maximize the projected variance u_1^T S u_1 with respect
to u_1 under the normalization constraint u_1^T u_1 = 1.
Caroline Bernard-Michel & Hervé Jégou
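As a minimal numpy sketch of this maximization (the toy data and names are illustrative, not from the slides), the direction maximizing u^T S u under u^T u = 1 is the leading eigenvector of S:

```python
import numpy as np

# Toy data (illustrative): N = 500 points in D = 3 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) @ np.diag([3.0, 1.0, 0.3])

# Sample mean and covariance S = (1/N) sum (x_n - xbar)(x_n - xbar)^T
xbar = X.mean(axis=0)
Xc = X - xbar
S = Xc.T @ Xc / len(X)

# The maximizer of u^T S u subject to u^T u = 1 is the leading
# eigenvector of S, and the attained variance is the top eigenvalue.
eigvals, eigvecs = np.linalg.eigh(S)   # ascending eigenvalues
u1 = eigvecs[:, -1]
projected_variance = u1 @ S @ u1       # equals eigvals[-1]
```

The Lagrange-multiplier condition S u_1 = λ_1 u_1 is exactly the eigenvalue equation solved by `eigh`.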
Minimum-error formulation

Each point can be written exactly in a complete orthonormal basis {u_i}:

    x_n = Σ_{i=1}^{D} α_{ni} u_i   where α_{ni} = x_n^T u_i           (5)

x_n can be approximated by

    x̃_n = Σ_{i=1}^{M} z_{ni} u_i + Σ_{i=M+1}^{D} b_i u_i             (6)

Minimizing the distortion

    J = (1/N) Σ_{n=1}^{N} ||x_n − x̃_n||²                             (7)

gives z_{ni} = x_n^T u_i and b_i = x̄^T u_i, so that the residual error
depends only on the discarded directions:

    J = Σ_{i=M+1}^{D} u_i^T S u_i                                     (8)

The optimal approximation is therefore

    x̃_n = Σ_{i=1}^{M} (x_n^T u_i) u_i + Σ_{i=M+1}^{D} (x̄^T u_i) u_i  (9)
        = x̄ + Σ_{i=1}^{M} (x_n^T u_i − x̄^T u_i) u_i                  (10)
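A short numpy sketch of the reconstruction formula and the residual identity J = Σ_{i>M} λ_i (toy data; names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # toy data
N, D = X.shape
M = 2                                   # retained components

xbar = X.mean(axis=0)
S = (X - xbar).T @ (X - xbar) / N
lam, U = np.linalg.eigh(S)
lam, U = lam[::-1], U[:, ::-1]          # sort eigenvalues descending

# x~_n = xbar + sum_{i=1}^{M} (x_n^T u_i - xbar^T u_i) u_i
Z = (X - xbar) @ U[:, :M]               # coefficients z_ni
X_tilde = xbar + Z @ U[:, :M].T

# Distortion J = (1/N) sum_n ||x_n - x~_n||^2 = sum_{i>M} lambda_i
J = np.mean(np.sum((X - X_tilde) ** 2, axis=1))
```

Since the residual lies entirely in the discarded eigendirections, `J` matches the sum of the trailing eigenvalues.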
Example: handwritten digits
- Individuals = images
- Variables = grey levels of each pixel (784)

[Figure: (a)–(b) eigenvalue spectrum and cumulative distortion; the mean digit is shown]

Reconstruction from the first M principal components:

    x̃_n = x̄ + Σ_{i=1}^{M} (x_n^T u_i − x̄^T u_i) u_i                 (11)

[Figure: original digit and its reconstructions for increasing M]

Visualization: projection of the oil flow data onto the first two
principal components. Three geometrical configurations of the oil,
water and gas phases: stratified, annular, and homogeneous.
PCA for high-dimensional data

When N < D, it is cheaper to solve the N × N eigenproblem of
(1/N) X X^T: if v_i is its i-th unit eigenvector with eigenvalue λ_i,
the corresponding unit eigenvector of S is

    u_i = (1/(N λ_i))^{1/2} X^T v_i                                   (12)
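A sketch of this trick (assuming X stores the centered data points as rows; sizes are illustrative): the N × N eigenproblem replaces the D × D one, and (12) recovers a unit-norm eigenvector of S.

```python
import numpy as np

# N = 20 samples in D = 1000 dimensions: S is 1000 x 1000, but we only
# need the N x N matrix (1/N) X X^T.
rng = np.random.default_rng(2)
N, D = 20, 1000
X = rng.normal(size=(N, D))
X -= X.mean(axis=0)                   # rows hold x_n - xbar

K = X @ X.T / N                       # N x N instead of D x D
lam, V = np.linalg.eigh(K)
v1, lam1 = V[:, -1], lam[-1]          # leading eigenpair

# u_i = (1 / (N lambda_i))^{1/2} X^T v_i  -- unit-norm eigenvector of S
u1 = X.T @ v1 / np.sqrt(N * lam1)

S = X.T @ X / N
```

One can verify directly that S u1 = λ_1 u1 and ||u1|| = 1.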
Probabilistic PCA

Advantages:
- ...

Model: a latent variable z with Gaussian prior p(z) = N(z | 0, I) and

    p(x | z) = N(x | W z + μ, σ² I)                                   (13)

where
- W is a D × M matrix
- μ is the D-dimensional mean
- σ² is the isotropic noise variance
Maximum likelihood solution

The ML estimate of the mean is μ_ML = x̄                               (17)

The log-likelihood is

    ln p(X | μ, W, σ²) = −(N/2){ D ln(2π) + ln|C| + Tr(C^{-1} S) }    (18)

with C = W W^T + σ² I. Maximizing with respect to W gives

    W_ML = U_M (L_M − σ² I)^{1/2} R                                   (19)

where U_M holds the M leading eigenvectors of S, L_M = diag(λ_1, …, λ_M),
and R is an arbitrary M × M orthogonal matrix. For the noise variance:

    σ²_ML = (1/(D − M)) Σ_{i=M+1}^{D} λ_i                             (20)

The posterior over the latent variable is

    p(z | x) = N(z | M^{-1} W^T (x − μ), σ² M^{-1})                   (21)

where M = W^T W + σ² I. The mean is given by

    E(z | x) = M^{-1} W_ML^T (x − x̄)                                  (22)
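The closed-form solution (17)–(20) can be sketched directly in numpy (a minimal sketch: R is taken to be the identity, and the toy data are illustrative):

```python
import numpy as np

def ppca_ml(X, M):
    """Closed-form ML fit of probabilistic PCA (sketch; takes R = I)."""
    N, D = X.shape
    xbar = X.mean(axis=0)
    S = (X - xbar).T @ (X - xbar) / N
    lam, U = np.linalg.eigh(S)
    lam, U = lam[::-1], U[:, ::-1]           # descending eigenvalues
    sigma2 = lam[M:].mean()                  # (20): mean discarded variance
    W = U[:, :M] @ np.diag(np.sqrt(lam[:M] - sigma2))   # (19) with R = I
    return xbar, W, sigma2

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 4)) @ np.diag([4.0, 2.0, 0.5, 0.5])
xbar, W, sigma2 = ppca_ml(X, M=2)
C = W @ W.T + sigma2 * np.eye(4)   # model covariance C = W W^T + sigma^2 I
```

By construction, C has the top M sample eigenvalues in the retained directions and σ² in all discarded ones.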
EM for probabilistic PCA

Expected complete-data log-likelihood:

    E[ln p(X, Z | μ, W, σ²)] = −Σ_{n=1}^{N} { (D/2) ln(2πσ²)
        + (1/2) Tr(E[z_n z_n^T])
        + (1/(2σ²)) ||x_n − x̄||² − (1/σ²) E[z_n]^T W^T (x_n − x̄)
        + (1/(2σ²)) Tr(E[z_n z_n^T] W^T W) }                          (24)

with (E-step)

    E[z_n] = M^{-1} W^T (x_n − x̄)
    E[z_n z_n^T] = σ² M^{-1} + E[z_n] E[z_n]^T

M-step:

    W_new = [ Σ_{n=1}^{N} (x_n − x̄) E[z_n]^T ] [ Σ_{n=1}^{N} E[z_n z_n^T] ]^{-1}   (25)

    σ²_new = (1/(ND)) Σ_{n=1}^{N} { ||x_n − x̄||² − 2 E[z_n]^T W_new^T (x_n − x̄)
        + Tr(E[z_n z_n^T] W_new^T W_new) }                            (26)

[Figure: panels (a)–(f) illustrating successive E and M steps]
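The E and M steps above translate into a compact numpy loop (a sketch under illustrative initialization and toy data; function name and sizes are assumptions):

```python
import numpy as np

def ppca_em(X, M, n_iter=100, seed=0):
    """EM for probabilistic PCA (a sketch of the updates above)."""
    N, D = X.shape
    xbar = X.mean(axis=0)
    Xc = X - xbar
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(D, M))
    sigma2 = 1.0
    for _ in range(n_iter):
        # E-step: posterior moments, with M-matrix W^T W + sigma^2 I
        Minv = np.linalg.inv(W.T @ W + sigma2 * np.eye(M))
        Ez = Xc @ W @ Minv                    # row n holds E[z_n]
        Ezz = N * sigma2 * Minv + Ez.T @ Ez   # sum_n E[z_n z_n^T]
        # M-step: updates (25) and (26)
        W = (Xc.T @ Ez) @ np.linalg.inv(Ezz)
        sigma2 = (np.sum(Xc ** 2)
                  - 2.0 * np.sum(Ez * (Xc @ W))
                  + np.trace(Ezz @ W.T @ W)) / (N * D)
    return xbar, W, sigma2

rng = np.random.default_rng(6)
X = rng.normal(size=(200, 3)) @ np.diag([3.0, 1.0, 0.3])
xbar, W, sigma2 = ppca_em(X, M=1)
```

Each sweep costs O(NDM) plus an M × M inverse, which is why EM is attractive when D is large and M small.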
Bayesian PCA

An independent Gaussian prior is placed over each column w_i of W:

    p(W | α) = Π_{i=1}^{M} (α_i / 2π)^{D/2} exp{ −(α_i / 2) w_i^T w_i }   (28)

The M-step update for W becomes

    W_new = [ Σ_{n=1}^{N} (x_n − x̄) E[z_n]^T ] [ Σ_{n=1}^{N} E[z_n z_n^T] + σ² A ]^{-1}   (30)

and the hyperparameters are re-estimated as

    α_i = D / (w_i^T w_i)                                             (31)

with A = diag(α_i).

Example: 300 points in dimension D sampled from a Gaussian distribution
having M = 3 directions with larger variance.
Factor analysis
- Model: p(x | z) = N(x | W z + μ, Ψ), with Ψ a diagonal D × D matrix   (64)

For PCA, a rotation of the data space gives the same fit with W rotated
by the same matrix. For factor analysis, the analogous property is:
component-wise re-scaling of the data is absorbed into the re-scaling
elements of Ψ.

The ML estimate of the mean is μ = x̄, as in probabilistic PCA          (65)

E-step:

    E[z_n] = G W^T Ψ^{-1} (x_n − x̄)                                   (66)
    E[z_n z_n^T] = G + E[z_n] E[z_n]^T                                 (67)

where G = (I + W^T Ψ^{-1} W)^{-1}.

M-step:

    W_new = [ Σ_{n=1}^{N} (x_n − x̄) E[z_n]^T ] [ Σ_{n=1}^{N} E[z_n z_n^T] ]^{-1}   (69)

    Ψ_new = diag{ S − W_new (1/N) Σ_{n=1}^{N} E[z_n] (x_n − x̄)^T }    (70)
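These E/M steps can be sketched in numpy much like the probabilistic-PCA loop (a sketch; the clipping of Ψ is a numerical guard I have added, and data/sizes are illustrative):

```python
import numpy as np

def factor_analysis_em(X, M, n_iter=200, seed=0):
    """EM for factor analysis (a sketch of the E/M steps above)."""
    N, D = X.shape
    xbar = X.mean(axis=0)
    Xc = X - xbar
    S = Xc.T @ Xc / N
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(D, M))
    Psi = np.ones(D)                          # diagonal elements of Psi
    for _ in range(n_iter):
        # E-step: G = (I + W^T Psi^-1 W)^-1
        WPinv = W.T / Psi                     # W^T Psi^{-1}
        G = np.linalg.inv(np.eye(M) + WPinv @ W)
        Ez = Xc @ WPinv.T @ G.T               # row n holds E[z_n]
        Ezz = N * G + Ez.T @ Ez               # sum_n E[z_n z_n^T]
        # M-step: updates (69) and (70), with a small numerical floor
        W = (Xc.T @ Ez) @ np.linalg.inv(Ezz)
        Psi = np.maximum(np.diag(S - W @ (Ez.T @ Xc) / N), 1e-8)
    return xbar, W, Psi

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 5)) @ np.diag([3.0, 2.0, 1.0, 0.5, 0.5])
xbar, W, Psi = factor_analysis_em(X, M=2)
```

The only structural difference from probabilistic PCA is the per-dimension noise vector Ψ in place of a single σ².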
Kernel PCA

Applying the idea of kernel substitution (see Chapter 5) to PCA.

[Figure: linear PCA in the original space (x1, x2) versus PCA in the feature space; first principal direction v1]

Assume for now that Σ_n φ(x_n) = 0. The M × M sample covariance matrix
C in feature space (M the dimensionality of the feature space) is

    C = (1/N) Σ_{n=1}^{N} φ(x_n) φ(x_n)^T                             (73)

Its eigenvectors, satisfying

    C v_i = λ_i v_i                                                   (74)

can be expanded as

    v_i = Σ_{n=1}^{N} a_{in} φ(x_n)                                   (76)
Kernel PCA (continued)

Substituting the expansion into C v_i = λ_i v_i and expressing everything
through the kernel matrix K, with K_nm = k(x_n, x_m) = φ(x_n)^T φ(x_m),
gives (note: typo in (12.78))

    K² a_i = λ_i N K a_i                                              (79)

whose non-degenerate solutions satisfy

    K a_i = λ_i N a_i                                                 (80)

with the normalization (from v_i^T v_i = 1)

    λ_i N a_i^T a_i = 1                                               (81)
Kernel PCA (continued)

The projection of a point x onto eigenvector i is obtained purely in
terms of the kernel function:

    y_i(x) = φ(x)^T v_i = Σ_{n=1}^{N} a_{in} k(x, x_n)                (82)

Remarks:
- ...
Kernel PCA: centering in feature space

In general Σ_n φ(x_n) ≠ 0, and the features cannot be centred
explicitly. Work instead with

    φ̃(x_n) = φ(x_n) − (1/N) Σ_{l=1}^{N} φ(x_l)                        (83)

The corresponding kernel matrix is

    K̃_nm = k(x_n, x_m) − (1/N) Σ_{l=1}^{N} k(x_l, x_m)
          − (1/N) Σ_{l=1}^{N} k(x_n, x_l)
          + (1/N²) Σ_{j=1}^{N} Σ_{l=1}^{N} k(x_j, x_l)                (84)

i.e., in matrix notation

    K̃ = K − 1_N K − K 1_N + 1_N K 1_N                                 (85)

where 1_N denotes the N × N matrix in which every element equals 1/N.
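Putting the pieces together, here is a sketch of kernel PCA with the Gram-matrix centering (85) and normalization (81) (the RBF kernel, its bandwidth `gamma`, and the toy data are assumptions for illustration):

```python
import numpy as np

def kernel_pca(X, M, gamma=1.0):
    """Kernel PCA with an RBF kernel (gamma is an assumed bandwidth)."""
    N = len(X)
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * sq)
    # Centre in feature space: K~ = K - 1N K - K 1N + 1N K 1N   (85)
    one = np.full((N, N), 1.0 / N)
    Kt = K - one @ K - K @ one + one @ K @ one
    mu, A = np.linalg.eigh(Kt)             # Kt a_i = mu_i a_i, mu_i = N lambda_i
    mu, A = mu[::-1][:M], A[:, ::-1][:, :M]
    # Normalize so that lambda_i N a_i^T a_i = 1 (unit-norm v_i):
    # scale each unit eigenvector by 1/sqrt(mu_i).
    A = A / np.sqrt(mu)
    # Training-point projections y_i(x_n) = sum_m a_im k~(x_n, x_m)   (82)
    return Kt @ A

rng = np.random.default_rng(4)
X = rng.normal(size=(50, 2))
Y = kernel_pca(X, M=2)
```

The first column of `Y` carries at least as much (feature-space) variance as the second, mirroring ordinary PCA.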
Independent component analysis

The latent distribution is assumed to factorize into independent
components:

    p(z) = Π_{j=1}^{M} p(z_j)                                         (86)

Setup:
- ...

A common choice of heavy-tailed latent distribution is

    p(z_j) = 1 / (π cosh(z_j)) = 2 / (π (e^{z_j} + e^{−z_j}))         (90)

[Figure: plots of the latent distributions]
Autoassociative neural networks

[Figure: two-layer network mapping inputs x_1 … x_D through M hidden units z_1 … z_M back to D outputs]

The network is trained to reproduce its input at the output, minimizing

    E(w) = (1/2) Σ_{n=1}^{N} ||y(x_n, w) − x_n||²                     (91)

[Figure: four-layer autoassociative network with nonlinear mappings F_1 and F_2; the data are projected onto a nonlinear manifold S]
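A minimal sketch of such a network in the linear case (data, sizes, learning rate and step count are illustrative assumptions; no biases):

```python
import numpy as np

# Linear autoassociative network with M = 2 hidden units, trained by
# gradient descent on E(w) = 1/2 sum_n ||y(x_n, w) - x_n||^2.
rng = np.random.default_rng(5)
X = rng.normal(size=(200, 4)) @ np.diag([3.0, 2.0, 0.2, 0.1])
X -= X.mean(axis=0)
D, M, lr = 4, 2, 1e-4

W1 = 0.1 * rng.normal(size=(D, M))   # encoder: z = x W1
W2 = 0.1 * rng.normal(size=(M, D))   # decoder: y = z W2
E0 = 0.5 * np.sum((X @ W1 @ W2 - X) ** 2)
for _ in range(2000):
    Z = X @ W1
    R = Z @ W2 - X                    # residuals y - x
    g2 = Z.T @ R                      # dE/dW2
    g1 = X.T @ (R @ W2.T)             # dE/dW1
    W1 -= lr * g1
    W2 -= lr * g2
E = 0.5 * np.sum((X @ W1 @ W2 - X) ** 2)
```

With linear units, the trained network projects onto the same subspace as the first M principal components, so this recovers standard PCA rather than a nonlinear manifold.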
Principal curves:
- ...