• Temporal Similarity
• Total Similarity
Instantiations I: K-means
• Snapshot quality:
• History cost:
• In each k-means iteration, the new centroid lies between the centroid suggested by non-evolutionary k-means and its closest match from the previous time step.
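
The centroid update described above can be sketched as follows. This is a minimal illustration, not the paper's exact rule: it assumes a plain convex combination with a hypothetical smoothing weight `cp`, and the function name and arguments are made up for this example.

```python
import numpy as np

def evolutionary_centroid_update(current, previous, cp=0.5):
    """Blend each current centroid with its closest previous centroid.

    current:  (k, d) centroids suggested by plain k-means at this time step
    previous: (k_prev, d) centroids from the previous time step
    cp:       hypothetical smoothing weight in [0, 1]; 0 ignores history
    """
    current = np.asarray(current, dtype=float)
    previous = np.asarray(previous, dtype=float)
    # Pairwise Euclidean distances between current and previous centroids.
    dists = np.linalg.norm(current[:, None, :] - previous[None, :, :], axis=2)
    # For each current centroid, pick its nearest previous centroid.
    closest = previous[dists.argmin(axis=1)]
    # Move each current centroid part of the way toward its historical match.
    return (1.0 - cp) * current + cp * closest
```

With `cp = 0` this reduces to ordinary k-means; larger values trade snapshot quality for temporal smoothness.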
Agglomerative Clustering
• This is more complicated: we need to find the cluster similarity between two trees (T, T’).
• Snapshot quality: the sum of the qualities of all merges performed to create T.
• History cost:
• 4 greedy heuristics (skipped here):
– Squared:
Experiment Setup
• Data: photo-tag pairs from flickr.com
• Task: Cluster tags
• Two tags are similar if they both occur on the same photo
• However, the experiments in the paper do not make much sense to me
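
The tag-similarity rule above can be sketched as a co-occurrence count. This is only an illustration: using the raw count as the similarity is an assumption (the paper may normalize it), and the photo ids and tags below are made-up sample data.

```python
from collections import defaultdict
from itertools import combinations

def tag_cooccurrence(pairs):
    """Count, for each pair of tags, the photos on which both appear."""
    tags_by_photo = defaultdict(set)
    for photo, tag in pairs:
        tags_by_photo[photo].add(tag)
    counts = defaultdict(int)
    for tags in tags_by_photo.values():
        # Sort so each unordered tag pair has one canonical key.
        for a, b in combinations(sorted(tags), 2):
            counts[(a, b)] += 1
    return dict(counts)

# Hypothetical (photo, tag) pairs.
pairs = [
    ("p1", "beach"), ("p1", "sunset"),
    ("p2", "beach"), ("p2", "sunset"), ("p2", "dog"),
]
similarity = tag_cooccurrence(pairs)
```

The resulting dictionary can be turned into a symmetric similarity matrix over tags, one matrix per time step.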
Comments
• Pros:
– New problem
– Effective heuristics
– Temporal smoothness is incorporated in both the
affinity matrix and the history cost.
• Cons:
– No global solution.
– Cannot handle a change in the number of clusters.
– Experiments seem unreasonable.
Evolutionary Spectral Clustering
• The idea is almost the same, but here the focus is on spectral clustering, which preserves nice properties (global solution to a relaxed cut problem, connections to k-means).
• But the idea is presented more clearly here.
• Normalized Cut:
Check whether current cluster fits previous cluster.
• A hidden problem: it still needs to find the cluster mapping.
Negated Average Association(1)
• Similar to K-means strategy:
• As we know,
where $Z^T Z = I_k$.
So k-means is actually a special case of negated average association with a specific similarity definition.
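
A sketch of the standard identity behind this claim. The symbols are assumptions for illustration: the data points are the columns of a matrix $X$, $\tilde{Z}$ is the 0/1 cluster indicator matrix, $Z = \tilde{Z}(\tilde{Z}^T\tilde{Z})^{-1/2}$ is its normalized form, and $V_l$, $\mu_l$ are the $l$-th cluster and its mean.

```latex
\begin{aligned}
\mathrm{KM} &= \sum_{l=1}^{k} \sum_{i \in V_l} \lVert x_i - \mu_l \rVert^2
             = \operatorname{Tr}(X^T X) - \operatorname{Tr}(Z^T X^T X Z), \\
\mathrm{NA} &= \operatorname{Tr}(W) - \operatorname{Tr}(Z^T W Z),
\qquad Z^T Z = I_k .
\end{aligned}
```

Choosing the inner-product similarity $W = X^T X$ makes the two objectives coincide, so minimizing KM is the same as maximizing $\operatorname{Tr}(Z^T W Z)$.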
Normalized Cut
• Normalized cut can be represented as
• We have
Again a trace maximization problem.
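
For reference, the usual trace form of normalized cut can be written as below. The notation is assumed rather than taken from the slides: $W$ is the similarity matrix, $D$ its diagonal degree matrix, $V_l$ the $l$-th cluster, and $\tilde{Z}$ the 0/1 cluster indicator matrix.

```latex
\begin{aligned}
\mathrm{NC} &= \sum_{l=1}^{k} \frac{\operatorname{cut}(V_l,\, V \setminus V_l)}{\operatorname{vol}(V_l)}
             = k - \operatorname{Tr}\!\left(Z^T D^{-1/2} W D^{-1/2} Z\right), \\
Z &= D^{1/2} \tilde{Z} \left(\tilde{Z}^T D \tilde{Z}\right)^{-1/2},
\qquad Z^T Z = I_k .
\end{aligned}
```

Minimizing NC is therefore equivalent to maximizing $\operatorname{Tr}(Z^T D^{-1/2} W D^{-1/2} Z)$; relaxing $Z$ to any matrix with orthonormal columns gives the top-$k$ eigenvectors of $D^{-1/2} W D^{-1/2}$.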
Discussion on PCQ framework
• Very intuitive
• The historic similarity matrix is scaled and combined with the current similarity matrix.
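
The scale-and-combine step can be sketched for the negated average association case as follows. This is a minimal illustration under stated assumptions: both similarity matrices are symmetric and cover the same set of nodes, and the weight name `alpha` and the function name are hypothetical.

```python
import numpy as np

def pcq_spectral_embedding(w_now, w_prev, k, alpha=0.8):
    """Relaxed PCQ-style solution for negated average association.

    alpha weights the current snapshot; (1 - alpha) weights the history.
    Returns the combined similarity matrix and its top-k eigenvectors.
    """
    w_now = np.asarray(w_now, dtype=float)
    w_prev = np.asarray(w_prev, dtype=float)
    # Scale and combine the current and historic similarity matrices.
    combined = alpha * w_now + (1.0 - alpha) * w_prev
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    _, vecs = np.linalg.eigh(combined)
    z = vecs[:, -k:]  # top-k eigenvectors, orthonormal columns
    return combined, z
```

The rows of `z` would then be clustered (for example with k-means) to recover a discrete partition from the relaxed solution.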
Preserving Cluster Membership
• Temporal cost is measured as the difference
between current partition and historical partition.
• Use the chi-square statistic to represent the distance:
So for K-means
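
A sketch of the chi-square statistic between two partitions, assuming the common contingency-table form $\chi^2 = n\left(\sum_{i,j} |V_{ij}|^2 / (|V_i|\,|V'_j|) - 1\right)$, where $V_{ij}$ is the overlap between cluster $i$ of one partition and cluster $j$ of the other. The function name is hypothetical. Note that this statistic grows as the two partitions agree more, so a temporal cost would use its negation.

```python
import numpy as np

def chi_square_statistic(labels_a, labels_b):
    """Chi-square statistic between two partitions of the same n items."""
    labels_a = np.asarray(labels_a).ravel()
    labels_b = np.asarray(labels_b).ravel()
    n = labels_a.size
    # Map arbitrary labels to 0..k-1 indices.
    _, a_idx = np.unique(labels_a, return_inverse=True)
    _, b_idx = np.unique(labels_b, return_inverse=True)
    # Contingency table: table[i, j] = |V_ij|.
    table = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(table, (a_idx, b_idx), 1)
    row = table.sum(axis=1, keepdims=True)  # |V_i|
    col = table.sum(axis=0, keepdims=True)  # |V'_j|
    return n * ((table ** 2 / (row * col)).sum() - 1.0)
```

Because only the overlaps matter, the value does not change when cluster labels are permuted, which sidesteps the cluster-mapping problem.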
Negated Average Association(1)
• Distance:
• So
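
Assuming the distance meant here is the usual projection distance between the relaxed solutions at consecutive time steps, it can be sketched as below. $Z_t$ and $Z_{t-1}$ are assumed names for the two solutions, each with orthonormal columns.

```latex
\operatorname{dist}(Z_t, Z_{t-1})
  = \tfrac{1}{2}\,\bigl\lVert Z_t Z_t^T - Z_{t-1} Z_{t-1}^T \bigr\rVert_F^2
  = k - \operatorname{Tr}\!\left(Z_t^T Z_{t-1} Z_{t-1}^T Z_t\right)
```

The second equality follows from $Z_t^T Z_t = Z_{t-1}^T Z_{t-1} = I_k$, so under this assumption the temporal cost is again a trace term and can be folded into the same eigenvector problem as the snapshot cost.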
Negated Average Association(2)
• It can be shown that the unrelaxed
partition:
• Connection: