
Problem statement

1. What makes the clusters separate?

2. What causes overlapping? Why?

Differences between biclustering and clustering

Biclustering Clustering
Biclustering: Rows and columns are clustered simultaneously; some genes show a similar expression pattern under certain conditions.
- A gene may participate in more than one biological process, so it may belong to multiple clusters.
- In cellular processes, certain genes may be co-regulated and co-expressed only under certain experimental conditions rather than under all conditions.
Clustering:
i. General clustering divides the genes into different clusters; that is, a gene or a condition belongs to only one cluster.
ii. In addition, traditional clustering algorithms take all the rows or all the conditions into consideration.
Biclustering: Identifying subsets of genes that show a coherent pattern of expression in subsets of objects/samples can provide crucial information about active biological processes.
- Each row corresponds to a different object/sample and each column to a different feature.
Clustering: Classical clustering cannot extract information on coherent patterns of expression.
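
One way to make "coherent pattern of expression in a subset" concrete is a coherence score computed on a submatrix. The sketch below uses the mean squared residue, a measure commonly used in the biclustering literature; it is an illustration chosen here, not a method named in these notes, and the toy matrix and the function name `mean_squared_residue` are assumptions. A score near zero means the selected rows and columns follow the same additive pattern.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix selected by rows and cols."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            # Residue: what is left after removing row, column and overall effects.
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical toy data: rows are objects/samples, columns are features.
data = [
    [1.0, 2.0, 9.0, 4.0],
    [2.0, 3.0, 1.0, 5.0],
    [3.0, 4.0, 7.0, 6.0],
    [8.0, 0.0, 5.0, 2.0],
]

# Rows 0-2 restricted to columns 0, 1 and 3 differ only by a constant shift.
coherent = mean_squared_residue(data, rows=[0, 1, 2], cols=[0, 1, 3])
# The full matrix mixes that pattern with unrelated values.
whole = mean_squared_residue(data, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3])
print(coherent, whole)
```

On this toy matrix the selected submatrix scores essentially zero while the full matrix scores much higher, which is the kind of local coherence a global clustering over all rows and columns would not single out.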

Biclustering: Gene-sample modules are first constructed from gene expression and gene-gene interaction data sets.
- A subset of genes that are correlated with each other in a subset of samples is clustered, because gene aberrations differ among patients even when cancer occurs in the same organ or tissue type.
Clustering: It is difficult to address multiple relationships between genes and miRNAs.

Biclustering: In gene expression data analysis, biclustering simultaneously clusters rows and columns to find the subset of genes under certain experimental conditions.
- A gene may participate in more than one biological process, so it may belong to more than one gene cluster.
- In addition, clustering algorithms usually cluster rows or columns, which represent global patterns.
- In cellular processes, some genes may have a consistent evolution trend only in a particular set of experimental conditions.
Clustering: General clustering methods usually cluster the genes into mutually disjoint subsets, so that genes or conditions cannot belong to more than one cluster.

Biclustering: Can simultaneously group objects and features based on the co-occurrence information.
Clustering: Each object belongs to exactly one cluster.

Biclustering: Identifies subgroups of genes that show similar activity patterns under a specific subset of the experimental conditions.
Clustering: Many activation patterns are common to a group of genes only under specific experimental conditions. In fact, our general understanding of cellular processes leads us to expect subsets of genes to be co-regulated and co-expressed only under certain experimental conditions, but to behave almost independently under other conditions.

Biclustering: Effectively identifies biclusters that can represent some genes highly related to “partial” specific experimental conditions (or samples).
Clustering: Focuses only on data representing the axiality of genes or the axiality of experimental conditions (or samples) in order to identify the relationship among genes or the relationship among experimental conditions (or samples).

Biclustering: Clusters both samples and features. In co-clustering, similarity is a measure of the coherence of features (e.g. genes) and samples in a bicluster, rather than a function of feature pairs or sample pairs. Consequently, it considers the local context and is able to automatically select subsets that share similar attributes.
Clustering: A clustering method can be applied to partition the columns/rows of this matrix into different clusters such that items in one cluster have similar expression patterns. The partition of columns offers clues to potential cancer subtypes, while the partition of rows can highlight potentially relevant co-expressed genes.

Biclustering: The biclustering algorithm tries to find a subset of the genes showing similar behavior under multiple conditions.
Clustering: Partitions the set of genes into disjoint groups according to the similarity of their expression patterns over all conditions. As a result, it may fail to uncover processes that are active only over some but not all conditions.

Biclustering: A bicluster is defined as a group of genes showing similar regulation behaviour over a subset of experimental conditions. Moreover, biclustering allows overlapping biclusters, in which a gene can be involved in different regulation patterns depending on the group of conditions considered.
Clustering: Not able to correctly identify the gene clusters, because the expression profiles under all conditions are considered at the same time.

Biclustering: Clusters genes and conditions simultaneously such that we see a consistent “behavior”.
Clustering: Partitions a data set into different clusters such that elements within a cluster are more similar to each other than to those objects belonging to different clusters, according to a certain criterion.

Biclustering: Finds subgroups of genes and conditions such that a subset of conditions shows considerable homogeneity within a subset of genes.
Clustering: Sophisticated clustering algorithms group genes into biologically meaningful groups based on their expression level.

Biclustering: For example, in the gene expression data of patients with the same disease, the genes interfering with the progression of this disease should behave similarly in terms of relative expression levels on this set of patients. These types of pattern can be observed in data from nominally identical exposure to environmental effects, data from drug treatment, data representing some temporal progression, etc.
Clustering: Incapable of discovering a gene expression pattern visible in only a subset of experimental conditions. In fact, it is common for a subset of genes to be co-regulated and co-expressed under a subset of conditions but to behave independently under other conditions. Clustering assumes that genes in a cluster behave similarly over all the conditions presented in a microarray experiment.
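
The point about co-expression under only a subset of conditions can be checked on a small example. The sketch below is a minimal illustration with made-up expression values (the gene profiles, the chosen condition subset and the helper `pearson` are all assumptions): two genes move together perfectly on three conditions, yet their correlation over all six conditions is close to zero, so a similarity computed over all conditions would not group them.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical expression levels of two genes across six conditions.
gene_a = [1.0, 2.0, 3.0, 5.0, 1.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 3.0, 4.0, 2.0]

# Assumed subset of conditions under which the two genes are co-expressed.
subset = [0, 1, 2]

r_all = pearson(gene_a, gene_b)
r_subset = pearson([gene_a[c] for c in subset], [gene_b[c] for c in subset])
print(f"all conditions: {r_all:.2f}, subset only: {r_subset:.2f}")
```

A biclustering method searches for such row and column subsets jointly; here the subset is simply given, to keep the sketch short.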

Biclustering: Performs simultaneous row and column clustering. Discovers groups of genes with the same behavior under a specific group of conditions.
Clustering: The goal of this procedure is to discover local expression patterns representing subsets of genes that are coregulated and coexpressed only under certain experimental conditions.
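
The overlap question from the problem statement comes down to how membership is represented. The following sketch uses hypothetical gene and condition names (`g1`, `c1`, `B1`, `K1` and the helper `memberships` are illustrative only): a hard partition stores one label per gene, while a set of biclusters stores a gene subset and a condition subset per bicluster, so the same gene can appear in several of them.

```python
# Hypothetical toy biclusters: each pairs a gene subset with a condition subset.
biclusters = {
    "B1": ({"g1", "g2", "g3"}, {"c1", "c2"}),
    "B2": ({"g3", "g4"}, {"c3", "c4", "c5"}),
}

# A hard partition, by contrast, gives each gene exactly one cluster label.
partition = {"g1": "K1", "g2": "K1", "g3": "K1", "g4": "K2"}


def memberships(biclusters):
    """Map each gene to the sorted list of biclusters that contain it."""
    result = {}
    for name, (genes, _conditions) in biclusters.items():
        for gene in genes:
            result.setdefault(gene, []).append(name)
    return {gene: sorted(names) for gene, names in result.items()}


gene_to_biclusters = memberships(biclusters)
# Genes that belong to more than one bicluster, i.e. where overlap occurs.
overlapping = sorted(g for g, names in gene_to_biclusters.items() if len(names) > 1)
print(overlapping)
```

In this toy case only `g3` overlaps, because it is grouped with different genes under different condition subsets; the partition has no way to express that.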