Supplementary data

Association between microRNA regulation and cross-species variation of gene expression
Qinghua Cui, Zhenbao Yu, Enrico O. Purisima and Edwin Wang
Biotechnology Research Institute, National Research Council Canada, Montreal, Quebec, H4P 2R2, Canada
Corresponding Author: Wang, E. (edwin.wang@cnrc-nrc.gc.ca)
Datasets used in this study

D1. MiRNA targets and microarray data


The miRNA target datasets used in this study include a human miRNA target dataset reported by
Krek et al.1 and a Drosophila miRNA target dataset reported by Grün et al.2. The microarray gene
expression profiling datasets include an expression profile of human and chimpanzee genes in five
tissues3, a multi-tissue gene expression profile of mouse and human4, and a gene expression dataset of
two Drosophila species, D. melanogaster and D. simulans5. The human TATA-box gene dataset
reported by Tirosh et al.6 was kindly provided by Dr. Naama Barkai.
D2. Genes used to predict miRNA targets
The lists of the human and Drosophila genes used for miRNA target predictions by Krek et al.1 and
Grün et al.2, respectively, were kindly provided by Dr. Nikolaus Rajewsky. In this study, these genes
served as the “background genes” against which miRNA targets were compared when analyzing gene
expression variation across species.
Supplementary methods

M1. Microarray data analysis


For human-chimpanzee gene expression data, we used affy, an R package of the Bioconductor project,
to calculate the expression intensity values and normalize the data.
We intersected the human “background genes” used for miRNA target prediction and the human-
chimpanzee orthologs reported by Khaitovich et al.3. The gene expression profiles of the intersected
genes were then extracted. For human-mouse gene expression data, we followed the method of Liao and
Zhang7. To assign the probe sets to the currently annotated version of human and mouse Ensembl genes,
we aligned the sequences of each probe set to the Ensembl cDNA sequences using blastn. The intensity
values were calculated with the MAS5 algorithm and the data were normalized with the affy package7.
The human-mouse orthologs were also intersected with the human “background genes”. The gene
expression profiles of the intersected genes were then extracted. For the Drosophila microarray data,
the gene expression data had already been normalized by Ranz et al.5. We first defined the orthologs
between D. melanogaster and D. simulans and then intersected them with the fly “background genes”.
The gene expression profiles of the intersected genes were then extracted.
For human-chimpanzee and human-mouse data, we log2-transformed the data before calculating
cross-species gene expression variation (EV) because the gene expression data have a non-symmetric
distribution. As the Drosophila gene expression data already had an approximately symmetric
distribution, we did not apply the log2 transformation to them.
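
The gene selection and log2 transformation described in M1 can be illustrated with a minimal Python sketch. This is not the authors' pipeline (which used the affy package in R); the function names, gene identifiers, and toy intensity values below are hypothetical, and the sketch assumes strictly positive intensities.

```python
import math

def intersect_profiles(profiles, orthologs, background):
    """Keep profiles only for genes that are both orthologs and background genes."""
    keep = set(orthologs) & set(background)
    return {gene: values for gene, values in profiles.items() if gene in keep}

def log2_transform(profiles):
    """log2-transform every intensity value (assumes all values are > 0)."""
    return {gene: [math.log2(x) for x in values] for gene, values in profiles.items()}

# Hypothetical toy data: gene -> intensity per tissue
profiles = {"geneA": [8.0, 16.0], "geneB": [4.0, 2.0], "geneC": [1.0, 32.0]}
orthologs = ["geneA", "geneB", "geneC"]    # genes with a cross-species ortholog
background = ["geneA", "geneC", "geneD"]   # genes used for miRNA target prediction

selected = intersect_profiles(profiles, orthologs, background)
logged = log2_transform(selected)
```

Here only geneA and geneC survive the intersection, because geneB is not a background gene and geneD has no expression profile.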
M2. Calculation and normalization of cross-species gene expression variation
We used the method presented by Tirosh et al.6 to calculate the gene expression variation across
different species.

$$
EV_{i,j}(g) = \frac{1}{|A|} \sum_{k \in A} \left( \frac{x_i(g,k) - x_j(g,k)}{x_i(g,k) + x_j(g,k) + c} \right)^{2}
$$

where EVi,j(g) is the gene expression variation of gene g between species i and j, A is the set of
conditions being compared, xi(g,k) is the expression value of gene g in species i and condition k, and c is
a free parameter. As described by Tirosh et al.6, we log-transformed EV so that it approximately follows a
normal distribution and then standardized it to a mean of 0 and a standard deviation of 1.
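
As a minimal sketch of the calculation in M2, the Python code below computes EV for one gene as the mean, over conditions, of the squared relative difference in expression, and then log-transforms and standardizes a list of EV values. The function names and the default value of c are hypothetical; the source does not state which value of c, which logarithm base, or which standard deviation estimator was used. The base does not affect the standardized result, and the population standard deviation is used here as an assumption. The log transform requires every EV value to be greater than 0.

```python
import math
from statistics import mean, pstdev

def expression_variation(xi, xj, c=1.0):
    """EV of one gene: mean over conditions of ((xi - xj) / (xi + xj + c))**2.

    xi, xj: expression values of the gene in species i and j, one per condition.
    c: free parameter (the default 1.0 is an arbitrary, hypothetical choice).
    """
    if len(xi) != len(xj) or not xi:
        raise ValueError("xi and xj must be non-empty and of equal length")
    return sum(((a - b) / (a + b + c)) ** 2 for a, b in zip(xi, xj)) / len(xi)

def normalized_log_ev(ev_values):
    """Log-transform EV values, then standardize to mean 0 and standard deviation 1."""
    logs = [math.log(v) for v in ev_values]   # requires every EV > 0
    mu, sd = mean(logs), pstdev(logs)         # population SD: an assumption
    return [(v - mu) / sd for v in logs]

# Worked example with c = 0: ((3-1)/(3+1))**2 = 0.25 and ((1-1)/(1+1))**2 = 0,
# so EV = (0.25 + 0) / 2 = 0.125
ev_example = expression_variation([3.0, 1.0], [1.0, 1.0], c=0.0)

z_scores = normalized_log_ev([0.125, 0.5, 2.0])
```

After standardization the "Global" average EV is 0 by construction, which is why the tables report 0 for the background genes.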
M3. Analysis of the relationship between miRNA targets and cross-species gene expression variation
We used two methods to analyze the relationship between miRNA targets and cross-species gene
expression variation. To avoid the effect of sampling, we selected only the genes that were used for
miRNA target prediction as “background genes” and filtered out all other genes (see above). As a result,
10,703 of 11,548 genes in Khaitovich et al.’s dataset3, 8,031 of 8,904 genes in Su et al.’s dataset4 and
10,770 of 12,865 genes in Ranz et al.’s dataset5 were selected for this study.
In the first method, we divided the selected genes into two groups. One group contains miRNA
targets and the other group contains the genes that are not miRNA targets. We then calculated the
average cross-species expression variation (EV) of all genes in each group. We also obtained 400
experimentally validated human miRNA targets and 41 experimentally validated Drosophila miRNA
targets from TarBase (http://www.diana.pcbi.upenn.edu/tarbase.html). We repeated the analysis using
these experimentally validated targets. In addition, using the same methods, we investigated the EV
distribution of the targets of individual miRNAs in the tissues where the miRNAs were expressed.
In the second method, we ranked the genes by the cross-species EV values. We then used a window-
shift method to calculate the average EV value of the genes and the ratio of miRNA targets to the total
genes in each window. To do so, we set a window size (for example 2,000 genes) and a step size (for
example 50 genes). We calculated the average EV value and the ratio of miRNA targets to the total
2,000 genes in each window. We shifted the window by 50 genes and did the same calculations until we
reached the end.
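
The window-shift procedure can be sketched in Python as follows. The function name and the toy data are hypothetical; ranking in ascending EV order and stopping once a full window no longer fits are assumptions, since the text does not specify either detail.

```python
def window_shift(ev_by_gene, targets, window=2000, step=50):
    """Slide a fixed-size window over genes ranked by EV.

    Returns one (average EV, fraction of miRNA targets) pair per window.
    """
    ranked = sorted(ev_by_gene, key=ev_by_gene.get)   # ascending EV (assumption)
    results = []
    # Stop when a full window no longer fits (assumption)
    for start in range(0, len(ranked) - window + 1, step):
        genes = ranked[start:start + window]
        avg_ev = sum(ev_by_gene[g] for g in genes) / window
        target_ratio = sum(g in targets for g in genes) / window
        results.append((avg_ev, target_ratio))
    return results

# Hypothetical toy data: six genes, the two with the lowest EV are miRNA targets
ev_by_gene = {f"g{i}": float(i) for i in range(6)}
targets = {"g0", "g1"}
windows = window_shift(ev_by_gene, targets, window=4, step=1)
```

In this toy example the target fraction falls from 0.5 to 0 as the window moves toward genes with higher EV, which is the kind of trend the window-shift analysis is designed to reveal.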
Supplementary results

R1. Gene expression variations of the experimentally validated miRNA targets


We calculated the EV of the 400 experimentally validated human miRNA targets and the 41
experimentally validated Drosophila miRNA targets, and compared these values with those of all genes.
As shown in Supplementary Table 1, the average EV of the experimentally validated miRNA targets
is lower than that of the “background genes” in all three datasets.
R2. Cross-species gene expression variations of the miRNA targets whose miRNAs are known to be
expressed in particular tissues
We also investigated the EV distribution of the targets of specific miRNAs in the tissues where the
respective miRNAs are expressed. We focused on several human miRNAs whose tissues of expression are
well known. As shown in Supplementary Table 2, for each miRNA, the average EV of its targets
is lower than that of the “background genes”.
R3. Cross-species expression variations of “non-conserved miRNA targets”
MiRNA targets are computationally predicted based on the Watson-Crick match between a miRNA and
the 3′-UTR regions of mRNAs. In our study, the human miRNA targets that were predicted based on
3′-UTR conservation across five mammals1 can be called “conserved miRNA targets”. We
found that “conserved miRNA targets” show significantly low gene expression variation. We asked whether
this observation could be extended to “non-conserved miRNA targets”. To answer this question, we
downloaded a set of 4,337 human non-conserved miRNA targets that are co-expressed with their
miRNAs from a recent report8. A few studies have suggested that many non-conserved miRNA targets are
functional when a miRNA and its targets are co-expressed in the same tissues9-11. We performed the
same analysis using these 4,337 “non-conserved miRNA targets”. We found that these miRNA targets
also have significantly low gene expression variation (Supplementary Table 3).
R4. Relationship between human TATA box-containing genes and miRNA-targeting genes

In total, ~9,000 genes have a TATA box assignment as reported by Tirosh et al.6. We
investigated the EV distribution of TATA box-less genes and TATA box-containing genes. Consistent
with Tirosh et al.6, the EV of TATA box-containing genes is significantly
greater than that of TATA box-less genes for both human-chimpanzee and human-mouse gene
expression variation. For human-chimpanzee, the average EV of TATA box-containing genes is 0.2745,
which is much greater than that of TATA box-less genes (-0.1013; P<1.56×10⁻¹²). For the human-mouse
dataset, we obtained a similar result (0.0613 vs. -0.0452, P<0.02). We next investigated the
distribution of miRNA targets among TATA box-less genes and TATA box-containing genes. By intersecting
the TATA box-assigned genes (~9,000) with the human “background genes”, we obtained 5,978 TATA box-less
genes, 1,842 unresolved genes, and 645 TATA box-containing genes. These three groups contain
2,765, 801, and 245 miRNA targets, respectively. Therefore, 46% (2,765/5,978) of the TATA box-less
genes and 38% (245/645) of the TATA box-containing genes are miRNA targets. MiRNA targets are thus
enriched in TATA box-less genes (P<6.0×10⁻⁵, Fisher’s exact test), suggesting that the presence of a
TATA box is negatively associated with miRNA targeting.
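
The enrichment comparison can be reproduced in outline from the counts given above. The Python sketch below uses only the standard library: it computes the two target fractions and a two-sided Fisher's exact test, defined here as the sum of all hypergeometric table probabilities no larger than that of the observed table. The function names are hypothetical, and whether the authors used this two-sided definition (or any particular software) is an assumption.

```python
import math

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_p(x):
        # Hypergeometric log-probability of x in the top-left cell, margins fixed
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_comb(total, col1)

    observed = log_p(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    log_probs = [log_p(x) for x in range(lowest, highest + 1)]
    # Sum every table at least as unlikely as the observed one (small tolerance)
    return sum(math.exp(lp) for lp in log_probs if lp <= observed + 1e-7)

# Counts from the text: miRNA targets vs. non-targets in each gene group
tata_less_targets, tata_less_total = 2765, 5978
tata_targets, tata_total = 245, 645

fraction_tata_less = tata_less_targets / tata_less_total   # about 0.46
fraction_tata = tata_targets / tata_total                   # about 0.38

p_value = fisher_exact_two_sided(
    tata_less_targets, tata_less_total - tata_less_targets,
    tata_targets, tata_total - tata_targets,
)
```

The resulting `p_value` can be compared with the bound reported in the text; small differences are possible if a different two-sided definition was used.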
To investigate the possibility that the correlation between miRNA targets and EV is mediated by the
TATA box, we further analyzed the relationship between miRNA targets and EV separately in the TATA
box-containing gene group and the TATA box-less gene group. We found a negative correlation in
both the TATA box-containing group (r=-0.98, P<3.1×10⁻⁶ for human-chimpanzee, and r=-0.80, P<0.1 for
human-mouse) and the TATA box-less group (r=-0.77, P<8.0×10⁻¹⁵ for human-chimpanzee, and r=-0.23,
P<0.05 for human-mouse). This result suggests that the correlation between miRNA targeting and
cross-species gene expression variation is independent of the TATA box, although miRNA targets are
enriched in the TATA box-less gene group.
References
1 Krek, A. et al. (2005) Combinatorial microRNA target predictions. Nat. Genet. 37, 495-500
2 Grün, D. et al. (2005) microRNA target predictions across seven Drosophila species and comparison to mammalian targets. PLoS Comput. Biol. 1, e13
3 Khaitovich, P. et al. (2005) Parallel patterns of evolution in the genomes and transcriptomes of humans and chimpanzees. Science 309, 1850-1854
4 Su, A.I. et al. (2004) A gene atlas of the mouse and human protein-encoding transcriptomes. Proc. Natl. Acad. Sci. U. S. A. 101, 6062-6067
5 Ranz, J.M. et al. (2003) Sex-dependent gene expression and evolution of the Drosophila transcriptome. Science 300, 1742-1745
6 Tirosh, I. et al. (2006) A genetic signature of interspecies variations in gene expression. Nat. Genet. 38, 830-834
7 Liao, B.Y. and Zhang, J. (2006) Low rates of expression profile divergence in highly expressed genes and tissue-specific genes during mammalian evolution. Mol. Biol. Evol. 23, 1119-1128
8 Chen, K. and Rajewsky, N. (2006) Natural selection on human microRNA binding sites inferred from SNP data. Nat. Genet. 38, 1452-1456
9 Krützfeldt, J. et al. (2005) Silencing of microRNAs in vivo with 'antagomirs'. Nature 438, 685-689
10 Stark, A. et al. (2005) Animal MicroRNAs confer robustness to gene expression and have a significant impact on 3'UTR evolution. Cell 123, 1133-1146
11 Farh, K.K. et al. (2005) The widespread impact of mammalian MicroRNAs on mRNA repression and evolution. Science 310, 1817-1821

Table S1. Average cross-species gene expression variations of experimentally validated miRNA targets
EV              Human/chimpanzee   Human/mouse   Drosophila melanogaster/Drosophila simulans
MiRNA targets   -0.1395            -0.1968       -0.2882
Global          0.0                0.0           0.0
P value         0.01               0.001         0.036

Table S2. Average cross-species gene expression variations of miRNA targets of specific miRNAs in the
tissues where the respective miRNAs were expressed
MiRNA      Tissue            EV (targets/global)   Species            P value
MiR-1      Heart             -0.2068/0             Human/chimpanzee   1.57e-5
MiR-1      Heart             -0.1299/0             Human/mouse        0.02
MiR-1      Skeletal muscle   -0.1037/0             Human/mouse        0.06
MiR-122a   Liver             -0.1641/0             Human/chimpanzee   0.03
MiR-122a   Liver             -0.5421/0             Human/mouse        5.53e-5
MiR-124a   Brain             -0.2186/0             Human/chimpanzee   4.8e-8
MiR-124a   Brain             -0.1215/0             Human/mouse        0.01
MiR-125b   Brain             -0.1778/0             Human/chimpanzee   3.0e-4
MiR-125b   Brain             -0.2282/0             Human/mouse        8.0e-5
MiR-16     Thymus            -0.1126/0             Human/mouse        0.02
MiR-181a   Bone marrow       -0.9086/0             Human/mouse        0.07
MiR-181a   Thymus            -0.1437/0             Human/mouse        0.01
Note: in the human-chimpanzee dataset, only five tissues have gene expression profiles.

Table S3. Average cross-species gene expression variations of non-conserved miRNA target genes
                              Average EV         Average EV
                              Human-chimpanzee   Human-mouse
Non-conserved miRNA targets   -0.174             -0.039
Non-miRNA-targets             0.122              0.063
P value                       2.2×10⁻¹⁶          1.6×10⁻⁵

Figure S1. Cross-species gene expression variation (EV) of miRNA target genes and non-miRNA-target genes in the individual tissues. (a)
Human genes are divided into two groups, one group containing miRNA targets and the other group containing the genes that are not
miRNA targets (non-miRNA-target). The average EV of the genes in each group was calculated for each of the five tissues, brain, heart,
kidney, liver and testis. The human and chimpanzee gene expression data were from Khaitovich et al.’s report. (b) The EV between human
and mouse in each tissue was calculated using a dataset reported by Su et al., in which 26 overlapping human and mouse tissues were
present.

Figure S2. Cross-species gene expression variation (EV) of miRNA target genes for each miRNA compared to the EV of total genes. The
average cross-species gene expression variation (EV) of the targets for one miRNA was calculated and divided by the EV of the total genes.
(a) human and chimpanzee. (b) human and mouse. (c) D. melanogaster and D. simulans.

Figure S3. Cross-species gene expression variation (EV) of miRNA target genes in each Gene Ontology (GO) group. The average cross-
species EV of miRNA targets and the genes which are not miRNA targets in each GO group were calculated, respectively. All GO groups that
contain more than 200 genes were selected. (a) human and chimpanzee. (b) human and mouse.
