Escolar Documentos
Profissional Documentos
Cultura Documentos
专论与综述
在气候变化、环境污染生物清除、农田肥力保持等重要领域, 以宏基因组学为基础的系列组学
技术的应用已极大地加深了人们对科学规律的认识。
杨云锋
Abstract: To discover and characterize microbial members within their community and their
roles in the environment, high-throughput omics approaches are being developed and applied.
Among them, sequencing- and microarray-based metagenomics is the most mature key
approach, providing information basis for most other omics technologies. Metatranscriptomics,
metaproteomics and community metabolomics have had limited successes but already shown
promising potentials. All of the omics approaches lie on the support of bioinformatics, which
becomes a major bottleneck in omics applications. These new omics technologies are
revolutionizing the field of environmental microbiology in revealing the genetic potentials and
functional activities of microbial communities.
Foundation item: National Science Foundation of China (No. 41171201); State Key Joint Laboratory of Environment Simula-
tion and Pollution Control (No. 11Z03ESPCT)
*Corresponding author: Tel: 86-10-62784692; : yangyf@tsinghua.edu.cn
Received: October 25, 2012; Accepted: November 30, 2012
YANG Yun-Feng. Omics breakthroughs for environmental microbiology 19
环境微生物学的组学技术应用和突破
短标题: 环境微生物学的组学技术
杨云锋
( 100084)
摘 要: 由于研究环境变化和微生物群落的需要, 近年来高通量组学技术得到了迅猛开
发和应用。其中, 基于测序和芯片技术的宏基因组学是一个关键的、最成熟的组学技术,
为大多数的其它组学技术提供了支撑。相比较而言, 宏转录组学、宏蛋白质组学和宏代谢
组学也取得了少数的有限成功, 但已经显示出可喜的潜力。所有的组学技术都有赖于生物
信息学, 使得后者成为组学技术应用的一个主要的技术瓶颈。这些新的组学技术对环境微
生物学领域产生了革命性的影响, 极大地丰富了我们对于环境微生物基因资源和功能活
性的了解。
http://journals.im.ac.cn/wswxtbcn
20 微生物学通报 Microbiol. China 2013, Vol.40, No.1
prove the success in cultivating the uncultivable ronments and served as an indispensable informa-
microbes. tion basis for most other omics applications to as-
Microbial activity and growth are constrained sess microbial community and ecosystem function-
by the physical and chemical properties of natural ing (Fig. 1).
habitats. It has been shown that soil particle size is Large-scale metagenomics projects have
negatively correlated to the diversity and number of quickly come to depend entirely on so called the
bacteria[21]. Meanwhile, temperature is an important 2 nd -generation DNA sequencing. Roche 454 ge-
controlling factor since all of living organisms have nome sequencer was the first commercially avail-
an optimal spectrum of growth temperature. In able one among this kind, which employs a se-
addition, it has been shown that moisture, pH and quencing-by-synthesis pyrosequencing technol-
organic matter content are also often crucial in ogy[24]. The unique features of 454 pyrosequencing
shaping microbial communities[22]. include emulsion PCR thermal cycling on sepharose
or Styrofoam beads to amplify DNA fragments, the
2 Metagenomics-sequencing use of a picotiter plate (PTP) that contains more
Meta is a Greek word with the meaning of af- than one million wells per plate but can hold only
ter or beyond. Metagenomics can be defined as the one bead per well, and the release of a pyrophos-
genome research of the collective microbial assem- phate molecule when a nucleotide is incorporated
blage in environmental samples[23]. A major advan- during DNA synthesis (where the word pyrose-
tage of metagenomics is to identify novel functional quencing is derived from). Today, Roche 454 ge-
genes. It is of little debate that the technical ad- nome sequencers have two versions-GS FLX+ sys-
vancement of DNA sequencing in the past two tem and junior system with 1 M and 100 k reads
decades is the most pivotal driver that revol u- capacity, respectively. Illumina sequencers, com-
tionizes the field of environmental microbiology. To monly referred to as Solexa, were commercialized
date, sequencing-based metagenomics approaches in 2007. It uses sequencing-by-synthesis technology
have been applied to a number of different envi- coupled with bridge amplification in a flow cell[25−26].
http://journals.im.ac.cn/wswxtbcn
YANG Yun-Feng. Omics breakthroughs for environmental microbiology 21
The unique features include the linkage between Illumina and SOLiD sequencing costs 20 times less.
adaptors ligated to DNA fragments and complemen- Thus they impose an obvious advantage over 454
tary oligos attached to the interior surfaces of a flow pyrosequencing since it is unlikely for the latter to
cell, which allows DNA amplification, and the in- dramatically improve the sequencing capacity soon.
corporation and identification of fluorescent nuc- Another major drawback of 454 pyrosequencing is
leotide during DNA synthesis. There are currently the high errors in sequencing homopolymers, which
several versions. The HiSeq2500 platform can gen- are consecutive same nucleotides. In comparison,
erate up to 120 Gb of sequencing data in roughly a homopolymers are certainly less of an issue for Il-
day, whereas a low-end MiSeq platform can gener- lumina. The major advantages of Illumina and
ate 1.5−2.0 Gb per run. Unlike Roche 454 pyrose- SOLiD are the high sequencing capacity per run
quencing and Illumina technologies, the Applied and the acceptably low rate of sequencing errors.
Biosystems SOLiD sequencer uses a sequenc- The sequencing capacity is an important criterion,
ing-by-oligo-ligation technology, which targets oc- which improves the coverage depth of sequencing,
tamer DNA fragments during a DNA ligation reac- referring to one genome equivalent, and can correct
tion[27]. Another feature of SOLiD is the use of sequencing errors by additional coverage depths.
two-base encoding as an error-correction scheme to However, their main drawback is the short-read
identify miscalls. The current SOLiD platforms length, which is inclined to generate more gaps and
have two versions: the 5500 system with 100 Gb difficult to align, assign and annotate the sequences.
capacity, and an improved 5500xl system with 250 Ion torrent is a relatively new technology. Similar to
Gb capacity. Life Technologies Ion Torrent uses a 454 pyrosequencing, it delivers high read-length
post-light sequencing technology by detection of sequences, but the capacity is low compared to Il-
hydrogen ion released during DNA synthesis. The lumina and SOLiD. Notably, all of the
314, 316 and 318 series of ion chips can generate 2nd-generation DNA sequencing technologies are
10 Mb, 100 Mb and 1 Gb of sequencing data, re- prone to the errors and biases derived from DNA
spectively. Ion Proton I and II chips, which is amplication. In this regard, the 3rd-generation DNA
coupled to the Ion Proton bench top sequencer, have sequencing technologies, which are sin-
higher capacities of 1 Gb and 10 Gb, respectively. gle-molecule, non-PCR sequencing technologies,
There are pros and cons of these 2nd-generation are appealing in that it targets single cells and eli-
DNA sequencing platforms. The major advantages minates the biases introduced by template amplifi-
of Roche 454 pyrosequencing are its long read cation. In addition, they promise remarkably high
length (300–500 bp) and acceptably low rates of read-lengths (e.g. Pacific Biosciences). However,
sequencing errors. The read length is a critical fac- there are still issues such as high rates of sequenc-
tor in matching sequences to those in GenBank or ing errors, making the commercialization primitive
other databases[18] in order to generate accurate an- at the time of this manuscript writing.
notation of novel genes. To date, 454 pyrosequenc- Metagenomics is the forefront of environmen-
ing is the most commonly used platform for meta- tal microbiology research in recent years. It has
genomics. However, pyrosequencing relies on been widely applied to analyze microbial commu-
PCR-based DNA amplification, thus environmental nity compositions in a variety of environmental
contamination imposes a challenge in inhibiting samples from terrestrial, soil, marine, freshwater,
DNA amplification. Also, the sequencing capacity gut microbiota, rare and extreme habitats such as
of 454 pyrosequencing is low, thus the cost is high deep sea floors and acid mines. Among them, 454
per megabase output[28]. It is roughly estimated, pyrosequencing examining 16S rDNA amplicons
bearing in mind the caveat that the estimate may has been widely adopted by a large number of re-
change rapidly, that pyrosequencing sequencing search groups (to list a few, [30−36]). An important
costs 0.01 cent per nucleotide[29]. In comparison, point based on many of these studies is that gener-
http://journals.im.ac.cn/wswxtbcn
22 微生物学通报 Microbiol. China 2013, Vol.40, No.1
ally the microbial community compositions ap- but on whether data analysis and DNA sampling
peared to be sensitive to anthropogenic and natural methods will allow the completion of the project.
disturbance, unveiling a potential to use microbes as The cost, labor and time required for sequencing
sensitive bioindicators of environmental quality and has been reduced dramatically in the past two dec-
changes. In addition, selected functional gene am- ades, as a result of the rapid technology develop-
plicons of microbial community have also been ment, and this trend is likely to continue. Despite
targeted[37−40]. Meanwhile, Illumina and Ion Torrent serious drawbacks for de novo sequencing, 454 py-
sequencing have been successfully applied for mi- rosequencing has made significant progress in pro-
crobial community assessment using short 16S viding useful information about microbial commu-
rDNA regions[41−42]. nity[56]. However, as read length is gradually ex-
An important but often ignored issue of meta- tended, Illumina technology is likely to prove more
genomics is to extract DNA representative of total cost-effective at current pricing and output and thus
microbial diversity, which is particularly important outweigh 454 pyrosequencing in the near future. We
in light of recent recognition of the importance to predict that mass de novo sequencing of microbial
assess the rare biosphere for its functional roles[43]. community desires more advanced and low-cost
In consideration of environmental heterogeneity, it technologies than the current ones on the market.
is advisable to pool multiple samples prior to analy- However, such technologies will soon be available,
sis. The environmental matrix and contaminants are given the current rate of technology advancement.
also factors to be considered since DNA recovery is When it occurs, the burden of difficulty will proba-
highly dependent on the environmental matrix. In bly be shifted from the laboratory-based research to
soil, the strong adherence of microbial cells and the interpretation of sequence information, calling
DNA to soil matrix makes DNA recovery ineffi- for significant advances in bioinformatics and
cient[44−45]. Meanwhile, contaminants such as humic computational biology to solve algorithmic chal-
acids, polyphenols, polysaccharides and nucleases lenges in dealing with the issues of read-length, raw
can interfere with efficient DNA amplification or accuracy and throughput. Notably, the sequencing
labeling[45−48]. In general, DNA extraction should be cost is not necessarily the major cost of metage-
harsh enough to lyse a variety of microbes, but with nomics. With the astonishing rate to produce se-
the permissible degree of DNA fragmentation. A quencing data, storage and analysis costs have be-
soil grinding method with the use of liquid nitrogen come a significant part of the budget. For all
has been widely adopted[47]. Nevertheless, commer- second-generation technologies, the availability of
cial kits (e.g., MoBio Laboratories, Qiagen) can bioinformatics software is not yet optimized to the
provide acceptable DNA yields under many cases. level that makes researchers satisfactory.
Regardless of DNA extraction methods, subsequent
DNA purification is usually required. A combina- 3 Metagenomics-microarray
tion of phenol and chloroform is typical, and treat- The use of microarrays is an excellent alterna-
ment with hexadecyltrimethylammonium bromide tive to high-throughput DNA sequencing to target
(CTAB)[49−53] is often useful. Unfortunately, purifi- functional genes and phylogenetic markers. Micro-
cation steps inevitably result in a significantly de- arrays have a number of advantages in that they are
creased DNA yield and increased processing time. high-throughput, specific, quantitative, free of bi-
Occasionally, the issue of DNA purification can be ases of PCR amplification and random sampling[57],
alleviated by diluting DNA along with contami- and allow for comparison between multiple sam-
nants[54], or by addition of bovine serum albumin to ples. However, the probe design limits to existing
prevent inhibition of humic acids[55]. sequence database, thus microarrays can only detect
Based on the discussion, the choice of se- what is already known. Therefore, microarrays are
quencing technologies depends not only on the cost, commonly used for generally characterized ecosys-
http://journals.im.ac.cn/wswxtbcn
YANG Yun-Feng. Omics breakthroughs for environmental microbiology 23
http://journals.im.ac.cn/wswxtbcn
24 微生物学通报 Microbiol. China 2013, Vol.40, No.1
into account of gene variants with high homologies, have prevented the wide application of metatran-
even though many of them are unknown sequences scriptomics, despite recent efforts to address them.
in the environment. The first study of this kind successfully recovered
mRNAs encoding proteins functioning in carbon
4 Metatranscriptomics and nitrogen cycling from freshwater and marine
While mass de novo whole-genome sequenc- bacterioplanktons[77]. Using metatranscriptomics,
ing of microbial community is not yet feasible due immense metabolic diversity was unveiled in the
to the cost or technical difficulty, an alternative is to ocean since a large fraction of environmental
focus on protein-coding sequence by sequencing mRNAs was novel[78]. Similarly, a large fraction of
mRNA[72], which is called metatranscriptomics. soil mRNAs were indicative of novel genes, signi-
Metatranscriptomics randomly sequences mRNA fying the importance of exploring the unknown[79].
fragments and requires computational assembly, A milestone publication utilizing metatranscrip-
thus serves as readily accessible ‗entry point‘ for tomics came out in 2008, in which it was shown
studying microbial community[73]. It represents the that key metabolic genes such as photosynthesis,
next logical step to reach beyond genomic poten- carbon and nitrogen cycling were highly
tials of the microbial community toward in situ ac- represented in RNA pools of marine surface wa-
tivity, since gene transcription is required for func- ter[78]. Further analysis showed that a large fraction
tioning. Conversely, metatranscriptomics can be uti- of ocean water RNA pools is comprised of microbi-
lized for gene discovery[74]. To date, metatranscrip- al small RNAs[80]. Many of small RNAs were
tomics projects use 454 pyrosequencing for its rela- mapped to intergenic regions of microbial genomes
tive long read-lengths[73,75−76] because very generated from similar habitats, suggestive of po-
short-read technologies (Illumina or Solid) prevent tentially new regulatory elements. Although meta-
efficient de novo assembly of transcripts that do not transcriptomics is not yet a quantitative technology,
have a closely related template as efficient. Also, there has been effort to use transcriptomics profiling
the resolution of genes with similar sequences is for comparative study. More transcripts of photo-
more challenging for Illumina or Solid technolo- synthesis, carbon metabolism and oxidative phos-
gies. However, given the advantage in sequence phorylation were present in the surface ocean mi-
output, very short read sequencing technologies crobial community for the day samples than the
(Illumina or Solid) will provide sufficient depth for night ones, and Cyanobacteria transcripts were ab-
effective assembly in many genes, which is impor- undant throughout the day/night cycles[81]. In soil,
tant since it allows better sequence assembly as well metatranscriptomics was used to profile the com-
as coverage of rare mRNA sequences. In this re- munity diversity of forest soil eukaryotic micro-
gard, it is also likely that there is a shift from 454 bes[79]. An interesting result was that the taxonomic
pyrosequencing to Illumina, since its read-length distribution from metatranscriptomics and metage-
continues to be improved. Regardless, the cost of nomics did not coincide. This suggested that the
metatranscriptomics is likely to fall quickly in the omics profiling was highly incomplete, signifying
near future, thanks to the advancement of sequenc- the importance to develop and utilize sequencing
ing technologies to analyze large numbers of mi- technologies with higher capacity. Similarly, a more
crobial cDNAs quickly and cheaply. recent report showed that more than half of meta-
Recovering mRNA from natural environments transcriptomics sequences from beech or spruce
is challenging because of mRNA instability and the forest soil were previously unknown sequences[82].
need to separate mRNAs from more abundant Among them, the researchers have identified a nov-
rRNAs and tRNAs. In addition, mRNAs of house- el family of oligopeptide transporters of fungi and
keeping genes are of little interest for many studies, provided experimental evidence that it is function-
thus removing them is also useful. These challenges al[83]. This success indicated the potential of meta-
http://journals.im.ac.cn/wswxtbcn
YANG Yun-Feng. Omics breakthroughs for environmental microbiology 25
transcriptomics in gene discovery and biotechno- tected. A subsequent study unveiled interpopulation
logical applications. recombination since strain-specific genome infor-
mation is available for the dominant species[88],
5 Metaproteomics which was confirmed by extensive
It is well understood that metagenomics and semi-quantitative protein profiling[89]. The virtue of
metatranscriptomics are not sufficient to understand proteogenomics was also demonstrated by the iden-
the activity and physiology of microbial communi- tification of 45%−65% of the predicted proteins of
ty, calling for the analysis of environmental samples the dominant species in acid mines[90]. Also, pro-
at the protein level. Metaproteomics, which analyz- teogenomics was carried out in study microbial
es protein profiles of microbial communities, can community of plant leaves[91], leading to the identi-
address the function more directly than metage- fication of protein profiling o predominant species
of microbial community (Sphingomonas, Methylo-
nomics and metatranscriptomics since proteins are
bacterium and Pseudomonas). Similarly, microbial
responsible for performing function and activities.
community in the termite hindgut or sheep rumen
To date, there are a variety of applications such as
was recently analyzed by proteogenomics[74,92],
protein cataloging, comparative and quantitative
which revealed the presence of a large number of
proteomics, post-translational modifications, pro-
cellulolytic and xylanolytic enzymes. Other pro-
tein-protein interactions and genotyping, allowing
teomics applications include a comparative meta-
for the establishment of linkages between gene and
proteome study to examine the mechanisms of en-
function in microbial communities.
hanced biological phosphorus removal (EBPR), a
A state-of-arts, standard metaproteomics expe- process of interest to a wastewater treatment
riment usually includes protein extraction and puri- plant[93−94]. The results indicated distinct differences
fication from an environmental sample, followed by in protein profiles and revealed metabolic pathways
enzymatic digestion of proteins, MS and MS/MS essential for EBPR in uncultured dominant micro-
analysis, and protein identification based on refer- bial species of the EBPR reactor. In highly complex
ence database searches. The reference protein data- environments, microbial communities in lake water,
bases are typically based on genomics or metage- soil or aquifier sediments were analyzed, entailing
nomics project. Thus, proteins cannot be resolved if the broad applicability of proteomics[95−97].
their nucleotide sequences are not available or an- However, several needs should be fulfilled for
notated. Conversely, proteomic data can contribute further development of proteomics. The protein ex-
to the refinement of sequencing annotations[84]. traction methods need to be developed to deal with
Thus, the combination of metaproteomics and me- organic and metal contaminants in environmental
tagenomics is very helpful in the analysis of envi- samples. The sensitivity and accuracy of mass spec-
ronmental samples. Indeed, a term proteogenomics trometer is also crucial for improvement in protein
has been coined to indicate the simultaneous per- coverage. Meanwhile, quantitative metaproteomics
formance of genomics and proteomics[85−86]. is urgently needed for comparisons between sam-
Although metaproteomics is still in its infancy, ples, in addition to the need of software tools to
there are excellent studies using large-scale overcome current challenges in data analysis. Suc-
MS/MS-based metaproteomics, or proteogenomics cesses in these issues will be important to advance
in some cases. A prominent example is the protein our understanding of earth‘s biogeochemical cycles
profiling of natural biofilms in an acid mine drai- in which microbial communities play important
nage[87], which are low complexity models due to roles and harness proteins with biological activities
the extreme conditions. More than 2 000 proteins desirable for biotechnological applications.
from the top five abundant species were sampled to
saturation (up to 48% protein coverage), whereas as
6 Community metabolomics
low as 1% of the total population can also be de- Community metabolomics is the large-scale
http://journals.im.ac.cn/wswxtbcn
26 微生物学通报 Microbiol. China 2013, Vol.40, No.1
metabolite profiling in a given environmental sam- of gut microflora was observed on mammalian
ple. Metabolites, which are small chemical mole- blood metabolites, suggesting that microbes directly
cules, are required for the building blocks, energy affected the metabolism capacity of human
consumption and signal communication of microbi- body[103]. This finding was of great interest for ex-
al community, providing a direct description of the ploring microbiome toward improving human
physiological and communication status[98]. There- health. However, the application of metabolomics
fore, metabolite profiling is important for the com- for studying microbial community is still in an early
prehensive investigation of microbial community. stage. It is thereby important to develop more quan-
However, since metabolites vary in molecular titative tools, and integrate metabolomics with data
weight, size, solubility and other physical and sets from the other ―omics‖ to improve data quality
chemical properties, it imposes a challenge on and provide novel insights than a single ―omics‖
comprehensive detection and measurement of me- technology alone can offer.
tabolites. In addition, as other omics approaches,
7 Bioinformatics
another challenge is to extract and interpret the vast
amount of information from community metabo- Omics typically produce very large datasets
lomics, entailing the need for bioinformatics tool covering a large number of previously undescribed
development and validation. genomes. Therefore, a major challenge in omics
Community metabolomics utilizes analytic applications in environmental microbiology lies in
chemistry tools, such as nuclear magnetic resonance handling and interpretation of high-density data.
spectroscopy and chromatographic separation tech- Bioinformatics is the application of informatics
niques coupled to mass spectrometry[99]. Gas chro- tools to biological data, including high-throughput
matography coupled to mass spectrometry (GC-MS), sequence quality scoring, alignment, assembly,
which targets volatile metabolites either naturally or computation of relative abundance, data access and
via chemical derivatisation, has been widely used comparison across various platforms.
for community metabolomics. It enables the profil- Raw data of high-throughput Omics experi-
ing of hundreds of low-molecular-weight metabo- ments typically contain a high level of inherent
lites of up to 1 000 Da, such as amino acids, fatty noise, which has a variety of context-dependent er-
acids and sugars, in a single run[100]. The coupled ror distributions. Bioinformatics will be important
mass analyzers include low-resolution quadrupole to resolve this issue. However, to date there is no
instruments, or fast-scanning time of flight (TOF) appropriate software that can assemble de novo ge-
MS. In contrast, liquid chromatography coupled to nomes from short reads of less than 50 bp[18].
mass spectrometry (LC-MS) can detect and quanti- Therefore, reconstruction of microbial genomes
fy higher molecular mass metabolites and do not from highly diverse microbial community is not
require derivatization. Another metabolomics tech- available. Given the non-exhaustive nature of omics
nique, nuclear magnetic resonance (NMR), is also dataset, sequence assembly will be of limited bene-
popular since it detects a wide range of metabo- fit and prone to high errors of chimeric assembly.
lites[101] and is fast, cheap, noninvasive and quantit- Regardless, there has been recent effort to develop
ative. However, the sensitivity of NMR is relatively new assembly tools[105−107].
low. As any of these techniques has limitations in There appears to be a huge need of sequence
detecting the huge metabolic diversity and inherent alignment, annotation and function prediction by
analytical errors, a multifaceted approach with comparing DNA reads against previously known
combinations of GC-MS, LC-MS and NMR should gene products and exploring the taxonomical con-
be considered. tent[108−110]. Notably, the predictive power of the
There have been several studies of community sequence annotation is limited by known gene func-
metabolomics[102−104]. For example, a strong effect tions in public databases, thus the utility of se-
http://journals.im.ac.cn/wswxtbcn
YANG Yun-Feng. Omics breakthroughs for environmental microbiology 27
quence annotation is affected by bias caused by in- classification and metabolic pathway reconstruction
accurate annotation in the databases. BLAST (Basic of metagenomics sequences. mg-RAST, the ab-
Local Alignment Search Tool)[111] is an essential breviation of the metagenomics RAST server[122],
program to compare sequence to NCBI DNA and has been used to reconstruct metabolic pathways of
protein databases. Until recently, CLUSTALW has several different microbial communities[123].
been the most widely used program for sequence Meanwhile, the stand alone program MEGAN can
alignment[112], but several improved alternatives, rapidly provide phylogenetic classification, func-
such as T-COFFEE and PROBCONS, have been tional assessments and visual outputs[109].
developed[113]. However, BLAST is not adequate for The importance of setting up consensus guide-
handling short-read sequences, calling for new al- lines of metadata MIGS (Minimum Information
gorithms tailed for short-reads. An increasing num- about a Genome Sequence) has been accepted in the
ber of alignment algorithms have thus been devel- bioinformatics community, which might simulate
oped to fulfill the need[114]. MAQ37 and SHRiMP the most successful MIAME (Minimum Informa-
take sequence quality in account during sequence tion About a Microarray Experiment) guidelines for
alignment. Other algorithms use accelerated microarray experiments[124]. MIGS should include
processing to make gapped alignment more effi- environmental data of climate, land management
cient[114−115]. and environmental measurements and consistent
An emerging, exciting development of bioin- sequence reads format styles for multiple samples,
formatics is to generate co-occurrence patterns in which are useful for comparison of different meta-
association networks based on high density omics genomics projects and large-scale re-analysis with
data, which is robust to the typically high level of novel bioinformatics ideas.
inherent noise in omics data, and help unveil inte-
raction within microbial community and topological 8 Conclusions
details of microbial community structure[116]. Fur- In summary, the vast numbers and great diver-
thermore, environmental parameters can also be sity of microbial community, tethered with the he-
incorporated to establish linkages between microbi- terogeneity of the environment, impose major chal-
al community and natural environments. Using a lenges. Omics breakthroughs have greatly facili-
random matrix theory-based algorithm, association tated the understanding of the interactions within
networks were reconstructed from both sequencing- microbial communities and with their environment,
and microarray-based data to examine long-time albeit inherent biases for any of the approaches.
effects of elevated carbon dioxide[117−119]. The re- Metagenomics is the most mature and widely used
sults showed that the networks possessed general omics tool. Representing functional potentials, me-
topological features of complex systems and ele- tagenomics reveals the range of potential activities
vated carbon dioxide clearly altered the network in microbial communities. Meanwhile, metatran-
interactions of microbial phylogenetic groups or scriptomics has been used to indicate the gene ac-
functional genes. Similarly, association networks tivity since mRNA is abundant only in live cells. In
were reconstructed from sequencing-based meta- contrast, the development and application of meta-
genomics data collected from ocean time-series site proteomics and community metabolomics are more
of southern California, revealing the dynamics of relevant in reflecting microbial physiology and in-
microbial communities over time and possible teraction within microbial communities and with
‗keystone‘ species[120−121]. environment. Although there are many difficulties
As most environmental microbiologists are not in extracting and quantifying protein and metabolite
professional bioinformaticians, it is necessary to components, there have been efforts in addressing
develop user-friendly automated omics ―pipelines‖ these issues and the coupling of genomic and pro-
that allow for automated annotation, phylogenetic teomic approaches[87,95].
http://journals.im.ac.cn/wswxtbcn
28 微生物学通报 Microbiol. China 2013, Vol.40, No.1
The use of omics for microbial communities is [8] Zengler K, Toledo G, Rappé M, et al. Cultivating
still in the infancy but likely to become more the uncultured[J]. Proceedings of the National
prominent when the cost falls and the necessary Academy of Sciences of the United States of
bioinformatics tools are in place. In addition, paral- American, 2002, 99(24): 15681−15686.
lel applications of metagenomics, metatranscrip- [9] Ingham CJ, Sprenkels A, Bomer J, et al. The
tomics, metaproteomics and community metabo- micro-Petri dish, a million-well growth chip for the
lomics will allow for simultaneous study of micro- culture and high-throughput screening of
bial community at different layers. These methods microorganisms[J]. Proceedings of the National
hold great promise in revealing the overall picture Academy of Sciences of the United States of
of microbial diversity and activity, as well as rare American, 2007, 104(46): 18217−18222.
spheres of previously unknown and uncharacterized [10] Ben-Dov E, Kramarsky-Winter E, Kushmaro A. An
microbes but possibly with functions that are im- in situ method for cultivating microorganisms using
portant for the community. Now it is only an expo- a double encapsulation technique[J]. FEMS
sition for an exciting era of environmental microbi- Microbiology Ecology, 2009, 68(3): 363−371.
ology. [11] Nunes da Rocha U, Van Overbeek L, Van Elsas JD.
Exploration of hitherto-uncultured bacteria from
参 考 文 献 the rhizosphere[J]. FEMS Microbiology Ecology,
2009, 69(3): 313−328.
[1] Hjort K, Bergström M, Adesina MF, et al. Chitinase
[12] Nichols D, Cahoon N, Trakhtenberg EM, et al. Use
genes revealed and compared in bacterial isolates,
of ichip for high-throughput in situ cultivation of
DNA extracts and a metagenomic library from a
―uncultivable‖ microbial species[J]. Applied and
phytopathogen-suppressive soil[J]. FEMS
Environmental Microbiology, 2010, 76(8):
Microbiology Ecology, 2010, 71(2): 197−207.
2445−2450.
[2] Liu N, Yan X, Zhang ML, et al. Microbiome of
[13] Hugenholtz P, Goebel BM, Pace NR. Impact of
fungus-growing termites: a new reservoir for
culture-independent studies on the emerging
lignocellulase genes[J]. Applied and Environmental
phylogenetic view of bacterial diversity[J]. Journal
Microbiology, 2011, 77(1): 48−56.
[3] Lombard N, Prestat E, Van Elsas JD, et al. of Bacteriology, 1998, 180(18): 4765−4774.
Soil-specific limitations for access and analysis of [14] Curtis TP, Sloan WT. Exploring microbial
soil microbial communities by metagenomics[J]. diversity—A vast below[J]. Science, 2005,
FEMS Microbiology Ecology, 2011, 78(1): 31−49. 309(5739): 1331−1333.
[4] Stockdale EA, Brookes PC. Detection and [15] Gans J, Wolinsky M, Dunbar J. Computational
quantification of the soil microbial improvements reveal great bacterial diversity and
biomass-impacts on the management of agricultural high metal toxicity in soil[J]. Science, 2005,
soils[J]. The Journal of Agricultural Science, 2006, 309(5739): 1387−1390.
144(4): 285−302. [16] Schloss PD, Handelsman J. Toward a census of
[5] Falkowski PG, Fenchel T, Delong EF. The microbial bacteria in soil[J]. PLoS Computational Biology,
engines that drive Earth‘s biogeochemical cycles[J]. 2006, 2(7): e92.
Science, 2008, 320(5879): 1034−1039. [17] Fierer N, Breitbart M, Nulton J, et al. Metagenomic
[6] Amann RI, Ludwig W, Schleifer KH. Phylogenetic and small-subunit rRNA analyses reveal the genetic
identification and in situ detection of individual diversity of bacteria, archaea, fungi, and viruses in
microbial cells without cultivation[J]. soil[J]. Applied and Environmental Microbiology,
Microbiological Review, 1995, 59(1): 143−169. 2007, 73(21): 7059−7066.
[7] Janssen PH. Dormant microbes: scouting ahead or [18] Wommack KE, Bhavsar J, Ravel J. Metagenomics:
plodding along?[J]. Nature, 2009, 458(7240): read length matters[J]. Applied and Environmental
831−831. Microbiology, 2008, 74(5): 1453−1463.
http://journals.im.ac.cn/wswxtbcn
YANG Yun-Feng. Omics breakthroughs for environmental microbiology 29
[19] Burgess JG, Jordan EM, Bregu M, et al. Microbial [29] Hudson ME. Sequencing breakthroughs for
antagonism: a neglected avenue of natural products genomic ecology and evolutionary biology[J].
research[J]. Progress in Industrial Microbiology, Molecular Ecology Resources, 2008, 8(1): 3−17.
1999, 35: 27−32. [30] Sogin ML, Morrison HG, Huber JA, et al. Microbial
[20] Garbeva P, de Boer W. Inter-specific interactions diversity in the deep sea and the underexplored
between carbon-limited soil bacteria affect behavior ―rare biosphere‖[J]. Proceedings of the National
and gene expression[J]. Microbial Ecology, 2009, Academy of Sciences of the United States of
58(1): 36−46. America, 2006, 103(32): 12115−12120.
[21] Sessitsch A, Weilharter A, Gerzabek MH, et al. [31] Roesch LFW, Fulthorpe RR, Riva A, et al.
Microbial population structures in soil particle size Pyrosequencing enumerates and contrasts soil
fractions of a long-term fertilizer field microbial diversity[J]. The ISME Journal, 2007,
experiment[J]. Applied and Environmental 1(4): 283−290.
Microbiology, 2001, 67(9): 4215−4224. [32] Breitbart M, Hoare A, Nitti A, et al. Metagenomic
[22] Hassink J, Bouwman L, Zwart KB, et al. and stable isotopic analyses of modern freshwater
Relationships between soil texture, physical microbialites in Cuatro Cienegas, Mexico[J].
protection of organic matter, soil biota, and C and N Environmental Microbiology, 2008, 11(1): 16−34.
mineralization in grassland soils[J]. Geoderma, [33] Murphy EF, Cotter PD, Healy S, et al. Composition
1993, 57(1/2): 105−128. and energy harvesting capacity of the gut
[23] Handelsman J, Rondon MR, Brady SF, et al. microbiota: relationship to diet, obesity and time in
Molecular biological access to the chemistry of mouse models[J]. Gut, 2010, 59(12): 1635−1642.
unknown soil microbes: a new frontier for natural [34] Rousk J, Bååth E, Brookes PC, et al. Soil bacterial
products[J]. Chemistry and Biology, 1998, 5(10): and fungal communities across a pH gradient in an
R245−R249. arable soil[J]. The ISME Journal, 2010, 4(10):
[24] Margulies M, Egholm M, Altman WE, et al. 1340−1351.
Genome sequencing in microfabricated [35] Nacke H, Thürmer A, Wollherr A, W et al.
high-density picolitre reactors[J]. Nature, 2005, Pyrosequencing-based assessment of bacterial
437(7057): 376−380. community structure along different management
[25] Adessi C, Matton G, Ayala G, et al. Solid phase types in German forest and grassland soils[J]. PLoS
DNA amplification: characterisation of primer One, 2011, 6(2): e17000.
attachment and amplification mechanisms[J]. [36] Serino M, Luche E, Gres S, et al. Metabolic
Nucleic Acids Research, 2000, 28(20): e87. adaptation to a high-fat diet is associated with a
[26] Fedurco M, Romieu A, Williams S, et al. BTA, a change in the gut microbiota[J]. Gut, 2012, 61(4):
novel reagent for DNA attachment on glass and 543−553.
efficient generation of solid-phase amplified DNA [37] Leininger S, Urich T, Schloter M, et al. Archaea
colonies[J]. Nucleic Acids Research, 2006, 34(3): predominate among ammonia-oxidizing
e22. prokaryotes in soils[J]. Nature, 2006, 442(7104):
[27] Mardis ER. The impact of next-generation 806−809.
sequencing technology on genetics[J]. Trends in [38] Mou XZ, Sun SL, Edwards RA, et al. Bacterial
Genetics, 2008, 24(3): 133−141. carbon processing by generalist species in the
[28] Claesson MJ, Wang Q, O'Sullivan O, et al. coastal ocean[J]. Nature, 2008, 451(7179):
Comparison of two next-generation sequencing 708−711.
technologies for resolving highly complex [39] Pegard A, Miquel C, Valentini A, et al. Universal
microbiota composition using tandem variable 16S DNA-based methods for assessing the diet of
rRNA gene regions[J]. Nucleic Acids Research, grazing livestock and wildlife from feces[J].
2010, 38(22): e200. Journal of Agricultural and Food Chemistry, 2009,
http://journals.im.ac.cn/wswxtbcn
30 微生物学通报 Microbiol. China 2013, Vol.40, No.1
http://journals.im.ac.cn/wswxtbcn
YANG Yun-Feng. Omics breakthroughs for environmental microbiology 31
[60] Metfies K, Medlin LK. Feasibility of transferring America, 2008, 105(22): 7768−7773.
fluorescent in situ hybridization probes to an 18S [70] Liu WZ, Wang AJ, Cheng SA, et al. Geochip-based
rRNA gene phylochip and mapping of signal functional gene analysis of anodophilic
intensities[J]. Applied and Environmental communities in microbial electrolysis cells under
Microbiology, 2008, 74(9): 2814−2821. different operational modes[J]. Environmental
[61] He ZL, Deng Y, Van Nostrand JD, et al. GeoChip Science and Technology, 2010, 44(19): 7729−7735.
3.0 as a high-throughput tool for analyzing [71] He ZL, Van Nostrand JD, Deng Y, et al.
microbial community composition, structure and Development and applications of functional gene
functional activity[J]. The ISME Journal, 2010, microarrays in the analysis of the functional
4(9): 1167−1179. diversity, composition, and structure of microbial
[62] Hazen TC, Dubinsky EA, DeSantis TZ, et al. communities[J]. Frontiers of Environmental Science
Deep-sea oil plume enriches indigenous and Engineering in China, 2011, 5(1): 1−20.
oil-degrading bacteria[J]. Science, 2010, 330(6001): [72] Adams MD, Kelley JM, Gocayne JD, et al.
204−208. Complementary DNA sequencing: expressed
[63] Beazley MJ, Martinez RJ, Rajan S, et al. Microbial sequence tags and human genome project[J].
community analysis of a coastal salt marsh affected Science, 1991, 252(5013): 1651−1656.
by the Deepwater Horizon Oil Spill[J]. PLoS One, [73] Toth AL, Varala K, Newman TC, et al. Wasp gene
2012, 7(7): e41305. expression supports an evolutionary link between
[64] Wakelin SA, Barratt BIP, Gerard E, et al. Shifts in maternal behavior and eusociality[J]. Science, 2007,
the phylogenetic structure and functional capacity 318(5849): 441−444.
of soil microbial communities follow alteration of [74] Warnecke F, Hess M. A perspective:
native tussock grassland ecosystems[J]. Soil metatranscriptomics as a tool for the discovery of
Biology and Biochemistry, 2012, doi: novel biocatalysts[J]. Journal of Biotechnology,
10.1016/j.soilbio.2012.07.003. 2009, 142(1): 91−95.
[65] Van Nostrand JD, Wu WM, Wu LY, et al. [75] Cheung F, Haas BJ, Goldberg SMD, et al.
GeoChip-based analysis of functional microbial Sequencing Medicago truncatula expressed
communities during the reoxidation of a bioreduced sequenced tags using 454 Life Sciences
uranium-contaminated aquifer[J]. Environmental technology[J]. BMC Genomics, 2006, 7(1): 272.
Microbiology, 2009, 11(10): 2611−2626. [76] Emrich SJ, Barbazuk WB, Li L, et al. Gene
[66] Lu Z, Deng Y, Van Nostrand JD, et al. Microbial discovery and annotation using LCM-454
gene functions enriched in the Deepwater Horizon transcriptome sequencing[J]. Genome Research,
deep-sea oil plume[J]. The ISME Journal, 2012, 2007, 17(1): 69−73.
6(2): 451−460. [77] Poretsky RS, Bano N, Buchan A, et al. Analysis of
[67] Zhou JZ, Xue K, Xie JP, et al. Microbial mediation microbial gene transcripts in environmental
of carbon-cycle feedbacks to climate warming[J]. samples[J]. Applied and Environmental
Nature Climate Change, 2012, 2(2): 106−110. Microbiology, 2005, 71(7): 4121−4126.
[68] Yang YF, Wu LW, Lin QY, et al. Responses of the [78] Frias-Lopez J, Shi YM, Tyson GW, et al. Microbial
functional structure of soil microbial community to community gene expression in ocean surface
livestock grazing in the Tibetan alpine grassland[J]. waters[J]. Proceedings of the National Academy of
Global Change Biology, Accepted, doi: 10.1111/ Sciences of the United States of America, 2008,
gcb.12065. 105(10): 3805−3810.
[69] Zhou JZ, Kang S, Schadt CW, et al. Spatial scaling [79] Bailly J, Fraissinet-Tachet L, Verner MC, et al. Soil
of functional gene diversity across various eukaryotic functional diversity, a metatranscriptomic
microbial taxa[J]. Proceedings of the National approach[J]. The ISME Journal, 2007, 1(7):
Academy of Sciences of the United States of 632−642.
http://journals.im.ac.cn/wswxtbcn
32 微生物学通报 Microbiol. China 2013, Vol.40, No.1
[80] Shi YM, Tyson GW, DeLong EF. Metatranscriptomics Community genomic and proteomic analyses of
reveals unique microbial small RNAs in the ocean‘s chemoautotrophic iron-oxidizing ―Leptospirillum
water column[J]. Nature, 2009, 459(7244): 266−269. rubarum‖ (group II) and ―Leptospirillum
[81] Poretsky RS, Hewson I, Sun SL, et al. Comparative ferrodiazotrophum‖(group III) bacteria in acid mine
day/night metatranscriptomic analysis of microbial drainage biofilms[J]. Applied and Environmental
communities in the North Pacific subtropical Microbiology, 2009, 75(3): 4599−4615.
gyre[J]. Environmental Microbiology, 2009, 11(6): [91] Delmotte N, Knief C, Chaffron S, et al. Community
1358−1375. proteogenomics reveals insights into the physiology
[82] Damon C, Lehembre F, Oger-Desfeux C, et al. of phyllosphere bacteria[J]. Proceedings of the
Metatranscriptomics reveals the diversity of genes National Academy of Sciences of the United States
expressed by eukaryotes in forest soils[J]. PLoS of America, 2009, 106(38): 16428−16433.
One, 2012, 7(1): e28967. [92] Toyoda A, Iio W, Mitsumori M, et al. Isolation and
[83] Damon C, Vallon L, Zimmermann S, et al. A novel identification of cellulose-binding proteins from
fungal family of oligopeptide transporters identified sheep rumen contents[J]. Applied and Environmental
by functional metatranscriptomics of soil Microbiology, 2009, 75(6): 1667−1673.
eukaryotes[J]. The ISME Journal, 2011, 5(12): [93] Wilmes P, Bond PL. The application of
1871−1880. two-dimensional polyacrylamide gel electrophoresis
[84] Armengaud J. A perfect genome annotation is and downstream analyses to a mixed community of
within reach with the proteomics and genomics prokaryotic microorganisms[J]. Environmental
alliance[J]. Current Opinion in Microbiology, 2009, Microbiology, 2004, 6(9): 911−920.
12(3): 292−300. [94] Martín HG, Ivanova N, Kunin V, et al.
[85] Wilkins MJ, VerBerkmoes NC, Williams KH, et al. Metagenomic analysis of two enhanced biological
Proteogenomic monitoring of Geobacter phosphorus removal (EBPR) sludge communities[J].
physiology during stimulated uranium Nature Biotechnology, 2006, 24(10): 1263−1269.
bioremediation[J]. Applied and Environmental [95] Schulze WX, Gleixner G, Kaiser K, et al. A
Microbiology, 2009, 75(20): 6591−6599. proteomic fingerprint of dissolved organic carbon
[86] Denef VJ, Kalnejais LH, Mueller RS, et al. and of soil particles[J]. Oecologia, 2005, 142(3):
Proteogenomic basis for ecological divergence of 335−343.
closely related bacteria in natural acidophilic [96] Benndorf D, Balcke GU, Harms H, et al. Functional
microbial communities[J]. Proceedings of the metaproteome analysis of protein extracts from
National Academy of Sciences of the United States contaminated soil and groundwater[J]. The ISME
of America, 2010, 107(6): 2383−2390. Journal, 2007, 1(3): 224−234.
[87] Ram RJ, VerBerkmoes NC, Thelen MP, et al. [97] Benndorf D, Vogt C, Jehmlich N, et al. Improving
Community proteomics of a natural microbial protein extraction and separation methods for
biofilm[J]. Science, 2005, 308(5730): 1915−1920. investigating the metaproteome of anaerobic
[88] Lo I, Denef VJ, VerBerkmoes NC, et al. benzene communities within sediments[J].
Strain-resolved community proteomics reveals Biodegradation, 2009, 20(6): 737−750.
recombining genomes of acidophilic bacteria[J]. [98] Gieger C, Geistlinger L, Altmaier E, et al. Genetics
Nature, 2007, 446(7135): 537−541. meets metabolomics: a genome-wide association
[89] Denef VJ, VerBerkmoes NC, Shah MB, et al. study of metabolite profiles in human serum[J].
Proteomics-inferred genome typing (PIGT) PLoS Genetics, 2008, 4(11): e1000282.
demonstrates inter-population recombination as a [99] Roessner U, Beckles DM. Metabolite
strategy for environmental adaptation[J]. measurements[A] // Schwender J. Plant Metabolic
Environmental Microbiology, 2008, 11(2): 313−325. Networks[M]. New York: Springer, 2009: 39−69.
[90] Goltsman DSA, Denef VJ, Singer SW, et al. [100] Jacobs A, Lunde C, Bacic A, et al. The impact of
http://journals.im.ac.cn/wswxtbcn
YANG Yun-Feng. Omics breakthroughs for environmental microbiology 33
http://journals.im.ac.cn/wswxtbcn