Escolar Documentos
Profissional Documentos
Cultura Documentos
Abstract
Systems biology is based on a mathematized understanding of molecular biological processes. Because genetic networks are so complex, a system understanding
is required that allows for the appropriate modelling of these complex networks
and its products up to the whole-cell scale. Since 2000 standardizations in modelling and simulation techniques have been established to support the community-wide endeavors for whole-cell simulations. The development of the Systems
Biology Markup Language (SBML), in particular, has helped systems biologists
achieve their goal. This paper explores the current developments of modelling and
simulation in systems biology. It discusses the question as to whether an appropriate system understanding has been developed yet, or whether advanced software
machineries of whole-cell simulations can compensate for the lack of system understanding.
Christopher Surridge (Ed.), Nature inside: Computational Biology, in: Nature 420,
2002, 205-250, here p. 205.
Cf. Masaru Tomita, Whole-cell Simulation: A Grand Challenge of the 21st Century,
in: TRENDS in Biotechnology 19, 6, 2001, pp. 205-210.
151
H. Andersen et al. (eds.), New Challenges to Philosophy of Science,
The Philosophy of Science in a European Perspective 4,
DOI 10.1007/978-94-007-5845-2_13, Springer Science+Business Media Dordrecht 2013
152
Gabriele Gramelsberger
amount of substances inside the cell as well as the gene expression resulting from
these changes, to study the temporal patterns of change, and, finally, to conduct
experiments with the SSC, e.g. real-time gene knock-out experiments. Thus,
[t]his simple cell model sometimes shows unpredictable behavior and has delivered biologically interesting surprises. When the extracellular glucose is drained and set to be zero,
intracellular ATP momentarily increases and then decreases []. At first, this finding was
confusing. Because ATP is synthesized only by the glycolysis pathway, it was assumed
that ATP would decrease when the glucose, the only source of energy, becomes zero. After
months of checking the simulation program and the cell model for errors, the conclusion is
that this observation is correct and a rapid deprivation of glucose supplement can lead to the
same phenomenon in living cells.3
The motivation of making use of simulation in biology is the desire to predict effects of changes in cell behavior and guide further research in the lab. As metabolic
networks are extremely complex systems involving dozens and hundreds of genes,
enzymes, and other species, it is difficult to study these networks experimentally.
Therefore, in-silico studies are increasingly expanding the wet lab studies, but this
requires the full range of strategies necessary to establish a simulation-orientated
biology. These strategies are standardization, acquisition of sufficient spatiotemporal information about processes and parameters, creation of advanced software
machineries, and, last but not least, a coherent system understanding.
2.Standardization
The situation of modeling and simulation in cell biology is characterized by a wide
variety of modeling practices and methods. There are thousands of simple models around, an increasing amount of simulations for more complex models, and
various computational tools to ease modeling. Each institute, each researcher creates his or her own model with slightly different concepts and meanings. Most of
these models are not comparable with each other, because each author may use a
different modeling environment (and model representation language), [therefore]
such model definitions are often not straightforward to examine, test and reuse.4
However, in 2000 this situation led to an effort to create an open and standardized
framework for modeling the Systems Biology Markup Language (SBML). The
collaborative work on SBML was motivated by the goal to overcome the current inability to exchange models between different simulation and analysis tools
[which] has its roots in the lack of a common format for describing models.5
3
4
5
153
6
7
8
9
standard based on open workshops and an editorial team for updates, which is elected
to a 3-year non-renewable term. (Cf. http://sbml.org/).
Nature Precedings: The Systems Biology Markup Language (SBML) Collection, (accessed on 6 January 2012). URL: http://precedings.nature.com/collections/sbml. Another description language for modeling is CellML.
The example, and its notations, is taken from the initial SBML paper. Cf. Hucka et al.,
2003, loc. cit., p. 526 ff.
Ibid., p. 528.
Ibid., p. 528.
154
Gabriele Gramelsberger
tity relationships and activity flows, creating three different types of diagrams.
Furthermore, new software applications, like CellDesigner and CellPublisher, are
now built on both SBML and SBGN.
3. Quantitative data
Another important basis of preparing the stage for a simulation-orientated biology
is the acquisition of quantitative data as most simulation methods in biology are
based on differential equations, which describe the temporal development of a systems behavior. However, most data available in biology are qualitative data representing the functions of genes, pathway maps, protein interaction, etc. But for
simulation quantitative data (such as concentrations of metabolites and enzymes,
flux rates, kinetic parameters and dissociation constants) are needed. A major challenge is to develop high-throughput technologies for measurement of inner-cellular metabolites.10 Furthermore, these quantitative data have to be fine-grained.
[T]raditional biological experiments tend to measure only the change before and
after a certain event. For computational analysis, data measured at a constant time
interval are essential in addition to traditional sampling points.11 However, quantitative measurements of inner-cellular metabolites, protein synthesis, gene expression, etc. in fine-grained time series experiments are challenging experimental
biology. For instance, it is assumed that eukaryotic organisms contain between
4,000 and 20,000 metabolites. Unlike proteins or RNA, the physical and chemical
properties of these metabolites are highly divergent and there is a high proportion
of unknown analytes that is measured in metabolite profiling. Typically, in current
[Gas-chromatographymass-spectrometry] GCMS-based metabolite profiling, a
chemical structure has been unambiguously assigned in only 2030% of the analytes detected.12 Nevertheless, a typical GC-MS profile of one sample contains
300 to 500 analytes and generates a 20-megabyte data file. Quantitative data for
metabolite profiles as well as for transcript and protein profiling are usually expressed as ratios of a control sample. In addition, absolute quantification is important for understanding metabolic networks, as it is necessary for the calculation
of atomic balances or for using kinetic properties of enzymes to develop predictive
models.13 For both types of quantitative data the changes in the samples, even for
tiny influences, can be huge and it is difficult to achieve meaningful and reliable
10 Tomita 2001, loc. cit., here p. 210. Cf. Jrg Stelling, et al., Towards a Virtual Biological Laboratory, in: Hiroaki Kitano (Ed.), Foundations of Systems Biology. Cambridge
(Mass.): The MIT Press 2001, pp. 189-212.
11 Hiroaki Kitano, Systems Biology: Toward System-level Understanding of Biological
Systems, in: Hiroaki Kitano, 2001, op. cit., pp. 1-38, here p. 6.
12 Alisdair R. Fernie, et al., Metabolite Profiling: From Diagnostics to Systems Biology, in: Nature Reviews Molecular Cell Biology 5, 2004, pp. 763-769, here p. 764.
13 Fernie et al., 2004, here p. 765.
155
results. Furthermore, these results do not come from fine-grained time series, let
alone cross-category measurements of metabolites, proteins and/or mRNA from
the same sample [] to assess connectivity across different molecular entities.14
However, fine-grained quantitative information is required for simulation in
biology. The answers to this challenge are manifold. One approach is to address
the measurement problem by creating new facilities, methods, and institutes. For
instance, in Japan a new institute has been set up for this new type of simulationorientated biology, [ which] consists of three centers for metabolome research,
bioinformatics, and genome engineering, respectively.15 Or alternatively, other
simulation methods for specific purposes have to be used to deal with the lack of
kinetic information, [] for instance flux balance analysis (FBA). 16 FBA does
not require dynamic data as it analyzes the capabilities of a reconstructed metabolic network on basis of systemic mass-balance and reaction capacity constraints.
Yet flux balance analysis does not give a unique solution for the flux distribution
only an optimal distribution can be inferred.17 Another alternative could be the
estimation of unmeasurable parameters and variables by models. So-called nonlinear state-space models have been developed recently for the indirect determination of unknown parameters from measurements of other quantities. These models
take advantage of knowledge that is hidden in the system, by training the models
to learn more about themselves.
4.Whole-cell simulations
Whatever option is chosen to tackle the problems that come along with a simulation-orientated biology, whole-cell simulations show great promise. On the one
hand they are needed for data integration,18 on the other hand the ultimate goal
14 Ibid., p. 768.
15 Tomita 2001, loc. cit., here p. 210.
16 J. S. Edwards, R. U. Ibarra, B. O. Palsson, In Silico Predictions of Escherichia coli
Metabolic Capabilities are Consistent with Experimental Data, in: Nature Biotechnology 19, 2001, pp. 125130, here p. 125.
17 As a result of the incomplete set of constraints on the metabolic network (that is,
kinetic constant constraints and gene expression constraints are not considered), FBA
does not yield a unique solution for the flux distribution. Rather, FBA provides a solution space that contains all the possible steady-state flux distributions that satisfy the
applied constraints. Subject to the imposed constraints, optimal metabolic flux distributions can be determined from the set of all allowable flux distributions using linear
programming (LP). (Edwards, Ibarra, Palsson, 2001, loc cit., here p. 125).
18 [] a crucial and obvious challenge is to determine how these, often disparate and
complex, details can explain the cellular process under investigation. The ideal way
to meet this challenge is to integrate and organize the data into a predictive model.
(Boris M. Slepchenko et al., Quantitative Cell Biology with the Virtual Cell, in:
TRENDS in Cell Biology 13, 11, 2003, pp. 570-576, here p. 570). Olaf Wolkenhauer
156
Gabriele Gramelsberger
19
20
21
22
23
24
and Ursula Klingmller have expanded the definition of systems biology given in the
first paragraph of this paper by adding the integration of data, obtained from experiments at various levels and associated with the omics family of technologies. (Olaf
Wolkenhauer, Ursula Klingmller, Systems Biology: From a Buzzword to a Life Science Approach, in: BIOforum Europe 4, 2004, pp. 22-23, here p. 22).
Tomita 2001, loc. cit., here p. 210.
Masaru Tomita, Towards Computer Aided Design (CAD) of Useful Microorganisms, in: Bioinformatics 17, 12, 2001a, pp. 1091-1092.
Kouichi Takahashi et al., Computational Challenges in Cell Simulation: A Software
Engineering Approach, in: IEEE Intelligent Systems 5, 2002, pp. 64-71, here p. 64.
Cf. E-Cell: Homepage, (accessed on 6 January 2012). URL: http://www.e-cell.org/
ecell.
Virtual Cell: Homepage at the Center for Cell Analysis & Modeling, (accessed on 6
January 2012). URL: http://www.nrcam.uchc.edu/.
Cf. Gabriele Gramelsberger, Johann Feichter, Modeling the Climate System, in: Gabriele Gramelsberger, Johann Feichter (Eds.), Climate Change and Policy. The Calculability of Climate Change and the Challenge of Uncertainty. Heidelberg: Springer
2011, p. 44 ff.
157
However, the aim of this simulation approach is the creation and distribution of complex models. These collaboratively advanced software machineries are
synthesizers for interconnecting all kinds of computational schemes and strategies. Thus, complex systems are built in a bottom-up process by innumerous
computational schemes. E-Cell, for example, combines object-oriented modeling
for DNA replication, Boolean networks and stochastic algorithms for gene expression, differential-algebraic equations (DAEs)25 and FBA for metabolic pathways,
SDEs and ODEs for other cellular processes (see Tab. 1).26 These advanced integrative cell simulations provide in-silico experimental devices for hypothesis testing and predictions, but also for data integration and the engineering of de-novo
cells. In such an in-silico experimental device genes can be turned on and off, substance concentrations and metabolites can be changed, etc. Observing the behavior
of the in-silico cell yields comparative insights and can lead to the discovery of
causalities and interdependencies. Thus, computers have proven to be invaluable
in analyzing these systems, and many biologists are turning to the keyboard.27
Researchers involved in the development of the whole-cell simulator E-Cell have
outlined a computational cell biology research cycle from wet experiments forming cellular data and hypotheses, to qualitative and quantitative modeling, to programming and simulation runs, to analysis and interpretation of the results, and
back to wet experiments for evaluation purposes.28 However, this is still a vision as
biologists rarely have sufficient training in the mathematics and physics required
to build quantitative models, [therefore] modeling has been largely the purview of
theoreticians who have the appropriate training but little experience in the laboratory. This disconnection to the laboratory has limited the impact of mathematical
modeling in cell biology and, in some quarters, has even given modeling a poor
reputation.29 In particular Virtual Cell tries to overcome this by offering an intuitive modeling workspace that is abstracting and automating the mathematical and
physical operations involved in constructing models and generating simulations
from them.30
25 A DAE combines one ordinary differential equation (ODE) for each enzyme reaction,
a stochimetric matrix, and algebraic equations for constraining the system.
26 Cf. Takahashi et al., 2002, loc. cit., p. 66 ff.
27 Ibid., p. 64.
28 Cf. Ibid., p. 64 ff.
29 Leslie M. Loew, et al., The Virtual Cell Project, in: Systems Biomedicine, 2010, pp.
273-288, here p. 274.
30 Loew, et al., 2010, loc. cit., here p. 274.
158
Gabriele Gramelsberger
5.System understanding
However, the core of systems biology is a proper system understanding, which can
be articulated mathematically and modeled algorithmically. As the systems biologist Hiroaki Kitano pointed out in 2001, systems biology aims at understanding
biological systems at system level,31 referring to forerunners like Norbert Wiener
and Ludwig von Bertalanffy. For Bertalanffy a system can be defined as a complex of interacting elements.32 However, as biology uses mathematical tools and
concepts developed in physics, it has to be asked whether the applied computational schemes suit biological systems. The problem with the concept of complex
systems resulting from physics is twofold. On the one hand physical complex
systems are characterized by elements which are the same within and outside the
complex; they may therefore be obtained by means of summation of characteristics and behavior of elements as known in isolation (summative characteristic).33
On the other hand their concept of interaction exhibits an unorganized complexity,
while the main characteristic of biological systems is their organized complexity.
31 Kitano 2001, op cit., here p. xiii referring to Norbert Wiener: Cybernetics or Control
and Communication in the Animal and the Machine. New York: John Wiley & Sons
1948 and Ludwig von Bertalanffy, General System Theory. Foundations, Development, Applications. New York: Braziller 1968.
32 von Bertalanffy, 1968, op. cit., here p. 55.
33 Hiroaki Kitano, Computational Systems Biology, in: Nature 420, 2002, pp. 206-210,
here p. 54.
159
Therefore, Kitano calls biological systems symbiotic systems exhibiting coherent rather than complex behavior:
It is often said that biological systems, such as cells, are complex systems. A popular
notion of complex systems is of very large numbers of simple and identical elements interacting to produce complex behaviours. The reality of biological systems is somewhat
different. Here large numbers of functionally diverse, and frequently multifunctional, sets
of elements interact selectively and nonlinearly to produce coherent rather than complex
behaviours.
Unlike complex systems of simple elements, in which functions emerge from the properties
of the networks they form rather than from any specific element, functions in biological
systems rely on a combination of the network and the specific elements involved. [] In
this way, biological systems might be better characterized as symbiotic systems.34
160
Gabriele Gramelsberger
model of the organism, unidirectional causality and closed systems; its novelty lies in the
introduction of concepts transcending conventional physics, especially those of information
theory. Ultimately, the pair is a modern expression of the ancient antithesis of process and
structure; it will eventually have to be solved dialectically in some new synthesis.37
Following Bertalanffys ideas, the living cell and organism is not a static pattern
or machine like structure consisting of more or less permanent building materials [, but] an open system.38 While in physics systems are usually conceived
as closed ones, sometimes expanded by additional terms for energy exchange
with their environment, biological systems are open in regard to energy and matter. Therefore biological systems show characteristic principles of behavior like
steady state, equifinality, and hierarchical organization. In the steady state, the
composition of the system remains constant in spite of continuous exchange of
components. Steady states or Fliessgleichgewicht are equifinal []; i.e., the same
time-independent state may be reached from different initial conditions and in different ways much in contrast to conventional physical systems where the equilibrium state is determined by initial conditions.39 This can lead to overshoots
and false starts as a response to unstable states and external stimuli (adaptation),
because biological systems tend towards a steady state. Based on this concept the
overshoot of intercellular ATP as exhibited by the virtual self-surviving cell (see
Sect. 1) can be easily explained, while for a physicist it sounds mysteriously.
However, the basic question is: Does simulation support this new system understanding beyond the physical concept of complex systems? Can biological systems modeled based on non-summative characteristics, meaning that a complex
is not built up step by step, by putting together the first separate elements; [
but by using] constitutive characteristics [] which are dependent on the specific
relations within the complex; for understanding such characteristics we therefore
must know not only the parts, but also their relations.40 The brief overview of
modeling practices with SBML has shown that each part (species, reaction, etc.)
has to be described explicitly. However, the important aspect is the interaction
between these parts (flux). Advanced software machineries allow complex interactions and feedbacks to be organized, e.g. feedbacks, loops, alternatives (jumps),
etc. Thus, the software machineries of whole-cell simulations function as synthesizers and can be seen as media for organizing complex relations and fluxes. As
already defined by Herman Goldstine and John von Neumann in 1946, coding
begins with the drawing of the flow diagrams.41 These flow diagrams specify the
37
38
39
40
41
161
FU Institute of Philosophy
Freie Universitt Berlin
Habelschwerdter Allee 30
D-14195, Berlin
Germany
gab@zedat.fu-berlin.de
sis, Oxford: Pergamon Press 1963, pp. 80-151, here p. 100. Cf. Gabriele Gramelsberger, From Computation with Experiments to Experiments with Computation, in:
Gabriele Gramelsberger (Ed.), From Science to Computational Sciences. Studies in the
History of Computing and its Influence on Todays Sciences. Zurich: Diaphanes, pp.
131-142.
42 Goldstine, Neumann, 1947, loc. cit., here pp. 81-82.
43 Interestingly, the object-oriented programming paradigm introduced in 1967 with
Simula 67 for physical simulations was advanced for the programming language
C++, which originated in the telecommunication industry (Bell labs) in response to the
demand for more complex structures for organizing network traffic. Cf. Terry Shinn,
When is Simulation a Research-Technology? Practices, Markets and Lingua Franca,
in: Johannes Lenhard, Gnter Kppers, Terry Shinn (Eds.), Simulation: Pragmatic
Construction of Reality. Dordrecht: Springer, 2006, pp. 187-203.