
Bioinformatics Advance Access published December 21, 2004

BIOINFORMATICS
expa: a program for calculating extreme pathways in biochemical reaction networks
Steven L. Bell, Bernhard Ø. Palsson
Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA

ABSTRACT

Summary: The set of extreme pathways, a generating set for all possible steady-state flux maps in a biochemical reaction network, can be computed from the stoichiometric matrix, an incidence-like matrix reflecting the network topology. Here, we describe the implementation of a well-known algorithm to compute these pathways and give a summary of the features of the available software.

Availability: The C code, along with a Windows executable and sample network reaction files, is available at http://systemsbiology.ucsd.edu

Contact: palsson@ucsd.edu

Downloaded from http://bioinformatics.oxfordjournals.org/ by guest on April 19, 2013

INTRODUCTION

The abundance of genomic data available today allows for the construction of genome-scale metabolic networks for many organisms. The topology of the type of network considered here is determined by an $m \times n$ stoichiometric matrix, $S$, whose rows and columns represent the system's metabolites and reactions, respectively. The dynamics of the system are given by $\dot{x}(t) = Sv$, where $x$ is the $m$-dimensional vector of metabolite concentrations, the dot denotes the time derivative, and $v$ is a vector of fluxes which we assume is independent of concentrations and time. Under the assumption that the system is in steady state, we have $Sv = 0$, and to obtain biologically feasible solutions to this equation, we also impose the condition $v \ge 0$ (reversible reactions are split into two irreversible reactions, one for each direction). The set of solutions satisfying these constraints is a so-called convex cone, which can be generated by a finite number of vectors (unique up to a multiple), the extreme pathways; i.e., each biologically feasible flux vector (when the system is in steady state) can be expressed as a non-negative linear combination of these extreme pathways (5). The extreme pathways are the edges of the convex cone, or more precisely, they are conically independent, i.e., no such vector can be expressed as a non-negative linear combination of any other vectors in the cone. The properties and uses of extreme pathways have been recently reviewed (2).

DESCRIPTION OF THE ALGORITHM

Given a metabolic network, where the metabolites are represented by the nodes and the edges represent the associated reactions, we compute the extreme pathways using an algorithm presented in (5) (see also (6)). The algorithm uses matrix operations similar to those used in the well-known Gaussian elimination algorithm. For the sake of brevity, we here give a simplified version of the extreme pathway algorithm.

The algorithm may be described by a sequence of tableaux $T^0, T^1, \ldots, T^m$, where the initial tableau is given by $T^0 = [I \;\; S']$ and the final tableau by $T^m = [P \;\; 0]$. Here, $I$ is the $n \times n$ identity matrix, prime denotes transpose, $P$ is a matrix with $n$ columns whose rows are the extreme pathways, and $0$ is a matrix with $m$ columns and all entries equal to zero (determining the number of rows in the final tableau is an open problem). Converting the right-hand matrix $S'$ to the zero matrix is done column by column using elementary row operations, each tableau corresponding to a column, as follows.

For each $1 \le i \le m$, the tableau $T^i$ is obtained from $T^{i-1}$ by first choosing a pivoting column of the right-hand matrix (originating from $S'$) to zero out, column $j$, say. Suppose there are $\mathrm{pos}$ positive, $\mathrm{neg}$ negative, and $z$ zero elements in column $j$. First, the $z$ rows of $T^{i-1}$ containing a zero in column $j$ are copied to $T^i$. Then each of the $\mathrm{pos}$ rows is combined (using an elementary row operation) with each of the $\mathrm{neg}$ rows to produce a zero in column $j$ of $T^i$. More precisely, if $T^{i-1}_{s,j} > 0$ and $T^{i-1}_{t,j} < 0$ for some $s$ and $t$, then $|T^{i-1}_{t,j}|\, T^{i-1}_{s} + |T^{i-1}_{s,j}|\, T^{i-1}_{t}$ is the new row to be added to $T^i$. (Here, $T^{i-1}_{s}$ denotes the $s$th row, $T^{i-1}_{s,j}$ is the $(s, j)$-element of the tableau $T^{i-1}$, and $|x|$ is the absolute value of $x$.) Finally, only rows which are conically independent are retained in $T^i$: for any two rows $x$, $y$, if $A(x) \subseteq A(y)$, then row $x$ is deleted from the tableau, where $A(x) = \{i : x_i = 0\}$ is the set of indices of the zero components of $x$. Hence, the number of rows in $T^i$ is at most $z + \mathrm{neg} \cdot \mathrm{pos}$.
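As a rough illustration of the row operations described above, the following C sketch builds the candidate rows for one pivoting column from a dense tableau. It is not taken from the expa source: the names `combine_rows` and `tableau_step`, the dense row-major layout, and the use of `long` integer entries are assumptions made for this example, and the conical-independence filter is deliberately left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Combine a row rs (positive in column j) with a row rt (negative in
   column j) so that the result is zero in column j:
   out = |rt[j]| * rs + |rs[j]| * rt.
   Integer overflow is not guarded against in this sketch. */
void combine_rows(const long *rs, const long *rt, long *out, int ncols, int j)
{
    long a = labs(rt[j]);
    long b = labs(rs[j]);
    for (int k = 0; k < ncols; k++)
        out[k] = a * rs[k] + b * rt[k];
}

/* Build the candidate rows of the next tableau for pivot column j.
   cur  : current tableau, nrows x ncols, row-major (hypothetical layout)
   next : caller-provided buffer with room for z + pos*neg rows
   Returns the number of rows written to next.
   The conical-independence filter is NOT applied here. */
int tableau_step(const long *cur, int nrows, int ncols, int j, long *next)
{
    assert(j >= 0 && j < ncols);
    int count = 0;

    /* Step 1: copy the rows that already have a zero in column j. */
    for (int r = 0; r < nrows; r++) {
        if (cur[r * ncols + j] == 0) {
            for (int k = 0; k < ncols; k++)
                next[count * ncols + k] = cur[r * ncols + k];
            count++;
        }
    }

    /* Step 2: combine every positive row with every negative row. */
    for (int s = 0; s < nrows; s++) {
        if (cur[s * ncols + j] <= 0)
            continue;
        for (int t = 0; t < nrows; t++) {
            if (cur[t * ncols + j] >= 0)
                continue;
            combine_rows(cur + s * ncols, cur + t * ncols,
                         next + count * ncols, ncols, j);
            count++;
        }
    }
    return count;
}
```

For the one-metabolite network with $S = [1 \;\; -1]$, the initial tableau has the rows (1, 0, 1) and (0, 1, -1); calling `tableau_step` with pivot column index 2 yields the single row (1, 1, 0), i.e., both reactions carrying equal flux.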

Bioinformatics © Oxford University Press 2004; all rights reserved.

IMPLEMENTATION

The tableaux are implemented as matrices (two-dimensional arrays) using pointers to pointers as described in Appendix B of (4). This means that the rows of the matrices are not necessarily stored in contiguous locations in memory (since rows are added and deleted in each iteration, it would be inefficient to store the matrices in contiguous chunks of memory). For each iteration $i = 1, 2, \ldots, m$, two matrices are used, one containing the tableau from the previous iteration, $T^{i-1}$, and the other for constructing the next tableau, $T^i$. These matrices are in sparse form, i.e., only the non-zero elements are stored in memory.

The column indices corresponding to the non-zero elements in each row of the matrices are accounted for by bitmap representations of the matrices (similar to the ones used in (3)). These bitmaps are also implemented as matrices of the type described above, but here each row consists of $w$ words, where $w = \lceil (m + n)/\mathrm{sizeof(word)} \rceil$, i.e., each row is represented by a pointer which points to a location in memory of size $w$ words. For the following, assume (for simplicity) that the expression inside the ceiling operator in the definition of $w$ is an integer.

The conical independence check is done with logical AND bit operations using bit representations of the rows (if $x$ and $y$ are bit rows and $(\neg x[i]) \,\&\, y[i]$ is nonzero for some $0 \le i \le w - 1$, then $A(x) \not\subseteq A(y)$, where $\neg$ denotes bit negation). Each of the $\mathrm{pos} \cdot \mathrm{neg}$ candidate rows (if $\alpha$ and $\beta$ are rows of opposite signs, then a new row, $\gamma$, is constructed as $\gamma = \alpha \mid \beta$, where $\mid$ denotes bitwise OR and the operation is performed word by word) is checked against the existing rows of the next bit matrix, and conditional on the outcome of the test, its bit representation is added to the bit array and a corresponding sparse row is added to the next sparse matrix (at the start of the iteration the next matrix consists of the $z$ zero rows determined by the current column).

The pivoting column chosen in each iteration is one whose quantity $\mathrm{pos} \cdot \mathrm{neg}$ is a minimum (among the columns not yet processed).
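The bitwise containment test can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the expa code itself: the type `word_t`, the helper names, and the convention that a set bit marks a non-zero entry are chosen for the example, and the word size is measured in bits. How the bit of the pivoting column is treated after two rows are OR-ed is not stated in the text and is not modelled here.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

typedef unsigned long word_t;                  /* hypothetical word type */
#define WORD_BITS (sizeof(word_t) * CHAR_BIT)  /* bits per word */

/* Number of words needed to hold one bit per tableau column
   (ceiling of ncols / WORD_BITS). */
size_t words_per_row(size_t ncols)
{
    return (ncols + WORD_BITS - 1) / WORD_BITS;
}

/* Mark column col of a bit row as holding a non-zero entry. */
void set_nonzero(word_t *row, size_t col)
{
    assert(row != NULL);
    row[col / WORD_BITS] |= (word_t)1 << (col % WORD_BITS);
}

/* Return 1 if A(x) is contained in A(y), 0 otherwise, where A(.) is the
   set of zero positions.  A position where x is zero (bit clear) and y is
   non-zero (bit set) shows up as a set bit in ~x[i] & y[i]. */
int zero_set_contained(const word_t *x, const word_t *y, size_t w)
{
    for (size_t i = 0; i < w; i++)
        if (~x[i] & y[i])
            return 0;
    return 1;
}

/* Bit pattern of a candidate row: word-by-word OR of its two parents. */
void or_rows(const word_t *a, const word_t *b, word_t *c, size_t w)
{
    for (size_t i = 0; i < w; i++)
        c[i] = a[i] | b[i];
}
```

With this convention, a row `x` would be dropped when `zero_set_contained(x, y, w)` returns 1 for some other row `y`, and the test exits at the first word that rules containment out.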
Choosing the pivoting column with the smallest $\mathrm{pos} \cdot \mathrm{neg}$ minimizes the amount of work done when constructing the next tableau and may be thought of as a local (or greedy) optimization strategy (an interesting open problem is how to choose pivoting columns based on some global optimization criterion; see the poster from RECOMB04 on our web page for some numerical comparisons of different schemes for picking columns).

The conical independence check described above is by far the most computationally intensive part of the algorithm. To decrease the number of checks necessary, it may be possible to partition the rows into equivalence classes based on the zero index sets $A(\cdot)$ so that only a subset of the $\mathrm{neg} \cdot \mathrm{pos}$ rows need be checked for a particular candidate row (3). Such a partition is, in general, network dependent, and the number of classes must be large enough to outweigh the additional expense of implementing the partitioning scheme. We have, as of yet, not found such a scheme.

There have been other implementations of the extreme pathway algorithm described above (see, for example, (1) and (3)), but to our knowledge none where the source code has been freely available.
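The greedy choice of pivoting column described in this section can be sketched in C as below. The function name `choose_pivot`, the dense row-major tableau, and the `done` flag array are assumptions for the example only; the actual program works on sparse rows and bitmaps.

```c
#include <assert.h>
#include <stddef.h>

/* Pick the unprocessed right-hand column with the smallest pos * neg.
   T     : tableau, nrows x (n + m), row-major (hypothetical dense layout)
   n, m  : number of reactions and metabolites
   done  : done[c] != 0 if right-hand column c has already been processed
   Returns the column index in 0..m-1, or -1 if every column is done.
   Ties are broken in favour of the lowest index. */
int choose_pivot(const long *T, int nrows, int n, int m, const int *done)
{
    assert(T != NULL && done != NULL);
    int best = -1;
    long best_cost = 0;

    for (int c = 0; c < m; c++) {
        if (done[c])
            continue;

        /* Count positive and negative entries in this column. */
        long pos = 0, neg = 0;
        for (int r = 0; r < nrows; r++) {
            long v = T[r * (n + m) + n + c];
            if (v > 0)
                pos++;
            else if (v < 0)
                neg++;
        }

        /* pos * neg is the number of row combinations this column needs. */
        long cost = pos * neg;
        if (best < 0 || cost < best_cost) {
            best = c;
            best_cost = cost;
        }
    }
    return best;
}
```

Because the counts change after every iteration, a driver loop would call such a function once per tableau rather than fixing the column order in advance.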

FEATURES

The open-source software comes with a command-line interface, where the user is provided with input options and a help menu by typing expa with no arguments or expa -h, respectively. To calculate the extreme pathways, the user has the option of specifying the network topology in the form of the stoichiometric matrix or the corresponding metabolic reactions. The matrix file is an ASCII file where each row of the matrix constitutes a line (ended by a newline character) and the matrix entries are separated by white space. In addition, the user must supply the dimensions of the matrix and the number and types of the so-called exchange fluxes (see (5)). Being able to use the stoichiometric matrix as input is useful if preprocessing of the matrix is desired (for example, permuting or removing columns). The reaction file is also an ASCII file and contains all the reactions in the metabolic network, one on each line of the file. The exact form of the entries is described in the README file on our website, and several sample files are provided there as well. The extreme pathways are output to a file named Paths.txt in matrix form, where each row is a pathway.
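A single line of such a matrix file could be parsed along the following lines. The function name `parse_matrix_row` and the restriction to integer entries are assumptions for this sketch; the actual input conventions of expa are those described in its README.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse one line of a whitespace-separated matrix file into n entries.
   Returns 0 on success and -1 if the line holds fewer or more than n
   integer entries.  Integer coefficients are an assumption of this
   sketch; a fractional entry such as 0.5 is rejected. */
int parse_matrix_row(const char *line, int n, long *out)
{
    assert(line != NULL && out != NULL);
    const char *p = line;
    char *end;

    for (int k = 0; k < n; k++) {
        long v = strtol(p, &end, 10);   /* skips leading white space */
        if (end == p)
            return -1;                  /* no number found */
        out[k] = v;
        p = end;
    }

    /* Only white space may follow the last entry. */
    while (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r')
        p++;
    return *p == '\0' ? 0 : -1;
}
```

In a complete reader, each line obtained with `fgets` would be passed to this function together with the user-supplied number of columns, so that a malformed row is reported instead of silently shifting later entries.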


ACKNOWLEDGEMENT
Financial support for this work was provided by a grant from the National Institutes of Health (GM68837).

REFERENCES
[1] Klamt, S., Stelling, J., Ginkel, M., Dieter, G. (2003) FluxAnalyzer: exploring structure, pathways, and flux distributions in metabolic networks on interactive flux maps. Bioinformatics 19(2):261-269.
[2] Papin, J. A., Price, N. D., Wiback, S. J., Fell, D. A., and Palsson, B. Ø. (2003) Metabolic pathways in the post-genome era. Trends in Biochemical Sciences 28:250-258.
[3] Samatova, F. N., Geist, A., Ostrouchov, G. and Melechko, A. (2002) Parallel out-of-core algorithm for genome-scale enumeration of metabolic systemic pathways. In: Proceedings of the First IEEE Workshop on High Performance Computational Biology (HICOMB2002), Ft. Lauderdale, FL.
[4] Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P. (1992) Numerical Recipes in C, The Art of Scientific Computing, Second Edition, Cambridge University Press.
[5] Schilling, C. H., Letscher, D. and Palsson, B. Ø. (2000) Theory for the systemic definition of metabolic pathways and their use in interpreting metabolic function from a pathway-oriented perspective. Journal of Theoretical Biology 203:249-283.
[6] Schuster, R. and Schuster, S. (1993) Refined algorithm and computer program for calculating all non-negative fluxes admissible in steady states of biochemical reaction systems with or without some flux rates fixed. Computational and Applied Bioscience 9:79-85.
