^{1}Department of Health Sciences, Liverpool Hope University, Taggart Avenue, Liverpool, L16 1JD, United Kingdom
^{2}Institute for Ageing and Health, Newcastle University, Ageing Research Laboratories, Campus for Ageing and Vitality, Newcastle upon Tyne, NE4 5PL, United Kingdom
^{3}Molecular Gastroenterology Research Group, Academic Unit of Surgical Oncology, Department of Oncology, University of Sheffield, Beech Hill Road, Sheffield, S10 2RX, United Kingdom
^{4}Institute for Global Food Security, Queen’s University Belfast, David Keir Building, Stranmillis Road, Belfast BT9 5AG, United Kingdom
^{5}Faculty of Health and Social Care, Edge Hill University, St Helens Road, Ormskirk, Lancashire L39 4QP, United Kingdom
Received date: June 18, 2013; Accepted date: September 30, 2013; Published date: October 07, 2013
Citation: Mc Auley MT, Proctor CJ, Corfe BM, Cuskelly GCJ, Mooney KM (2013) Nutrition Research and the Impact of Computational Systems Biology. J Comput Sci Syst Biol 6:271-285. doi:10.4172/jcsb.1000122
Copyright: © 2013 Mc Auley MT, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
The value of computational modelling in improving our understanding of complex nutrient-based pathways is becoming increasingly recognised. This is due to the integral role that computer modelling is playing within the multidisciplinary field of systems biology, where in silico quantitative simulations are being used to complement more traditional wet-laboratory investigations. A large number of computational models are accessible via the Biomodels database, an archive of openly available peer-reviewed models of biological systems. Moreover, there has been an explosion in the availability of free modelling software tools that can be used to assemble and simulate the dynamic behaviour of nutrient-mediated systems. Computational modelling will continue to play an increasingly significant role in nutrition research. Thus, it is important that freely accessible models and resources relevant to nutrition research are highlighted. In response to these needs, we firstly examined the Biomodels database to identify and categorise nutrition-themed models. The analysis revealed 163 nutrition-themed models. These models are mainly cellular in nature, with intracellular representations of calcium oscillations the most common. Secondly, a generic nutrition-centred modelling framework was used to explore recent advances, data repositories and software relevant to model building. We conclude this paper by using our review findings to discuss areas of nutrition that could further exploit the potential of computational modelling in the future.
Computational modelling; Systems biology; Nutrition; Systems modelling
BN: Bayesian Network; Cell-ML: Cell Markup Language; GUI: Graphical User Interface; FBA: Flux Balance Analysis; FFM: Fat-Free Mass; MFA: Metabolic Flux Analysis; MPA: Metabolic Pathway Analysis; MPI: Message Passing Interface; ODE: Ordinary Differential Equation; ROS: Reactive Oxygen Species; SBGN: Systems Biology Graphical Notation; SBML: Systems Biology Markup Language; SED-ML: Simulation Experiment Description Markup Language
In general, nutritionists attempt to comprehend and predict the physiological outcomes of complex metabolic pathways using conventional wet-laboratory techniques. However, it has been recognised that to appreciate fully the interactions involved in the nutrient-mediated circuitry of cells, tissues and organ systems, integrative approaches are needed [1]. Systems biology is an integrative approach that utilises and builds on the myriad of quantitative information generated from diverse sources [2], including genomics [3], metabolomics [4] and proteomics [5] data. It is anticipated that this integrative approach will further elucidate both the molecular and biochemical interactions that take place within cells. Moreover, it is suggested that these findings will lead to an improved understanding of how cellular dynamics influence the behaviour of tissues and ultimately the health of whole-organ systems [6]. The systems biology approach is already having an impact on nutrition research, as computer models are being used to represent nutrient-centred pathways. Computational systems models of nutrient-centred pathways are utilized for a number of reasons. Firstly, models are capable of quantitatively describing and analysing how important molecular and biochemical components interact [7]. Secondly, nutrient-based biochemical systems contain diverse components that give rise to overlapping metabolic networks comprising numerous interactions [8-10]. Many of these interactions are non-linear in nature and can involve complex feedback and feed-forward loops [11-13]. Thus, it is challenging, and even unfeasible, to reason about these by human intuition alone. Computer modelling offers an alternative means of handling this complexity. Moreover, computational modelling is also beginning to facilitate the representation of nutrient-based systems in both a holistic and multiscale manner [14]. This contrasts with the reductionist approach that focuses on a particular dietary component and how this interacts with a single isolated metabolic system.
Traditional in vivo or in vitro techniques can also be limited when testing a hypothesis, as such approaches can be resource-intensive, expensive, time-consuming, impractical and potentially unethical. For instance, studies examining nutrient toxicity or deficiency often depend on animals [15,16] and recent guidelines suggest reducing the number of animals used in such experiments [17]. Computational systems modelling has been proposed as an alternative to using animals in these studies [17]. Finally, global demographic changes are placing an imperative on nutrition research to develop innovative strategies that improve healthy ageing [18,19]. Due to the timeframes involved when studying ageing, computational models are being used to study the long-term effects of diet [14]. Moreover, computer models are providing a way to represent age-related pathologies, such as cardiovascular disease [20] and dementia [21].
As a result of the progress in nutrient-centred systems modelling, it is important that freely available models, approaches, software and data resources relevant to nutrient-focused modelling are highlighted. In response to these needs, the aim of this paper is to discuss the progress in systems modelling and its relevance to nutrition research. Firstly, we will briefly introduce the main modelling approaches. We then highlight openly accessible nutrient-themed models archived within the Biomodels database (http://www.ebi.ac.uk/biomodels-main/), a repository for published peer-reviewed models [22]. These models are encoded in the Systems Biology Markup Language (SBML) (http://sbml.org/Main_Page), a format used for model exchange [23]. We recognise that several formats exist for exchanging computational models, including the Cell Markup Language (Cell-ML) [24] and the recently developed Simulation Experiment Description Markup Language (SED-ML) [25], which also supports SBML. However, we focus on SBML because it is currently the leading exchange format in systems biology. This is emphasized by the results of a search of PubMed and Biomodels for models or tools published in each year since SBML was launched in 2003. The search terms SBML/systems biology markup language were used to search for publications archived in PubMed. Biomodels was then used to crosscheck for models not revealed by the PubMed search. Figure 1 provides a summary of this search, highlighting the annual growth in both SBML-supported models and tools since 2003. In the final part of this review, a generic modelling framework is used to explore some recent advances, data repositories and software tools relevant to this area. This contrasts with previous reviews in this field that discuss systems biology/modelling/simulation approaches more broadly [1,26-29]. Our overarching goal is to emphasise the utility, accessibility and exchangeability of the computer models and resources to the nutrition community.
Underpinning all computational models is mathematics and several theoretical approaches can be used to represent nutrient-based systems [27,30]. In general the approach employed is dependent on the characteristics of the system to be modelled. A widely used technique in nutrition is stoichiometric network analysis, which includes metabolic flux analysis (MFA), flux balance analysis (FBA), and metabolic pathway analysis (MPA) [31]. Such approaches reconstruct metabolic networks and identify flux patterns. This type of compartmental modelling is routinely used for empirical studies. For example, recently MFA was used to integrate metabolic networks at the cellular scale into physiologically-based pharmacokinetic models at the whole-body level [32].
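To make the stoichiometric idea concrete, flux balance analysis is commonly written as a linear programme. The symbols below are notational assumptions introduced here for illustration and do not appear in the cited studies: $S$ is the stoichiometric matrix (metabolites by reactions), $v$ the vector of reaction fluxes, $c$ a vector of weights defining the objective (for example, a biomass reaction), and $v_{\min}$, $v_{\max}$ the lower and upper flux bounds.

$$
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v_{\min} \le v \le v_{\max}
\end{aligned}
$$

The constraint $S\,v = 0$ states that every internal metabolite is produced and consumed at the same rate, which is the steady-state assumption on which this family of methods rests.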
As flux models are not constructed using mechanistic biochemical kinetics, they are limited when it comes to predicting metabolic conditions. They are also based solely on model steady states, from which the fluxes are inferred. Another approach that is gaining momentum is agent-based modelling, a rule-based technique for simulating the interactions of the individual components of a system [33,34]. This method has been utilized recently to model brain cancer in response to glucose levels and micro-environmental factors such as oxygen [35]. The principal disadvantage of this approach is the challenge associated with studying the interconnectivity between the agent rules and the dynamics of the biological system. Boolean network models have also been utilised to model systems relevant to nutrition. Such models consist of a network of discrete variables. Generally they are used to represent gene networks, with each Boolean variable representing a gene in an active or inactive state. A recent Boolean network model was used to represent intracellular cholesterol homeostasis as a Boolean vector, with each coordinate of the system denoting a biological species of the pathway [36]. This approach is certainly worthwhile for appreciating the topology of gene networks; however, it is also limited as it does not deal with biological mechanisms or biochemical kinetics.
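The Boolean network idea described above can be sketched in a few lines of Python. The three-node network below is a hypothetical negative feedback loop invented purely for illustration; it is not the cholesterol model of [36], and the node names and update rules are assumptions.

```python
# Hypothetical three-node network (illustrative only, not the model from [36]).
# Each rule computes a node's next value from the current state of the network.
RULES = {
    "sensor": lambda s: not s["product"],   # switched off by the end product
    "enzyme": lambda s: s["sensor"],        # switched on by the sensor
    "product": lambda s: s["enzyme"],       # made whenever the enzyme is on
}


def step(state):
    """Synchronous update: all nodes are updated from the same current state."""
    return {node: bool(rule(state)) for node, rule in RULES.items()}


def as_vector(state):
    """Represent a state as a Boolean vector, one coordinate per node."""
    return tuple(state[node] for node in RULES)


def find_attractor(state):
    """Iterate until a state repeats and return the repeating cycle of states."""
    seen = []
    while as_vector(state) not in seen:
        seen.append(as_vector(state))
        state = step(state)
    return seen[seen.index(as_vector(state)):]


if __name__ == "__main__":
    start = {"sensor": False, "enzyme": False, "product": False}
    for vector in find_attractor(start):
        print(vector)
```

Because the state is only a vector of on/off values, the sketch can reveal the attractors implied by the network topology, but it carries no rate constants or concentrations, which is the limitation noted above.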
Probabilistic models are used if there is variability within the biological system. For example, stochastic differential equations can be utilized if a small number of molecules are suggested to be involved in discrete random collisions within a biochemical/molecular pathway [37-39]. Computationally, this approach involves an algorithm treating each reaction in the model as a probability function, e.g. biochemical reactions have different probabilities of taking place, which can be altered depending on the nature of the reaction. There is now considerable experimental evidence to support the application of this framework. For instance, stochastic models have been widely used when studying heterogeneity in gene expression [40,41]. Recent work by Carey et al. [42] demonstrated how stochastic modelling can be used to complement traditional wet-lab techniques. This work investigated a transcription factor involved in zinc metabolism and used both computational modelling and mutations of specific gene promoter elements to show that the molecular mechanisms of regulation can be inferred by quantifying how stochasticity alters with expression [42]. The main limitation of the stochastic approach is that stochastic models tend to be computationally intensive, particularly when the model contains a large number of molecules. Probability can also be represented with a Bayesian network (BN), a directed acyclic graph where each node represents a variable [43]. Nodes are in turn connected to a probability density function. Recently a BN approach was used to predict the contribution of fat-free mass (FFM) and fat mass to body weight within different population groups [44]. The authors suggest it is possible to use model predictions as a complementary body composition analysis for large populations [45]. A limitation of a number of BN approaches is that they cannot represent feedback loops.
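One common way to treat each reaction as a probabilistic event is a Gillespie-style stochastic simulation algorithm. The text above does not name a specific algorithm, so the following is only a minimal sketch under that assumption. The birth-death system (constant production, first-order degradation) and the parameter names `k_prod` and `k_deg` are hypothetical.

```python
import random


def gillespie_birth_death(k_prod, k_deg, n0, t_end, seed=0):
    """Simulate production (rate k_prod) and degradation (rate k_deg * n).

    Returns the event times and the molecule count after each event.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod          # propensity of the production reaction
        a_deg = k_deg * n        # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0:
            break                # nothing can happen any more
        # Waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


if __name__ == "__main__":
    times, counts = gillespie_birth_death(k_prod=5.0, k_deg=0.5, n0=0, t_end=20.0)
    print("events simulated:", len(times) - 1)
    print("final molecule count:", counts[-1])
```

Every reaction event is simulated individually, so the run time grows with the number of molecules and reactions, which is the computational cost referred to above.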
Other examples of probabilistic approaches include stochastic Petri nets, and Boolean networks that incorporate probability [46]. The former is a directed bipartite graph with two types of nodes, called places and transitions (represented diagrammatically by circles and rectangles respectively). Stochastic Petri nets control transitions with an exponentially distributed time delay [47]. Petri nets are widely used to study genetic regulatory networks [48]. Petri net variants including functional, hybrid, coloured, timed, continuous and hierarchical have also been utilised to represent a wide range of biological systems [48]. A major limitation of Petri nets from a biological perspective is that they are limited to relatively small networks. Therefore they are restricted when it comes to representing multi-scale systems; and as will be discussed later, scale is a challenge which is yet to be surmounted in computational systems biology.
By far the most common approach to mechanistic modelling of nutrient-based biochemical systems involves using deterministic simulations with ordinary differential equations (ODEs) [26]. These equations are known as ordinary as they depend on one independent variable only (time). This method creates a system of coupled ODEs. Coupled means that the variables (biological species/substrates) on the left-hand side of the equations reappear on the right-hand side of the same system (i.e. the variables depend on each other). A deterministic algorithm integrates the ODEs to produce a non-random solution [49]. The modelling of nutrient-centred biochemical pathways with the ODE approach has been extremely useful to the field of nutrition [50-57]. For instance, the folate cycle is a pertinent example of how nutrition has benefited from ODE systems modelling. ODE models have increased the understanding of purine biosynthesis [58], folate kinetics in human breast cancer cells [59] and also the response of intracellular levels of folate to vitamin B12 deficiency [60]. The latter model was also used to demonstrate the impact of genetic polymorphisms on the folate pathway and to demonstrate the sensitivity of the enzyme thymidylate synthase to alterations in epithelial intracellular folate levels [61]. ODEs have a number of disadvantages; for example, their continuous and deterministic nature makes them unsuitable for representing stochasticity. Moreover, as the complexity of biological systems increases, handling and manipulating them can be challenging.
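As a small illustration of a coupled ODE system integrated deterministically, the sketch below simulates a hypothetical reversible reaction A ⇌ B with mass-action kinetics, dA/dt = -k_f·A + k_r·B and dB/dt = k_f·A - k_r·B. The species, rate constants and the choice of a fixed-step fourth-order Runge-Kutta integrator are assumptions made for this example and are not taken from any of the cited folate models.

```python
def derivatives(state, k_f, k_r):
    """Right-hand side of the coupled ODEs for A <-> B (mass action)."""
    a, b = state
    net_flux = k_f * a - k_r * b   # A and B appear in both equations
    return (-net_flux, net_flux)


def rk4_step(state, dt, k_f, k_r):
    """Advance the state by one fixed step of fourth-order Runge-Kutta."""
    def shifted(s, d, h):
        return tuple(x + h * dx for x, dx in zip(s, d))

    k1 = derivatives(state, k_f, k_r)
    k2 = derivatives(shifted(state, k1, dt / 2), k_f, k_r)
    k3 = derivatives(shifted(state, k2, dt / 2), k_f, k_r)
    k4 = derivatives(shifted(state, k3, dt), k_f, k_r)
    return tuple(
        x + dt / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
        for x, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4)
    )


def simulate(a0, b0, k_f, k_r, t_end, dt=0.01):
    """Integrate from t = 0 to t_end and return the final (A, B)."""
    state = (a0, b0)
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, k_f, k_r)
    return state


if __name__ == "__main__":
    a, b = simulate(a0=10.0, b0=0.0, k_f=2.0, k_r=1.0, t_end=20.0)
    print("A =", a, "B =", b)
```

Running the simulation twice with the same inputs gives identical output, which is the sense in which the solution is non-random. At long times A approaches k_r/(k_f + k_r) of the total, while A + B stays constant.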
Partial differential equations (PDEs; multivariable functions with partial derivatives) are not as widely applied in biochemical modelling as ODE systems. A recent worthwhile example of their application in nutrition is a model by Tindal and colleagues [62]. This model explored the effect that nutrient and acidosis levels have on the distribution of proliferating and quiescent cells and dead cell material within a multicellular tumour spheroid [62]. Partial differential equations have also been routinely used to model lipoprotein dynamics [63]. A limitation of PDEs is that, like stochastic models, they can be computationally intensive and therefore slow.
The Biomodels database stores peer-reviewed models that have been constructed using a variety of software platforms and then encoded in SBML. This makes the models exchangeable, accessible, open to manipulation and available for further modification. Firstly, all models archived in Biomodels were accessed to identify those with a nutrition-based theme. As of the 1st of May 2013, Biomodels contained 924 models (n=436 curated and n=488 not curated). Identifying those with a nutrition theme involved individually examining each model. In total, we identified 95 nutrition-themed models in the curated section and 68 nutrition-themed models in the non-curated section (Figure 2a). Computational models of nutrient pathways in humans, bacteria, yeast, and plants were identified (Figure 2b). The list was then refined to the human nutrition themed models outlined in Table 1, several of which are explored in more detail in the discussion. It is important to note that models may only be in the non-curated section temporarily until they are syntactically and semantically verified.
a) Calcium Based | Biomodels ID |
---|---|
Ryanodine receptor adaptation and Ca^{2+}-induced Ca^{2+} release-dependent Ca^{2+} oscillations | BIOMD0000000060 |
Complex calcium oscillations and the role of mitochondria and cytosolic proteins | BIOMD0000000039 |
Complex intracellular calcium oscillations A theoretical exploration of possible mechanisms | BIOMD0000000043 |
Ca-independent phospholipase A2-dependent sustained Rho-kinase activation exhibits all-or-none response | BIOMD0000000088 |
A theoretical study of effects of cytosolic Ca^{2+} oscillations on activation of glycogen phosphorylase | BIOMD0000000100 |
Protein phosphorylation driven by intracellular calcium oscillations: a kinetic analysis | BIOMD0000000113 |
Hormone-induced calcium oscillations in liver cells can be explained by a simple one pool model. | BIOMD0000000114 |
Hormone-induced calcium oscillations in liver cells can be explained by a simple one pool model. | BIOMD0000000115 |
Signal-induced Ca^{2+} oscillations: properties of a model based on Ca^{2+}-induced Ca^{2+} release | BIOMD0000000117 |
A quantitative kinetic model for ATP-induced intracellular Ca^{2+} oscillations | BIOMD0000000145 |
Modeling and analysis of calcium signaling events leading to long-term depression in cerebellar Purkinje cells | BIOMD0000000162 |
A theoretical study on activation of transcription factor modulated by intracellular Ca^{2+} oscillations | BIOMD0000000166 |
A mathematical model of spontaneous calcium (II) oscillations in astrocytes | BIOMD0000000184 |
Dynamic simulation of the effect of calcium-release activated calcium channel on cytoplasmic Ca^{2+} oscillation | BIOMD0000000202 |
Calcium spiking | BIOMD0000000224 |
Switching from simple to complex oscillations in calcium signaling | BIOMD0000000329 |
On the encoding and decoding of calcium signals in hepatocytes | BIOMD0000000331 |
A bifurcation analysis of two coupled calcium oscillators | BIOMD0000000058 |
Modeling of Ca^{2+} flux in pancreatic beta-cells: role of the plasma membrane and intracellular stores | |
Effects of extracellular calcium on electrical bursting and intracellular and luminal calcium oscillations in insulin | BIOMD0000000371 |
Parallel adaptive feedback enhances reliability of the Ca^{2+} signaling system | BIOMD0000000354 |
A role for calcium release-activated current (CRAC) in cholinergic modulation of electrical activity in pancreatic beta-cells | BIOMD0000000374 |
Modeling of bone formation and resorption mediated by parathyroid hormone: response to estrogen/PTH therapy. | BIOMD0000000274 |
A mathematical model of parathyroid hormone response to acute changes in plasma ionized calcium concentration in humans. | BIOMD0000000276 |
A dynamic model of interactions of Ca^{2+}, calmodulin, and catalytic subunits of Ca^{2+}/calmodulin-dependent protein kinase I | MODEL1001150000 |
Mathematical modelling of calcium wave propagation in mammalian airway epithelium: evidence for regenerative ATP release. | MODEL1006230018 |
Metabotropic receptor activation, desensitization and sequestration-I: modelling calcium and inositol 1,4,5-trisphosphate dynamics following receptor activation. | MODEL1006230039 |
A model of the single atrial cell: relation between calcium current and calcium release. | MODEL1006230070 |
Minimal model of beta-cell mitochondrial Ca^{2+} handling. | MODEL1201140004 |
The role of sodium-calcium exchange during the cardiac action potential. | MODEL1006230073 |
Riluzole-induced block of voltage-gated Na^{+} current and activation of BKCa channels in cultured differentiated human skeletal muscle cells. | MODEL7817907010 |
b) Energy Metabolism | Biomodels ID |
Creatine kinase in energy metabolic signalling in muscle | BIOMD0000000041 |
Mitochondrial energetic metabolism: a simplified model of TCA cycle with ATP production | BIOMD0000000232 |
Diethyl pyrocarbonate, a histidine-modifying agent, directly stimulates activity of ATP-sensitive potassium channels in pituitary GH(3) cells | BIOMD0000000124 |
Cooperation and competition in the evolution of ATP-producing pathway | BIOMD0000000337 |
The control systems structures of energy metabolism. | MODEL1006230010 |
An integrative dynamic model of brain energy metabolism using in vivo neurochemical measurements. | MODEL1006230041 |
Dynamics of muscle glycogenolysis modeled with pH time course computation and pH-dependent reaction equilibria and enzyme kinetics. | MODEL1006230049 |
Computer modeling of mitochondrial tricarboxylic acid cycle, oxidative phosphorylation, metabolite transport, and electrophysiology. | MODEL1006230090 |
Computational reconstruction of tissue-specific metabolic models: application to human liver metabolism. | MODEL1009150002 |
A metabolic model of the mitochondrion and its use in modelling diseases of the tricarboxylic acid cycle. | MODEL1106160000 |
A multi-tissue type genome-scale metabolic network for analysis of whole-body systems physiology. | MODEL1111070003 |
An old paper revisited: "a mathematical model of carbohydrate energy metabolism. Interaction between glycolysis, the Krebs cycle and the H-transporting shuttles at varying ATPases load" | MODEL1202170000 |
Reconstruction of Danio rerio metabolic model accounting for subcellular compartmentalisation. | MODEL1204120000 |
CardioNet: a human metabolic network suited for the study of cardiomyocyte metabolism. | MODEL1212040000 |
The Edinburgh human metabolic network reconstruction and its functional analysis. | MODEL2021729243 |
A biophysical model of the mitochondrial respiratory system and oxidative phosphorylation. | MODEL4151491057 |
Global reconstruction of the human metabolic network based on genomic and bibliomic data. | MODEL6399676120 |
A computational model for glycogenolysis in skeletal muscle. | MODEL6623617994 |
A community-driven global reconstruction of human metabolism. | MODEL1109130000 |
c) Protein Related | Biomodels ID |
Signaling switches and bistability arising from multisite phosphorylation in protein kinase cascades | BIOMD0000000026 |
GSK3 and p53 - is there a link in Alzheimer's disease? | BIOMD0000000286 |
Modelling the role of the hsp70/hsp90 system in the maintenance of protein homeostasis | BIOMD0000000344 |
An in silico model of the ubiquitin-proteasome system that incorporates normal homeostasis and age-related decline | BIOMD0000000105 |
Scaffold proteins may biphasically affect the levels of mitogen-activated protein kinase signaling and reduce its threshold properties | BIOMD0000000011 |
Genome-scale metabolic modeling elucidates the role of proliferative adaptation in causing the Warburg effect. | MODEL1105100000 |
d) Antioxidant/Reactive Oxygen Species | Biomodels ID |
Alternative pathways as mechanism for the negative effects associated with overexpression of superoxide dismutase | BIOMD0000000108 |
Dynamic rerouting of the carbohydrate flux is key to counteracting oxidative stress | BIOMD0000000247 |
Mechanism of protection of peroxidase activity by oscillatory dynamics | BIOMD0000000046 |
A mathematical model of glutathione metabolism. | BIOMD0000000268 |
e) Glucose/Insulin | Biomodels ID |
A mathematical model of metabolic insulin signaling pathways | BIOMD0000000137 |
Systems-level interactions between insulin-EGF networks amplify mitogenic signaling | BIOMD0000000223 |
A model of phosphofructokinase and glycolytic oscillations in the pancreatic beta-cell. | BIOMD0000000225 |
A kinetic core model of the glucose-stimulated insulin secretion network of pancreatic beta cells | BIOMD0000000239 |
Mathematical modeling and analysis of insulin clearance in vivo | BIOMD0000000345 |
A model of beta-cell mass, insulin, and glucose kinetics: pathways to diabetes | BIOMD0000000341 |
Glucose sensing in the pancreatic beta cell: a computational systems analysis | BIOMD0000000348 |
Mass and information feedbacks through receptor endocytosis govern insulin signaling as revealed using a parameter-free modeling framework | BIOMD0000000343 |
A Hierarchical Whole-body Modeling Approach Elucidates the Link between in Vitro Insulin Signaling and in vivo Glucose Homeostasis | BIOMD0000000356 |
Modeling the insulin-glucose feedback system: the significance of pulsatile insulin secretion | BIOMD0000000372 |
Computer model for mechanisms underlying ultradian oscillations of insulin and glucose | BIOMD0000000382 |
Calcium and glycolysis mediate multiple bursting modes in pancreatic islets. | BIOMD0000000373 |
The phantom burster model for pancreatic beta-cells | BIOMD0000000377 |
Sustained oscillations in glycolysis: an experimental and theoretical study of chaotic and complex periodic behavior and of quenching of simple oscillations | BIOMD0000000042 |
Meal simulation model of the glucose-insulin system. | BIOMD0000000379 |
Harmonic oscillator model of the insulin and IGF1 receptors' allosteric binding and activation. | MODEL1112050000 |
A molecular mathematical model of glucose mobilization and uptake. | MODEL1112050001 |
The Feedback Control of Glucose: On the road to type II diabetes | MODEL1112110000 |
Pancreatic network control of glucagon secretion and counterregulation. | MODEL1112110002 |
Mathematical models of diabetes progression. | MODEL1112110003 |
An integrated model for glucose and insulin regulation in healthy volunteers and type 2 diabetic patients following intravenous glucose provocations. | MODEL1112110004 |
Effect of Na/Ca exchange on plateau fraction and [Ca]i in models for bursting in pancreatic beta-cells. | MODEL1201070000 |
Diffusion induced oscillatory insulin secretion. | MODEL1201140001 |
Modeling insulin kinetics: responses to a single oral glucose administration or ambulatory-fed conditions. | MODEL1201140002 |
Insulin receptor binding kinetics: modeling and simulation studies. | MODEL1201140005 |
Temporal coding of insulin action through multiplexing of the AKT pathway. | MODEL1204060000 |
Kinetic modeling of human hepatic glucose metabolism in type 2 diabetes mellitus predicts higher risk of hypoglycemic events in rigorous insulin therapy. | MODEL1209260000 |
f) Folate/Methionine Metabolism | Biomodels ID |
A mathematical model of the folate cycle: new insights into folate homeostasis | BIOMD0000000213 |
Folate cycle kinetics in human breast cancer cells | BIOMD0000000018 |
A mathematical model of the methionine cycle. | MODEL1006230091 |
In silico experimentation with a model of hepatic mitochondrial folate metabolism. | MODEL1007200000 |
A mathematical model of the folate cycle: new insights into folate homeostasis. | MODEL6655501972 |
g) Lipid Metabolism | Biomodels ID |
A whole-body mathematical model of cholesterol metabolism and its age-associated dysregulation. | BIOMD0000000434 |
An integrated model of eicosanoid metabolism and signaling based on lipidomics flux analysis. | BIOMD0000000436 |
Dynamical modeling of the cholesterol regulatory pathway with Boolean networks. | MODEL0568648427 |
Table 1: Human nutrition themed models archived in Biomodels*.
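Because SBML is an XML format, the structure of an archived model can be inspected with general-purpose tools as well as with dedicated modelling software. The sketch below uses only the Python standard library to list the species and reactions in a small, hand-written SBML Level 2 Version 4 document. The embedded model (`toy_pathway`) and the helper names `read_species` and `read_reactions` are hypothetical and are not taken from Biomodels. A file downloaded from Biomodels may use a different SBML level or version, in which case the namespace string would need to be changed accordingly.

```python
import xml.etree.ElementTree as ET

# Namespace for SBML Level 2 Version 4 (other levels/versions use other URIs).
SBML_NS = {"sbml": "http://www.sbml.org/sbml/level2/version4"}

# Hypothetical toy model written by hand for this example.
EXAMPLE_SBML = """
<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_pathway">
    <listOfCompartments>
      <compartment id="cell" size="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="substrate" compartment="cell" initialConcentration="1.0"/>
      <species id="product" compartment="cell" initialConcentration="0.0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="conversion" reversible="false">
        <listOfReactants>
          <speciesReference species="substrate"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="product"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def read_species(sbml_text):
    """Return {species id: initial concentration} for an SBML document."""
    root = ET.fromstring(sbml_text.strip())
    path = "sbml:model/sbml:listOfSpecies/sbml:species"
    return {
        node.get("id"): float(node.get("initialConcentration", "nan"))
        for node in root.findall(path, SBML_NS)
    }


def read_reactions(sbml_text):
    """Return a list of (reaction id, reactant ids, product ids)."""
    root = ET.fromstring(sbml_text.strip())
    reactions = []
    for rxn in root.findall("sbml:model/sbml:listOfReactions/sbml:reaction", SBML_NS):
        reactants = [
            ref.get("species")
            for ref in rxn.findall("sbml:listOfReactants/sbml:speciesReference", SBML_NS)
        ]
        products = [
            ref.get("species")
            for ref in rxn.findall("sbml:listOfProducts/sbml:speciesReference", SBML_NS)
        ]
        reactions.append((rxn.get("id"), reactants, products))
    return reactions


if __name__ == "__main__":
    print(read_species(EXAMPLE_SBML))
    print(read_reactions(EXAMPLE_SBML))
```

This only reads the model structure. Simulating or validating a model would normally be done with a dedicated SBML-aware tool rather than by parsing the XML directly.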
Until recently, the time needed to learn computer programming and the mathematics associated with a model made this discipline inaccessible to many nutritionists. However, in recent years significant advances have been made and many tools now come with a graphical user interface (GUI). Prior to using a modelling software tool, a number of steps need to be negotiated. These steps are broadly outlined in Figure 3. The general approach outlined in Figure 3 assumes the use of Biomodels and tools that support SBML; however, Figure 3 could easily be adapted to suit alternative approaches. Using folate metabolism as a case study, each step in Figure 3a-3g can be negotiated to illustrate recent developments and standard systems modelling conventions.
a) This is specific to the system being modelled, and model boundary points will be dictated by the hypothesis/idea. For instance, the question in Figure 3 is based on a publication by Nijhout and colleagues [60]. In this work the authors constructed a model to answer the question “what happens to folate metabolism when B12 is limited?” (Biomodels ID: BIOMD0000000213).
b) Existing SBML-encoded models can be extended/modified and these can be searched for in Biomodels. There may also be non-SBML encoded models, so it is also worthwhile conducting a broader search to establish whether a suitable non-SBML model could be translated into SBML.
c) The details of a model can be explored within Biomodels, including when it was published and by whom. A decision can then be made on its suitability.
d) If a suitable model is identified, a software tool can be selected from the SBML software matrix (http://sbml.org/SBML_Software_Guide/SBML_Software_Matrix). Due to the large number of available tools it was not possible to review all the tools in the SBML software matrix. Therefore priority was given to GUI-based tools and non-commercial software due to their suitability and accessibility for those with limited modelling experience. As of May 2013 the matrix contained the details of >232 software tools capable of generating and supporting SBML. Although most tools share a common functionality, the user interfaces and the specific capabilities of different packages mean that selecting an appropriate tool to build a nutrient pathway model can be time consuming. Table 2 and Figure 4 provide the details of some of the SBML tools that were evaluated to alleviate this process.
e) If there is no suitable model, a new model can be developed (Figure 3 i-ix).
f) A newly created model can be submitted to the Biomodels database, where a curator will semantically verify model predictions and syntactically curate the model.
Tool | Outline, installation and resources |
---|---|
Athena | Description: A tool very well suited to those who are new to modelling. Athena is divided into three parts that can be viewed simultaneously. There is a canvas at the top of the screen where SBGN-based species may be added. Species can then be connected via reaction arcs to create a network. A list of icons on the left-hand side of the GUI lists the different species that are available. These include entities to represent genes, promoters, compartments, and nucleic acids. On the right-hand side there is a module viewer pane where species names and concentrations may be modified along with the values and names of rate constants. Immediately below the drawing canvas is a simulation pane that is exceptionally straightforward to use. It consists of a start button to initiate the simulator and a reset button. Simulation start and end times can also be conveniently set. Athena did not allow the SBML for the Nijhout model to be imported and it was not apparent why. Instead, a simple two-species reversible reaction was created and a simulation was run (Figure 6a). Installation: Full functionality of the tool requires installation of the Systems Biology Workbench (http://sys-bio.org/). Once this is installed, Athena may be downloaded and installed from http://athena.codeplex.com/ and executed with a Windows installer file (SetupAthena.msi). Resources: A detailed guide to Athena is available at http://arxiv.org/pdf/0902.2598.pdf |
Cell Designer | Description: Developed by the Institute for Systems Biology in Japan (http://www.celldesigner.org/index.html). It has an intuitive user interface and in particular has a GUI that is suitable for visualizing existing models and for designing models diagrammatically in a ‘drag and drop’ fashion. The GUI was straightforward to use. There was no difficulty in exporting SBML models from it. It was also very capable of importing SBML models directly from the Biomodels website; therefore we imported the model by Nijhout et al. One slight drawback was that the diagram associated with this model looked crowded and a little confusing when viewed using this software (Figure 4b). Cell Designer comes equipped with a simulation and parameter analysis suite. Results could be plotted or written to a file for further analysis using alternative software. Installation: A Windows executable file (CellDesigner-4.2-windows-installer.exe) was downloaded from http://systems-biology.org/software/celldesigner/celldesigner-42.html. A desktop icon was then installed, which was double-clicked to access the software. Resources: The Cell Designer web site contains a list of models specifically built using this software: http://www.celldesigner.org/models.html. Also, a comprehensive online tutorial is available: http://www.celldesigner.org/help/CDH_QT.html |
Copasi | Description: GUI-based modelling tool. Capable of stochastic and deterministic modelling. Divided into two components: model building/simulation and model analysis/output. Copasi offers a number of analysis techniques, including sensitivity analysis for exploring changes to the values of species concentrations and for examining changes to rate constants and rate laws. It is also able to detect steady states of models and has an optimization facility where the model can be fitted to time course data. Copasi was found to be a very suitable tool for those who are new to modelling. All tasks were carried out in the GUI, which is well laid out and intuitive. As a result we used it both to construct our own folate model and to test the predictive capabilities of the folate model by Nijhout et al. Installation: Straightforward download from the Copasi web site (http://www.copasi.org/tikiindex.php?page=downloadnoncommercial). This was followed by double-clicking on the downloaded Microsoft Windows executable file (Copasi-35-Win32.exe). An icon was then displayed on the desktop of the computer, which was double-clicked each time access to the software was needed. Resources: A comprehensive guide to the functionality of Copasi has been published by the developers and interested readers are referred to it [81]. Users should also access the supporting documentation and video clips on the Copasi web site (http://www.copasi.org/tiki-view_articles.php) |
JDesigner | Description: Well suited to a nutritionist who is looking for a straightforward introduction to computational modelling. GUI-based tool, which allows models to be constructed on a canvas with nodes used to represent the species and reaction arcs connecting them. Species are selected from a horizontal network bar. The interface is well laid out with all the icons visible and intuitive (e.g. metabolites, proteins, nucleic acids). After creating a simple model, a simulation was run using an additional piece of software known as Jarnac. Rate laws and parameters were entered via dialog boxes, which appeared after clicking on a reaction arc in the diagram. They also allowed rate expressions not already included to be defined. One slight disadvantage that was found when using this tool was that diagrams became crowded and a little difficult to understand. Therefore care needs to be taken with species labelling etc. It was very straightforward to import the model by Nijhout et al. into JDesigner. However, there did not appear to be a direct way to do this from Biomodels. Rather, the SBML file was opened directly with JDesigner. As a result the model was displayed on the canvas (Figure 4d). Installation: A Windows executable file (SBW-2.9.0-win32-installer.exe (66.6 MB)) was downloaded from http://sbw.kgi.edu/software/jdesigner.htm. JDesigner is installed with other software including Jarnac. One slight issue was that it was not immediately apparent that JDesigner had been installed successfully, and we had to search for the software. Resources: The JDesigner home page contains a link to a PDF that contains further instructions on how to use JDesigner (http://sbw.kgi.edu/sbwWiki/sysbio/jdesigner). |
Table 2: Partial list of modelling and simulation software that supports SBML.
Figure 4: Different modelling tools. a) Building a model with the software tool Athena. The top panel illustrates a reversible reaction between folate and tetrahydrofolate. The bottom panel shows a simulation of these two species. b) Graphical representation of the Nijhout et al. (2004) model after importation of the SBML code to Cell Designer. c) Representation and simulation of the Nijhout model in Copasi. d) Graphical representation of the Nijhout et al. [59] model with the modelling tool JDesigner.
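All of the tools above exchange models as SBML, which is an XML format, so the species and reactions of a downloaded model can also be inspected without a GUI. The following is a minimal sketch of this idea using only the Python standard library. The embedded SBML fragment, the model id `toy_folate`, and the function name `summarise_sbml` are illustrative assumptions; the fragment is hand-written and is not the Nijhout et al. model. A dedicated SBML library would normally be preferable for real models.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written SBML Level 2 Version 4 fragment
# (an assumption for illustration, NOT a model from Biomodels).
SBML_TEXT = """
<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_folate">
    <listOfCompartments>
      <compartment id="cell" size="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="F" name="folate" compartment="cell" initialConcentration="10"/>
      <species id="THF" name="tetrahydrofolate" compartment="cell" initialConcentration="0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R1" reversible="true">
        <listOfReactants><speciesReference species="F"/></listOfReactants>
        <listOfProducts><speciesReference species="THF"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

# XML namespace used by SBML Level 2 Version 4 documents.
NS = {"sbml": "http://www.sbml.org/sbml/level2/version4"}


def summarise_sbml(text):
    """Return (species, reactions) read from an SBML L2V4 string.

    species   : dict mapping species id -> initial concentration
    reactions : list of (reaction id, reactant ids, product ids)
    """
    root = ET.fromstring(text)
    species = {
        node.get("id"): float(node.get("initialConcentration"))
        for node in root.findall(".//sbml:species", NS)
    }
    reactions = []
    for node in root.findall(".//sbml:reaction", NS):
        reactants = [
            ref.get("species")
            for ref in node.findall("sbml:listOfReactants/sbml:speciesReference", NS)
        ]
        products = [
            ref.get("species")
            for ref in node.findall("sbml:listOfProducts/sbml:speciesReference", NS)
        ]
        reactions.append((node.get("id"), reactants, products))
    return species, reactions


if __name__ == "__main__":
    found_species, found_reactions = summarise_sbml(SBML_TEXT)
    print(found_species)
    print(found_reactions)
```

A summary of this kind can help when judging whether an archived model covers the species of interest before committing to a particular tool.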
i) Developing a list of species and how they react is commonplace when model building. Example reactions for the folate cycle are outlined in Figure 5. ii) A variety of approaches can be used to create a network diagram. Historically, conventional pathway diagrams were adjusted to create a network diagram (Figure 6a; see also Nijhout et al. [50] for an example of this for the folate cycle). However, a less ambiguous way to create a network diagram is to use Petri net notation. Even if the mathematical underpinning of the model is not Petri net in nature, this notation provides an excellent means of visualizing a model. The folate cycle is represented with Petri net notation in Figure 6b. A more recent approach that attempts to standardize how network diagrams are represented is the Systems Biology Graphical Notation (SBGN) [64]. SBGN aims to facilitate the representation of diagrams in a clear and unambiguous fashion and is supported by a variety of software packages. For instance, we used VANTED [65], a tool for the visualization and analysis of networks containing experimental data, to create the network diagram of folate metabolism in Figure 6c using SBGN. iii) As most models are either ODE-based (deterministic) or stochastic, these mathematical approaches are worth discussing in more detail. The ODE framework makes the assumption that species exist in a well-mixed compartment and that their concentrations can be treated as continuous [49]. It also assumes that large numbers of molecules are involved in reactions and that the average behaviour of the population is not influenced by individual fluctuations. To illustrate this, the entry of folate (F) into the cell at a rate k1 can be given by ODE 1 (Figure 5). Provided the medium is homogeneous, the reaction in ODE 1 will proceed at a rate directly proportional to the concentration of F. This is known as mass action kinetics [66].
Generally ODEs are more detailed than ODE 1 and reflect how species regulate each other based on their biochemical dynamics. For instance, Michaelis-Menten kinetics [67] describe the rate of many enzyme-mediated reactions, and are therefore routinely incorporated into rate laws [68]. In addition to Michaelis-Menten kinetics, functions to represent negative feedback, inhibition and homeostatic mechanisms can be incorporated into rate laws [69]. Eventually a system of coupled ODEs is created to represent the system. An elementary example of this for the folate system is illustrated in Figure 5.
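To make the idea of a coupled ODE system concrete, the sketch below integrates a toy two-species model in which folate (F) is converted to tetrahydrofolate (THF) with a Michaelis-Menten rate law, and THF returns to F by mass action. This is an illustrative assumption and not the model in Figure 5: the reaction scheme, the parameter values in `PARAMS`, and the function names are all hypothetical. A fixed-step fourth-order Runge-Kutta integrator is used so that the example needs only the Python standard library.

```python
# Hypothetical parameter values chosen only for illustration.
PARAMS = {"vmax": 2.0, "km": 5.0, "k2": 0.5}


def rates(state, p):
    """Right-hand side of the coupled ODEs for (F, THF)."""
    f, thf = state
    forward = p["vmax"] * f / (p["km"] + f)  # Michaelis-Menten: F -> THF
    backward = p["k2"] * thf                 # mass action: THF -> F
    return (-forward + backward, forward - backward)


def rk4_step(state, dt, p):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    s1 = rates(state, p)
    s2 = rates(shifted(state, s1, dt / 2), p)
    s3 = rates(shifted(state, s2, dt / 2), p)
    s4 = rates(shifted(state, s3, dt), p)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, s1, s2, s3, s4)
    )


def simulate(state, p, t_end, dt):
    """Return the list of states from t = 0 to t = t_end."""
    steps = int(round(t_end / dt))
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(state, dt, p)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    result = simulate((10.0, 0.0), PARAMS, t_end=50.0, dt=0.01)
    print("final F, THF:", result[-1])
```

Because the two rate terms appear with opposite signs in the two equations, the total F + THF is conserved, and the trajectory settles to the steady state at which the forward and backward rates balance.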
Deterministic models do not consider molecular spatial heterogeneity or account for discrete random collisions between individual molecules, which matter when, for instance, molecules exist in small numbers or there are random fluctuations in their behaviour. Stochastic simulations attempt to overcome these issues by treating reactions as random processes. Generally stochastic models are simulated with the Gillespie algorithm [70] or one of its derivatives [71-73]. GUI-based tools hide the mathematics underpinning these algorithms; however, it is important to appreciate their underlying principles. Briefly, reactions are treated as probability/propensity functions rather than ODEs. The essence of this is that it avoids dealing with average behaviour. Rather, the probabilistic formulation calculates firstly when the next reaction occurs and secondly which reaction it will be. Such functions give the probability aμdt of reaction μ occurring in the time interval (t, t + dt). In a reaction system with M reactions, reactions are given an arbitrary index μ (1 ≤ μ ≤ M). A reaction is represented as aμdt = hμcμdt, where hμ is the number of possible combinations of reactant molecules involved in reaction μ and cμ is the stochastic rate constant. To illustrate this process Table 3 describes a reaction between 5-methyltetrahydrofolate and homocysteine, while Figure 7 outlines the steps involved in a stochastic simulation. iv) See step d) above. v) This stage involves setting the initial concentrations of the various biological species. This invariably involves consulting the published literature to identify initial concentrations/numbers of the various species. It is also crucial to obtain kinetic data for the enzymes that are included in the model to help parameterize the model. One way to do this is to use an online database such as BRENDA (http://www.brenda-enzymes.info/) [74]. The focal point of BRENDA is a section that details almost 80,000 different enzyme-catalysed reactions.
It includes a wide variety of kinetic parameters including Km and kcat values. BRENDA recently incorporated an SBML format for kinetic data [74]. Further recent advances have also witnessed the release of SABIO-RK (http://sabio.h-its.org/), a web-accessible database which stores kinetic parameters measured under defined assay conditions [75]. Kinetic parameters in SABIO-RK include almost 30,000 rate constants (Vmax, kcat) and >30,900 Km values. This database also supports export into the SBML format for modelling purposes [75]. Simulations also benefit from estimates of initial concentrations (intracellular or intra-mitochondrial) of metabolites and indeed of the relative abundance of enzyme molecules in a system. The BioNumbers website (www.bionumbers.org) contains a partial repository of such information and can be used to contribute to the parameterisation of starting states of models, or to the estimation of realistic dynamic ranges if capacity constraints are built into such models [76]. Table 4 presents a partial list of online resources. In addition to online resources, recent efforts in systems biology have focused on devising suitable methods to kinetically characterize purified enzymes. Adamczyk and colleagues outline five ways to measure enzyme activity based on source, the type of assay medium, and its purpose [77]. vi) The purpose of simulation is to explore the dynamic behaviour of the system. Figures 5 and 7 illustrate outputs from deterministic and stochastic simulations respectively. Deterministic graphs have a smooth continuous profile, which remains the same as long as the initial concentrations and parameter set do not change. With the stochastic simulation this is not the case: given the same initial parameter set and concentrations, the behaviour of the species varies from one run to the next. vii) If appropriate time course data is available, the behaviour of the model can be compared to this.
If the model does not compare well to known time course dynamics then parameters and concentrations need to be adjusted (return to step v). Alternatively tools such as Copasi are capable of ‘fitting’ the model to experimental data [78-80]. Additionally, tools have recently been developed such as a parallel parameter estimator that uses the message passing interface (MPI) protocol to speed up parameter estimation (SBML-PET-MPI) http://www.bioss.uni-freiburg.de/cms/sbml-pet-mpi.html. This tool is capable of data fitting using several data sets, thus partly alleviating the computational burden associated with parameter estimation [81]. More recently, Adams and colleagues have developed the Systems Biology Software Infrastructure (SBSI), to aid parameter fitting http://www.sbsi.ed.ac.uk/. It contains three main components, a library devoted to parameter fitting, a tracking and job submission section and an extensible client application used to establish optimization and to display results [82]. Bayesian inference techniques have also been utilised for parameter estimation. For instance, GNU MCSim is a simulation package, which permits model building, stochastic simulation and Bayesian inference via Markov Chain Monte Carlo simulations http://www.gnu.org/software/mcsim/ [83].
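The principle behind fitting a model to time course data can be sketched in a few lines. The example below estimates a single first-order rate constant k by minimising the sum of squared differences between model and data with a golden-section search. It is an illustrative assumption only and does not reproduce the algorithms used by Copasi, SBML-PET-MPI or SBSI: the decay model, the synthetic "observations" in `OBSERVED`, the fixed perturbations in `NOISE`, and all function names are hypothetical, and the initial concentration `F0` is treated as known for simplicity.

```python
import math

# Hypothetical synthetic data: first-order decay with k = 0.3 plus
# small fixed perturbations standing in for measurement noise.
F0 = 10.0
TIMES = list(range(11))
NOISE = [0.05, -0.08, 0.03, 0.06, -0.04, 0.02, -0.05, 0.04, -0.02, 0.03, -0.01]
OBSERVED = [F0 * math.exp(-0.3 * t) + e for t, e in zip(TIMES, NOISE)]


def model(t, k):
    """Model prediction F(t) = F0 * exp(-k t)."""
    return F0 * math.exp(-k * t)


def sum_squared_error(k):
    """Objective function: squared mismatch between model and data."""
    return sum((model(t, k) - y) ** 2 for t, y in zip(TIMES, OBSERVED))


def golden_section_minimise(func, low, high, tol=1e-6):
    """Minimise a unimodal function of one variable on [low, high]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    while abs(high - low) > tol:
        c = high - inv_phi * (high - low)
        d = low + inv_phi * (high - low)
        if func(c) < func(d):
            high = d  # minimum lies in [low, d]
        else:
            low = c   # minimum lies in [c, high]
    return (low + high) / 2


if __name__ == "__main__":
    k_fit = golden_section_minimise(sum_squared_error, 0.01, 2.0)
    print("estimated k:", round(k_fit, 3))
```

Real models have many parameters and objective functions with several local minima, which is why the dedicated tools described above rely on global optimisation strategies and parallel execution rather than a one-dimensional search.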
Term | Definition |
---|---|
a_{µ}dt = h_{µ}c_{µ}dt | Average probability that a reaction R_{µ} will occur in the next time interval dt |
c_{µ}dt | Average probability that two reactant molecules will collide according to R_{µ} in the time interval dt |
h_{µ} | Total number of possible reaction combinations |
Example: R_{6} | Homocysteine conversion to methionine is reaction six: 5-MTHF + Hcy → THF + Met |
h_{6} = X*Z | Number of possible reaction combinations, assuming there are Z molecules of 5-MTHF and X molecules of Hcy |
X*Z*c_{6}dt | Probability that R_{6} will occur in the next time interval, where c_{6} is the stochastic rate constant for reaction 6 |
Table 3: Example of a Stochastic Reaction.
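The quantities defined in Table 3 can be turned into a short simulation. The sketch below implements the direct method of the Gillespie algorithm for reaction R6 together with a lumped recycling step (THF → 5-MTHF) that is added here purely so that there is more than one reaction to choose between. The recycling reaction, the stochastic rate constants in `RATE_CONSTANTS`, the initial molecule numbers, and the function names are all hypothetical assumptions rather than values taken from the model in Figure 7.

```python
import random

# (name, reactant species, change in molecule numbers when the reaction fires)
REACTIONS = [
    ("R6", ("MTHF", "Hcy"), {"MTHF": -1, "Hcy": -1, "THF": 1, "Met": 1}),
    ("R_recycle", ("THF",), {"THF": -1, "MTHF": 1}),  # hypothetical lumped step
]
# Hypothetical stochastic rate constants c_mu.
RATE_CONSTANTS = {"R6": 0.001, "R_recycle": 0.1}
INITIAL_STATE = {"MTHF": 100, "Hcy": 50, "THF": 0, "Met": 0}


def propensities(state):
    """Return a_mu = h_mu * c_mu for every reaction."""
    result = []
    for name, reactants, _ in REACTIONS:
        h = 1
        for species in reactants:  # one molecule of each reactant is consumed
            h *= state[species]
        result.append(h * RATE_CONSTANTS[name])
    return result


def gillespie(initial_state, t_end, rng):
    """Simulate one trajectory; returns a list of (time, state) pairs."""
    state = dict(initial_state)
    t = 0.0
    history = [(t, dict(state))]
    while True:
        a = propensities(state)
        a_total = sum(a)
        if a_total == 0:
            break  # no reaction can fire any more
        t += rng.expovariate(a_total)  # when does the next reaction occur?
        if t > t_end:
            break
        threshold = rng.random() * a_total  # which reaction is it?
        cumulative = 0.0
        for (_, _, change), a_mu in zip(REACTIONS, a):
            cumulative += a_mu
            if threshold < cumulative:
                break
        for species, delta in change.items():
            state[species] += delta
        history.append((t, dict(state)))
    return history


if __name__ == "__main__":
    run = gillespie(INITIAL_STATE, t_end=100.0, rng=random.Random(1))
    print("events:", len(run) - 1, "final state:", run[-1][1])
```

Running the function with different random seeds gives different trajectories from the same starting state, which is the behaviour that distinguishes stochastic from deterministic simulation.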
Resource | URL | Brief description and reference |
---|---|---|
Bionumbers | www.bionumbers.org | Repository of enzymatic data |
BRENDA | http://www.brenda-enzymes.info/ | Database that contains the details of almost 80,000 different enzyme catalysed reactions |
KEGG | http://www.genome.jp/kegg/ | A resource that archives genomic, chemical and pathway data. It also contains links to various external databases [110] |
MIPS MPPI | http://mips.helmholtz-muenchen.de/proj/ppi/ | A protein-protein interaction (PPI) database that archives experimentally derived PPI data. |
Reactome | http://www.reactome.org/ReactomeGWT/entrypoint.html | A pathway database with a Pathway Browser, based on an SBGN visualization system, that allows users to view and analyse pathways in detail [111] |
SABIO-RK | http://sabio.h-its.org/ | A web-accessible database which stores kinetic parameters measured under defined assay conditions |
Table 4: Partial List of kinetic, pathway and protein resources to facilitate model building.
Figure 7: Illustrative stochastic model of the folate cycle a) Typical set of reactions used when creating a stochastic model. b) Selecting a stochastic simulator (the Gibson-Bruck simulator) to run a stochastic simulation in Copasi. c) Overview of the steps in a stochastic algorithm. d) Stochastic simulations of the simple folate cycle model.
viii) If the output of the model appears to be a realistic interpretation of the dynamical behaviour of reactants and products then the model can be used to explore the idea or hypothesis. ix) Confirmation/disproval of hypothesis. One can choose to accept or reject the original hypothesis based on model output. If it is rejected one can refine the research question/model; thus model building becomes a cyclic sequence of iterative steps of refinement and revalidation.
Computational systems modelling currently faces a number of challenges, the resolution of which is crucial to the continued integration of computational modelling within conventional bioscience methodologies. A significant issue relates to the limitations of existing simulation algorithms. For example, in this paper we have discussed briefly the principles of stochastic simulation with the Gillespie algorithm or one of its derivatives. However, the Gillespie algorithm and its variants are limited because they are time consuming and require several runs to complete parameter scanning tasks [84]. This is computationally expensive, and thus the efficiency of stochastic simulation needs to be addressed in the future in order to model more complex biological systems that contain a large number of reactants/parameters [84]. Recently a number of solutions have been proposed for this problem. For example, it has been suggested that one way to overcome this issue is to use hybrid parallel execution on graphics processing units with a variant of the Gillespie algorithm [85]. This method involves executing a simulation across threads in parallel and, according to the authors, leads to an 8× to 120× improvement in performance over other parallel algorithms [85]. The parameter sensitivity and robustness of systems models have also been identified as a limitation. For example, in a study by Gutenkunst et al. [85], the authors indicate that every model they examined from the Biomodels database had a “sloppy” spectrum of parameter values. As a result of these findings the authors propose that collective fitting be used more routinely and also suggest that modellers focus more on model predictions. This issue also leads to the key challenge of parameter inference. It is not unusual that only a subset of the kinetic parameters is known for a given model, leaving the remainder of the parameters to be estimated/fitted.
Many solutions have been proposed to address this limitation. For example, an extended Kalman filter for determining estimates of model parameters has been suggested as a means of overcoming this challenge [86,87]. Another recent approach, referred to as Swarm-based Chemical Reaction Optimization, combines an evolutionary searching strategy with the Firefly Algorithm method [88]. Approximate Bayesian techniques have also been employed for parameter inference; for example, ABC-SysBio (http://www.theosysbio.bio.ic.ac.uk/resources/abc-sysbio/) is a Python package which implements likelihood-free parameter estimation and model selection for SBML-encoded models [89]. In the future computational systems biology must also face the grand challenge of scale, as it can be argued that currently no systems model has adequately been able to represent the multitude of interactions that take place between different temporal and spatial scales. For example, at present models need to improve how they integrate across temporal scales, which can range from microseconds for cellular processes to years for whole organisms. Likewise, spatial scales range from nanometre cellular structures through to metres for whole organisms. There are a few examples of projects that have attempted to do this, such as the multi-scale model of juxtacrine EGFR-MAPK signalling [90] and a multi-scale model of hemodynamics within a whole-body arterial network [91].
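The likelihood-free idea behind approximate Bayesian computation (ABC) mentioned above can be illustrated with a rejection sampler: candidate parameter values are drawn from a prior, the model is simulated for each, and a candidate is kept only if its simulated output lies within a tolerance of the observed data. The sketch below is an illustrative assumption and does not use the ABC-SysBio package or its interface. The first-order decay model, the uniform prior on [0, 1], the tolerance, the synthetic observations generated with k = 0.3, and all function names are hypothetical.

```python
import math
import random

# Hypothetical synthetic observations from first-order decay with k = 0.3.
F0 = 10.0
TIMES = list(range(1, 11))
OBSERVED = [F0 * math.exp(-0.3 * t) for t in TIMES]


def simulate(k):
    """Model output at the observation times for rate constant k."""
    return [F0 * math.exp(-k * t) for t in TIMES]


def distance(simulated, observed):
    """Euclidean distance between simulated and observed time courses."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(simulated, observed)))


def abc_rejection(n_draws, epsilon, rng):
    """Keep prior draws whose simulated data fall within epsilon of the data."""
    accepted = []
    for _ in range(n_draws):
        k = rng.uniform(0.0, 1.0)  # draw from the (assumed) uniform prior
        if distance(simulate(k), OBSERVED) < epsilon:
            accepted.append(k)
    return accepted


if __name__ == "__main__":
    samples = abc_rejection(5000, epsilon=0.5, rng=random.Random(0))
    print("accepted:", len(samples))
    print("posterior mean of k:", sum(samples) / len(samples))
```

The accepted values approximate the posterior distribution of k; a smaller tolerance gives a tighter approximation at the cost of more rejected simulations, which is one reason more efficient sequential schemes are used in practice.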
We have reviewed the Biomodels database to highlight nutrient-themed models. Additionally, a generic modelling framework was used to discuss current tools and recent advances in computational systems modelling. The reason for conducting this review was to highlight the impact computational biology is having on nutrition research and to explore its future possibilities. The review uncovered areas where scope exists for extending current models or for developing new ones. For example, a number of models of intracellular calcium metabolism were discovered. It would be worthwhile developing one of these models further to include vitamin D metabolism, especially as vitamin D is essential for calcium absorption [92], with its physiological optimum defined by the amount required to maintain calcium levels and prevent secondary hyperparathyroidism [93]. Therefore, a computer model could help to investigate the long-term impact of vitamin D deficiency on calcium metabolism. As the calcium models are encoded within the SBML schema it is feasible that they could be extended or combined with another model. The latter could be facilitated by Semantic SBML [94], a recently developed tool that helps modellers to locate annotations and merge models (http://www.semanticsbml.org).
Models of folate metabolism are also well represented within Biomodels. These models could have additional research potential, as recent findings suggest that age-related epigenetic changes could be the result of folate cycle irregularities/alterations to dietary folate intake [95]. Mechanistically, folate deficiency has been associated with both global genome hypomethylation and gene promoter hypermethylation [96,97]. Moreover, recent findings suggest a correlation between the degree of gene promoter methylation and homocysteine levels in postmenopausal women [98], indicating that dysregulation of this pathway could result in promoter methylation. It would be worthwhile adjusting an existing model of the folate cycle to explore its interaction with DNA methylation. To our knowledge no model to date has represented the interplay between a deterministic folate cycle model and the discrete and stochastic methylation seeding events suggested to be associated with CpG island methylation [99]. A means of representing this could involve a hybrid modelling approach, where the model is partitioned and some variables are deterministic while others are stochastic. For example, a hybrid approach similar to that recently developed by Tyson and colleagues to model the cell cycle could be applied [100]. Alternatively, the approach adopted by Mallet and Pillis could be used, in which a hybrid model simulated the dynamics between a tumour and the immune system [101].
Various models of lipid metabolism are archived in Biomodels. These models include a Boolean network model of cholesterol homeostasis in the non-curated section [36] and a whole-body model of cholesterol metabolism in the curated section [14]. Such models are timely as recent findings indicate that variations in lipid profile could play a role in human longevity/healthy ageing [102,103]. Computational modelling is well suited to investigate the dynamic components of lipid metabolism, such as the interactions that take place between lipoproteins. Modelling can also investigate the impact of diet on lipid profile; this is important as a finely tuned lipid profile is a hallmark of certain individuals with exceptional longevity [103,104]. Furthermore, scope exists for integrating a lipid model with other pathways that interconnect during metabolic syndrome [103]. As Biomodels contains models of insulin regulation and glucose metabolism this is a future possibility.
Modelling could also support nutrient-focused investigations of whole-body responses to variations in dietary intake. However, standalone cellular/physiological models are an insufficient way of doing this at present. To address this issue, as we have outlined, both whole-body and multi-scale models are necessary [14]. These need to focus on multi-scale responses to nutrients over an extended range of time. This leap forward could be facilitated by the recent launch of the workflow-based software EPISIM [104,105]. EPISIM is a tool designed to assist with the semantic integration of models that are coded in the SBML framework. The workflows permit importing of and access to SBML-based models. Species, reactions and parameters are semantically integrated in cell behavioural models (CBM) represented by graphical process diagrams, thus facilitating the integration of models across different time scales. Interestingly, this tool can also be used together with Copasi. Recon 2 (http://humanmetabolism.org/) is another way in which this may become possible. Recon 2 is an international effort to comprehensively represent human metabolism in a computational format. According to the developers of Recon 2 it can predict changes in metabolite biomarkers for 49 inborn errors of metabolism with 77% accuracy when compared to experimental data [106,107]. The SBML schema for Recon 2 is located in the non-curated section of Biomodels (MODEL1109130000). Tools such as Recon 2 and EPISIM could make it possible to integrate whole-body biological systems from gene to tissue to organ. This is something that will have particular benefits for the field of nutrition.
Mark Mc Auley would like to thank Professor Kenneth Newport, Pro Vice-Chancellor of Research and Academic Development, for his continued support of this and other research he is actively engaged with.