Vilma G Duschak*
Instituto Nacional de Parasitología, ANLIS-Malbrán, Ministerio de Salud de la Nación, Argentina
Received date: June 10, 2015; Accepted date: November 20, 2015; Published date: November 24, 2015
Citation: Duschak VG (2015) Synthetic Biology: Computational Modeling Bridging the Gap between In Vitro and In Vivo Reactions. Curr Synthetic Sys Biol 3:127. doi:10.4172/2332-0737.1000127
Copyright: © 2015 Duschak VG. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Synthetic biology refers, first, to the design and fabrication of biological components and systems that do not already exist in the natural world and, second, to the redesign and fabrication of existing biological systems. By linking computational tools to cell-free systems, synthetic biology has become an emerging field that seeks to build artificial biological systems through the combination of molecular biology and engineering approaches. Herein, the main findings on the differences between in vivo and in vitro reactions and systems are described. The specific applications of computational tools to the design of an in vitro gene expression platform known as the artificial cell, its components, and the strategies developed to predict the activities of processor modules and to control gene expression are discussed in detail. Potential applications of artificial cells in drug delivery and biosynthesis, among others, are also described. Two sources of models for the possible development of a computational toolbox for cell-free synthetic biology are: i) physical models of single cellular components that can be built from first principles, which focus on tools to predict the structure and dynamics of particular components; and ii) a wide range of mathematical models for predicting the system dynamics of natural cells. Regarding modeling algorithms, a broad variety of models is available to synthetic biologists, and some areas of potential growth are identified for researchers interested in developing tools for cell-free systems. Among them, deterministic, exploratory, molecular dynamics, stochastic, and all-atom models are described and discussed. Using computational models to establish quantitative differences between in vitro reactions and in vivo systems could identify specific mechanisms in living organisms that can then be added to in vitro reactions to improve their processes. Thus, computational modeling would bridge the gap between in vitro and in vivo reactions.
Biological components; Molecular biology; Exploratory; Molecular dynamic; Stochastic
Over the last two decades, synthetic biology has pursued several goals aimed at the rational design of biological systems. From the start of synthetic biology in the 1990s to the explosion of genomics data in the early 2000s, a new discipline has emerged. The exact definition of synthetic biology is still an open question. The earliest uses of the concept appeared in 1978, with the discovery of restriction enzymes. The engineering of microorganisms for the production of compounds has been pursued for a long time. Now, with the arrival of genome-based methods, synthetic biology is a rigorous engineering discipline that seeks to create, control and program cellular behavior. From this perspective, the end goal is to be able to engineer a system or organism to perform as we want it to. This concept has also been used extensively in the discovery and understanding of natural products in different microorganisms. The aim of synthetic biology is to identify biological design principles that can be used for practical applications. Although the most recent innovations in synthetic biology have centered on research, they have occurred at the meeting point between rational design and natural complexity, with the final aim of developing biotechnological applications. The vast increase in DNA assembly techniques and genetic tools currently available to synthetic biologists, which has recently been reviewed, allows new functions to be achieved and useful metabolites to be produced in living cells in a controlled way. By linking computational tools to cell-free systems, synthetic biology is an emerging field that endeavors to build artificial biological systems through the combination of molecular biology and engineering approaches. Progress in the design and construction of synthetic genetic and protein networks has driven the growth of this field. This has led to the possibility of assembling modular components to arrive at novel biological functions and tools. In addition, these synthetic networks give rise to insights that facilitate the investigation of interactions and phenomena in naturally occurring networks. Combining well-characterized biological components into higher-order networks requires computational modeling approaches to rationally construct systems directed towards a desired end. A computational approach improves certainty about causal mechanisms that would otherwise be difficult to infer from experiments alone. The analysis and understanding of both qualitative and quantitative models also becomes increasingly important for taking a systems-level view of synthetic genetic and protein networks. The analogy between synthetic networks and circuit engineering, the computational modeling approaches that can be applied to biological systems, and the ways in which synthetic biology will help in the development of more precise in silico representations of these systems have recently been described in detail.
Synthetic biology refers to:
• The design and fabrication of biological components and systems that do not already exist in the natural world.
• The redesign and fabrication of existing biological systems.
In the first case, unnatural molecules are used to mimic natural ones with the aim of creating artificial life. In the second case, natural molecules are used and assembled into a system that acts artificially. In general, problems that are not easily understood by analysis and observation alone can be solved only by constructing novel models. To date, synthetic biology has produced diagnostic tools for diseases caused by viruses such as HIV and hepatitis viruses, as well as devices built from biomolecular parts with interesting functions. The term synthetic biology was first applied to genetically engineered bacteria created with recombinant DNA technology, and was then synonymous with bioengineering. Later, the term was used to mean the redesign of life, as an extension of biomimetic chemistry, in which organic synthesis is used to generate artificial molecules that mimic natural molecules such as enzymes. More recently, the engineering community has sought to extract components from biological systems and to test and validate them as building units that can be reassembled in ways that mimic living nature.
This engineering discipline builds on our mechanistic understanding of molecular biology to program microbes to carry out new functions. Such predictable manipulation of a cell requires modeling and experimental techniques to work together. The modeling component of synthetic biology allows one to design biological circuits and analyze their expected behavior. The experimental component connects models with real systems by providing quantitative data and sets of available biological “parts” that can be used to construct circuits. Sufficient progress has been made in the combined use of modeling and experimental methods to reinforce the idea that engineered microbes can be used as a technological platform. In the engineering aspect of synthetic biology, suitable parts are those that contribute independently to the whole system, so that the behavior of an assembly can be predicted. DNA consists of two antiparallel strands, each built from four different nucleotides assembled from bases, sugars and phosphates, which are made of carbon, nitrogen, oxygen, hydrogen, and phosphorus atoms. The simplicity of the pairing of A with T and C with G, which holds with only minor changes to the Watson and Crick model, is not found in complex proteins. Analysis and observation allow researchers to persuade themselves that their paradigms are true; when data contradict the theory, the data are discarded as errors. Synthesis, in contrast, pushes researchers to identify new theories. Synthesis has long been used in chemistry. The combination of chemistry, biology and engineering can therefore create Darwinian systems.
Synthetic biology based on a six-letter genetic alphabet that includes the two nonstandard nucleobases isoguanine (isoG) and isocytosine (isoC), as well as the standard A, T, G and C, is known to suffer from a minor tautomeric form of isoguanine that pairs with thymine and therefore leads to infidelity during repeated cycles of PCR. It was recently determined that the A, 2-thioT, G, C, isoC, isoG alphabet is an artificial genetic system capable of Darwinian evolution.
The research field of synthetic biology combines the investigative nature of biology with the constructive nature of engineering. So far, most efforts in synthetic biology have focused on the creation and refinement of genetic devices and of the small modules constructed from these devices. To view cells as truly ‘programmable’ entities, however, it is now essential to develop effective strategies for assembling devices and modules into complex, customizable, larger-scale systems. The step from modules to systems represents the second wave of synthetic biology. The ability to create such systems should lead to innovative approaches for a wide range of applications, such as bioremediation, sustainable energy production and biomedical therapies.
Finally, synthetic biology is expected to create great opportunities in a wide range of areas, including foods, therapeutics, and diagnostics subject to regulatory oversight by the United States Food and Drug Administration (FDA). At the same time, there are concerns about how to assess precisely the human health and environmental risks of such synthetic biology products. Productive Oversight Assessment (POA) aims to develop a generalizable approach for planning and decision-making about the oversight of any given new technology. It is intended to strengthen preventive and adaptive approaches by identifying the conditions that make them possible and by providing data to support future normative discussions about the control of emerging technologies.
Cell-free synthetic biology is emerging as a powerful technology for understanding, harnessing, and expanding the capabilities of natural biological systems without using intact cells. Cell-free systems bypass cell walls and remove genetic regulation to enable direct access to the inner machinery of the cell. The unprecedented level of control and freedom of design, relative to in vivo systems, has inspired the rapid development of engineering foundations for cell-free systems in recent years. Current cell-free expression systems lack spatial organization, protein transport, and folding, as well as various non-DNA-binding factors that modulate gene expression in living organisms. Although these differences between in vivo and in vitro reactions are qualitative, they could produce quantifiable differences in dynamical behavior between the two systems, which would require different modeling approaches.
Mathematical models have become more commonly integrated into the study of biology as a means of describing biological processes. Several tools have emerged for the simulation of in vivo synthetic biological systems, but there are only a few well-known examples of work on predicting the dynamics of cell-free synthetic systems. At the same time, the first studies of the dynamics of in vitro systems encapsulated by amphiphilic molecules opened the door to the development of a new generation of biomimetic systems. Here, in vivo and in vitro models of biochemical networks are considered with a special focus on tools that could be useful for designing cell-free expression systems. Quantitative studies of complex cellular mechanisms and pathways in synthetic systems can yield important insights into the differences between cells and conventional chemical systems. With the aim of simplifying the understanding of biological systems by constructing biochemical pathways and computational models that reproduce the behavior of those pathways, synthetic biologists first documented the modeling and simulation of genetic regulatory systems. They also outlined the basic features of synthetic biology as a new engineering discipline, covering examples from the literature and reflecting on the features that make it unique among existing engineering fields. Self-repair and proofreading are cellular processes that have not been considered when constructing cell-free synthetic systems [11-14]. The lack of these features in in vitro systems could complicate the adaptation of existing computational tools to the design of cell-free systems.
Interesting details about the architecture of biological networks were revealed by the shift toward an engineering mode of conducting experiments. This shift, however, has largely ignored the integration of older methods of biological analysis, in particular in vitro biology. In vitro synthetic biology is an emerging area focused on complex biosynthesis, directed evolution, and reconstitution of biological functions. The construction of a chemical system capable of replication and evolution, fed only by small-molecule nutrients, could be achieved by stepwise integration of decades of work on the reconstitution of DNA, RNA and protein synthesis from pure components. In recent years, the design of in vivo systems has driven the rapid development of engineering foundations for cell-free systems, which offer a versatile test-bed for understanding why nature’s designs work efficiently and also enable biosynthetic routes to novel chemicals, sustainable fuels, and new classes of tunable materials. The emergence of cell-free systems opens the way to novel products that until now have been unfeasible to produce by other means, that are transformed by biochemical engineering, or that require novel bioproduction strategies [17-19].
In vitro reactions (sometimes called cell-free systems) are defined as a collection of biochemical components used to quantify properties of biological systems and/or produce biological products, such as nucleic acids, polypeptides, or metabolites. Conventional in vitro systems are routinely used in biochemistry to:
1) Measure binding affinity. Early examples include the binding of steroids and phytooestrogens to their cognate receptors; the binding affinities of 23 halogenated dibenzo-p-dioxins and dibenzofurans for hepatic cytosol-binding species of C57BL/6J mice, which correlated closely with the potencies of these compounds as inducers of hepatic aryl hydrocarbon hydroxylase activity; and the in vitro binding affinity of the AbrB protein, a transcriptional regulator of many Bacillus subtilis genes, for six different DNA target regions.
2) Assess reactivity, in order to evaluate interactions between proteins and sequence-specific interactions between DNA and chromosomal proteins. Examples include the analysis of lipid-protein complexes by circular dichroism, which showed an increase in helical structure concomitant with lipid-protein binding and thereby demonstrated lipid-protein interactions in high-density lipoproteins; and the analysis of the distribution of in vivo and in vitro IgE to cross-reacting carbohydrate determinants, which were found to be common among the allergic population.
3) Determine the molecular structure of cellular components. Examples include the use of formaldehyde as a DNA-protein cross-linker to probe the distribution of nucleosomes in chromatin in vivo; and RNA chemical probing technologies, whose applications in the last decade include new sequence-independent RNA chemistries, algorithmic tools for high-throughput analysis of complex data sets composed of thousands of measurements, new approaches for interpreting chemical probing data for both secondary and tertiary structure prediction, and simple methods for following time-dependent processes in RNA structural biology.
In vitro and In vivo systems
Since the early discoveries of a soluble ribonucleic acid intermediate in protein synthesis and of the synthesis of a coat protein in Escherichia coli extracts containing phage RNA, reconstituted in vitro systems have been used to demonstrate the molecular basis of transcription and translation in vivo [26,27]. In vitro systems are also used in high-throughput screening of proteins, RNA and DNA. The following examples can be mentioned.
1) Proteins. Libraries of native folded proteins can be screened and made to evolve in a cell-free system, without any transformation or constraints imposed by the host cell, by using ribosome display. Global analysis of protein activities using proteome chips allows microarrays of an entire eukaryotic proteome to be prepared and screened for diverse biochemical activities; such microarrays can also be used to screen protein-drug interactions and to detect post-translational modifications. In addition, the resources and expression technology needed for human proteomics have been developed to convert the transcriptome into an in vitro-expressed proteome for research use.
2) RNA. RNA molecular switches were created by a combinatorial strategy named “allosteric selection”, which favored the emergence of ribozymes that rapidly self-cleave only when incubated with their corresponding effector compounds. In the presence of its effectors, an allosteric ribozyme ligase generates templates that can subsequently be amplified using conventional amplification technologies such as RT-PCR; thus, by in vitro selection, the allosteric ribozyme can transduce analytes into amplicons. A broadly applicable method for coupling a newly selected aptamer to a ribozyme to generate functional aptazymes via in vitro and in vivo selection was also described. Both synthetic biology and metabolic engineering are aided by the development of genetic control parts. Among riboswitch parts, aptamer-coupled ribozymes (aptazymes) have great potential for sensing and regulating protein levels. Thus, a dual selection for the evolution of in vivo functional aptazymes as riboswitch parts was developed.
3) DNA. More than two decades ago, procedures were required to facilitate the rapid study of sequence-specific interactions of proteins and DNA. A general method for the in vitro preparation and specific mutagenesis of DNA fragments was developed. Specific, end-labeled DNA fragments prepared by PCR are suitable for use in DNase I protection footprint assays, chemical sequencing reactions, and the production and analysis of paused RNA polymerase transcription complexes, and a specific mutation can be introduced at any position along the length of the PCR fragment.
These examples show that high-throughput screening of RNA compounds is often used in directed evolution experiments to develop riboswitches and other auto-catalytic RNA structures that can be useful in in vitro biosynthetic applications. The properties of in vitro systems resonate with the approaches of synthetic biology and have in fact been exploited to create complex circuitry in cell-free systems. Examples include 1) the construction of an in vitro bistable circuit from synthetic transcriptional switches; 2) synthetic in vitro transcriptional oscillators; and 3) the bottom-up construction of in vitro switchable memories. Regarding information processing using biochemical circuits, which is essential for the survival and reproduction of natural organisms, artificial transcriptional networks were engineered that consist of synthetic DNA switches, regulated by RNA signals acting as transcription repressors, and two enzymes, T7 RNA polymerase and Escherichia coli ribonuclease H. An in vitro bistable memory was constructed by wiring together two synthetic switches.
Construction of larger synthetic circuits provides an opportunity for evaluating model predictions and the design of complex biochemical systems, and such circuits could be used to control nanoscale devices and artificial cells. Regarding the three synthetic in vitro transcriptional oscillators, a negative-feedback oscillator comprising two switches, regulated by excitatory and inhibitory RNA signals, was designed first and showed up to five complete cycles. Finally, a three-switch ring oscillator was constructed and analyzed. Mathematical modeling guided the design process. In this way, an in vitro oscillator was developed by using cellular machinery to transcribe a pair of nicked-promoter constructs. The first construct produces a transcript that inhibits the second construct by strand displacement, while the second produces an RNA oligonucleotide that activates the first. The system thus forms a negative feedback loop that produces oscillations in the activities of the promoters. Synthetic transcriptional oscillators could prove valuable for the systematic exploration of biochemical circuit design principles and for controlling nanoscale devices and orchestrating processes within artificial cells. Concerning the bottom-up construction of in vitro switchable memories, a bistable system, a two-input switchable memory element, and a single-input push-push memory circuit were reported. These results suggest that it is possible to build complex time-responsive molecular circuits and provide an unmatched opportunity to study topology/function relationships within dynamic reaction networks.
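The idea of a bistable memory built from two mutually inhibiting switches can be illustrated with a generic toggle-switch model. The sketch below is only an illustration under stated assumptions, not the model used in the cited work: it uses dimensionless Hill-type mutual repression, the parameters `alpha` and `n` are hypothetical values chosen to give bistability, and integration is by simple forward Euler steps.

```python
def simulate_toggle(u0, v0, alpha=10.0, n=2.0, dt=0.01, steps=5000):
    """Integrate a generic two-switch mutual-repression model.

    du/dt = alpha / (1 + v**n) - u
    dv/dt = alpha / (1 + u**n) - v

    u and v are dimensionless activities of the two switches. alpha and n
    are hypothetical parameters (assumptions for illustration only).
    """
    u, v = u0, v0
    for _ in range(steps):
        du = alpha / (1.0 + v ** n) - u
        dv = alpha / (1.0 + u ** n) - v
        u += dt * du
        v += dt * dv
    return u, v


if __name__ == "__main__":
    # Two different starting points settle into two different stable states,
    # which is the defining property of a bistable memory.
    print(simulate_toggle(3.0, 1.0))
    print(simulate_toggle(1.0, 3.0))
```

Whichever switch starts ahead ends up dominant and holds the other off, so the final state records the initial condition. A real in vitro circuit would need measured rate constants and enzyme kinetics in place of the dimensionless terms used here.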
Integrating in vitro systems with other materials creates hybrid constructs, and some examples are described here. Chemical sensors respond to the presence of a specific analyte in a variety of ways; one of them is a change in optical properties, particularly a visible color change, as in polymerized colloidal crystals, which have been applied to a range of analytes, including viruses. In addition, novel hybrid hydrogels containing rationally designed single-stranded DNA (ssDNA) as the cross-linker have potential in applications such as DNA-sensing devices and DNA-triggered actuators. Given that living organisms have unique homeostatic abilities, a versatile strategy was presented for creating self-regulating, self-powered, homeostatic materials capable of precisely tailored chemo-mechano-chemical feedback loops on the nano- or microscale. Furthermore, nanometer-scale transmembrane channels in lipid bilayers were created by means of self-assembled DNA-based nanostructures. Single-channel electrophysiological measurements and single-molecule translocation experiments showed that the synthetic channels could be used to discriminate single DNA molecules. Moreover, a second-generation, glucose-driven chemo-mechanical autonomous drug release system was fabricated and evaluated for feedback control of blood glucose in diabetes, without any requirement for external energy. One of the most active directions in synthetic protocell biology is the “reconstruction” approach, in which macromolecules are encapsulated in vesicles or liposomes and catalyze metabolic functions necessary for the life cycle of the protocell [44, 45]. Notable examples include the synthesis of poly-A RNA in self-reproducing vesicles, the replication of an RNA template in liposomes, and the compartmentalization of PCR [47,48]. This work verified that enzymatic activity could occur inside a liposome and direct the de novo synthesis and replication of nucleic acids. However, some studies have shown that liposomes can affect gene expression [49, 50].
In contrast to the reconstruction approach, the Los Alamos Bug model and its variants, which are based on the self-assembly approach, have been shown to fulfill this condition, an emergent property of the catalytic coupling of the protocell’s subsystems. In contrast to these catalysis-based protocell models, the chemoton model, a stoichiometric model of a protocell, requires the coordinated growth of all its components by means of an exactly imposed stoichiometry in the transformation of nutrients into metabolites and waste. Further efforts in the field yielded the enzymatic synthesis of membrane lipids inside liposomes to increase compartment size, evidence of base-pair recognition between components of a phosphatidyl nucleoside membrane, and poly(Phe) production inside liposomes loaded with ribosomal components [54,55]. With the advent of encapsulated protein synthesis, there was a focused attempt to reproduce key features of cellular systems using artificial cells [56-58]. Recently, an in vitro approach was proposed that offers synthetic biologists an alternative: freeze-dried in vitro reactions stored on paper disks for use in portable diagnostic systems. These reactions were programmed for in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors.
Cell-free protein synthesis is becoming a serious alternative to cell-based protein expression. Previous work on controlling cellular behavior has mainly been based on genetic engineering. However, other methods of cellular control are possible. For example, artificial, non-living cellular mimics could be engineered to regulate existing natural sensory pathways of living cells through chemical communication. Artificial cells can perform fundamental functions associated with natural cells, such as: a) formation of membrane pores via alpha-hemolysin expression, with concurrent cell-free expression and insertion of membrane proteins into phospholipid bilayers, for which a model describing the kinetics of adsorption of the expressed protein on the phospholipid bilayer has been presented; b) execution of genetic programs, such as the introduction of a positive feedback loop into a LacI-dependent gene expression system in lipid vesicles, producing a cell-like system that senses and responds to an external signal with a high signal-to-noise ratio; and c) other processes associated with sensing and responding to the environment, such as synthetic riboswitches that can be used to control protein expression under fully defined conditions in vitro, in water-in-oil emulsions, in vesicles, and in artificial cells [62,63].
Quantifying the effects of molecular phenomena such as encapsulation, molecular crowding, and reaction volume on the performance of in vitro transcription and translation could provide insights into key molecular phenomena that affect in vivo gene expression, and could also show how molecular transporters and secondary metabolic reactions modulate the homeostasis of natural cells. Molecular crowding, encapsulation, and reaction volume all profoundly affect the stochastic variation of gene expression, which in turn affects the choice between mass-action and ordinary differential equations for predicting protein synthesis (Figure 1). In addition, cell-free systems lack the continuous supply of substrates, supplementary transcription factors, and chaperones, which could make the rates of peptide and/or metabolite production differ radically from those in vivo. These factors could change the kinetic parameters that apply in in vitro systems. Although artificial cells cannot currently undergo self-reproduction, they have been used to gain insight into features of natural cells, including the effects of molecular crowding on the interactions and kinetics of the fundamental machinery of gene expression, which has a direct impact on our understanding of biochemical networks in vivo. Related work has examined how the robustness of gene expression can be increased by integrating synthetic cellular components of biological circuits with artificial cellular nanosystems, the effects of compartmentalization on the kinetics of multimeric protein synthesis, and RNA-facilitated encapsulation [66-68]. Primordial cells presumably combined RNAs, which functioned as catalysts and carriers of genetic information, with an encapsulating membrane of aggregated amphiphilic molecules. Aggregates of a prebiotic amphiphile bind certain heterocyclic bases and sugars, including those found in RNA, and this binding stabilizes the aggregates. These mutually reinforcing mechanisms might have driven the emergence of protocells.
Figure 1: Modeling-based questions of cell-free and in vivo synthetic systems. A hypothetical synthetic pathway (boxes and arrows) is modeled on a computer. The simulated expression dynamics are compared to biological and cell-free interactions of the process of interest. By using computational models to establish quantitative differences between in vitro reactions and in vivo systems, mechanisms in living organisms that contribute to desirable network behavior could be identified. These mechanisms could be added to in vitro reactions, conferring useful properties on their processes. This way, computational modeling would bridge the gap between in vitro and in vivo reactions (Adapted from Lewis et al.).
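The point that stochastic variation in gene expression influences the choice of modeling approach can be made concrete with a toy comparison. The sketch below is a minimal illustration, assuming a single species produced at a constant rate and degraded with first-order kinetics; the rate constants `k_syn` and `k_deg` are hypothetical, and the stochastic part uses the Gillespie direct method rather than any tool named in the text.

```python
import random


def deterministic_mean(k_syn, k_deg):
    """Steady state of dn/dt = k_syn - k_deg * n (deterministic model)."""
    return k_syn / k_deg


def simulate_birth_death(k_syn, k_deg, t_end, seed=0):
    """Gillespie simulation of synthesis (rate k_syn) and degradation
    (rate k_deg * n). Returns the time-averaged copy number over [0, t_end].
    """
    rng = random.Random(seed)
    t, n = 0.0, 0
    weighted_sum = 0.0  # integral of n(t) dt
    while True:
        total_rate = k_syn + k_deg * n
        wait = rng.expovariate(total_rate)  # time to the next reaction
        if t + wait >= t_end:
            weighted_sum += n * (t_end - t)
            break
        weighted_sum += n * wait
        t += wait
        # Choose which reaction fires, in proportion to its rate.
        if rng.random() * total_rate < k_syn:
            n += 1
        else:
            n -= 1
    return weighted_sum / t_end


if __name__ == "__main__":
    print("deterministic:", deterministic_mean(10.0, 1.0))
    print("stochastic   :", simulate_birth_death(10.0, 1.0, 2000.0, seed=1))
```

Over a long run the stochastic time average approaches the deterministic steady state, but individual trajectories fluctuate around it. Those fluctuations become relatively larger as copy numbers fall, which is the regime expected in small encapsulated volumes.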
This section will describe and discuss the specific applications of computational tools to the design of an in vitro gene expression platform known as the artificial cell. To describe a model of artificial cells, the components of the system have been classified into the Input, Processor, Output, and Shell.
a) The Input is defined as the starting concentrations of enzymes, metabolites, co-factors, substrates, inducers and chemical energy that are present in a system and used in the execution of in vitro reactions. These factors are known to alter gene expression significantly. However, their effects on gene expression cannot readily be perturbed in vivo. Mathematical modeling has addressed this problem by providing a framework for evaluating Input concentrations precisely. In particular, a computational model was produced to determine the epistatic interactions among a large number of components in an in vitro transcription/translation system. A different model, using a machine learning algorithm capable of stochastically exploring differences among the components of the Input, was also reported. The resulting patterns of protein expression showed either a rapid rise in protein production or an initial decrease in protein level before a rapid increase. To develop new in vitro models that describe the effects of the Input on gene expression, a stochastic model of an in vitro transcription/translation system was used with different reaction components, such as potassium and magnesium, to study the effect of environmental changes on gene expression, and physiological and supplemented concentrations of intracellular components within cell-free systems were compared [66,72].
b) The Processor is defined as the cellular circuit (DNA sequence) that dictates the genetic composition and the functional relationships between genes, together with the machinery required to interpret it, such as RNA polymerases, the transcription factors involved, and the translation machinery.
c) The Output is defined as the concentration of the final product(s) of a system, that is, the product of the activity of the Processor (metabolites for enzymatic reactions, mRNA for transcription systems, protein for coupled transcription/translation systems, etc.). A complete understanding of gene expression involves both the Processor and the Output. These modules are decisive for connecting input signals to the functions of synthetic biological systems.
d) The Shell is defined as the liposome barrier that controls the interaction between artificial cells and the environment and/or isolates the Input and the Processor from the environment. The diameter of the Shell can determine the degree of molecular crowding and the reaction rates of the Processor. In addition, the Shell controls the import of signals from the environment and the export of Output compounds from the intracellular space of artificial cells. With the advent of synthetic biology, it has become possible to apply the functioning principles of natural membranes to the control of the Shell of artificial cells. Natural membranes use several strategies involving membrane proteins and lipid rafts to exchange information with the environment. Computational tools have helped to reproduce the complexity of natural membranes, closing the gap between natural and artificial membranes by simulating the dynamics of lipid bilayers and their interactions with the environment [74-76].
Strategies developed to predict activities of processor modules and to control the expression of genes.
Several of these approaches have been validated primarily in natural cells, but could be adapted for cell-free systems. Among them are the following: -Sequence-based control of promoter transcriptional activity. To control the expression levels of target proteins, cells have evolved mechanisms to regulate both transcription and translation rates. Libraries of artificial promoters with different sequences have been compared by measuring the accumulation of reporter proteins [77-82]. In some cases, the experimental data allowed the construction of inference models that relate promoter sequence to promoter strength.
Prediction of promoter strength. Various approaches are used to predict promoter strengths. Promoter strength is defined as the association rate constant of the RNAP-promoter complex. Thus, if the binding affinity of RNAP for a promoter is low, the promoter is “weak”, resulting in a reduced transcription rate. By contrast, if the affinity is high, the promoter is “strong”, which increases Output accumulation. Among cellular biochemical functions, gene transcription has been described with Hill function-like differential equations in order to predict the activity of a promoter based on the binding affinity of RNAP and regulatory transcription factors for DNA, the position of the promoter, and the position of other regulatory sequences in the promoter. To study regulatory elements in cell-free systems, a method based on Hill functions demonstrated that the relative activities of promoters correlated well between cell-free systems and bacterial systems.
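The Hill-type description of promoter activity mentioned above can be illustrated with a minimal sketch. The function names, the parameters (`k_d`, `n`, `v_max`, `basal`) and the normalized units are assumptions made for illustration; they are not taken from the cited studies.

```python
def hill_activation(tf, k_d, n, v_max=1.0, basal=0.0):
    """Promoter activity under an activating transcription factor.

    tf    : transcription factor concentration (arbitrary units, assumed)
    k_d   : concentration giving half-maximal activity (same units as tf)
    n     : Hill coefficient (cooperativity)
    v_max : maximal regulated activity
    basal : leaky, unregulated activity
    """
    if tf < 0 or k_d <= 0:
        raise ValueError("tf must be >= 0 and k_d must be > 0")
    x = (tf / k_d) ** n
    return basal + v_max * x / (1.0 + x)


def hill_repression(tf, k_d, n, v_max=1.0, basal=0.0):
    """Promoter activity under a repressing transcription factor."""
    if tf < 0 or k_d <= 0:
        raise ValueError("tf must be >= 0 and k_d must be > 0")
    x = (tf / k_d) ** n
    return basal + v_max / (1.0 + x)


if __name__ == "__main__":
    # Activity is half-maximal when tf equals k_d.
    print(hill_activation(tf=2.0, k_d=2.0, n=2))
    print(hill_repression(tf=2.0, k_d=2.0, n=2))
```

In such a sketch a "strong" promoter would correspond to a larger `v_max` and a "weak" one to a smaller value; comparing relative activities between systems, as in the cell-free study mentioned above, only requires ratios of these outputs.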
To predict the strength of a promoter from its nucleotide sequence, different models describing the causal associations between promoter sequences and their affinity for RNAP were constructed. In thermodynamics-based models, the binding energy of RNAP to DNA has been considered to be a linear sum of the individual energy barriers of each base in the promoter sequence. Some differences have been observed between models and experimental findings. False positives were reported when promoters predicted to exhibit strong transcription rates were either weak or inactive in vivo [86,79]. False negatives were reported when some strong promoters were not identified by the models. Other models have been developed and used to predict bacterial RNAP-promoter activity in vivo from promoter sequences. Predictive models of promoter strength in vitro are less well established than in vivo models. Strengths of promoters with mutations in the -35/-10 sequences and the upstream UP elements were assessed, and the activity of the promoters was determined both in vivo and in vitro. A model to estimate the effects of the UP elements on promoter strength was constructed, and position weight matrices were built for each motif in the promoters [79,89]. With this approach, the predicted and observed promoter strengths correlated strongly in both in vivo and in vitro systems.
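The additive assumption behind these models, where each base contributes independently to the overall score, can be sketched with a position weight matrix. The aligned sites below are hypothetical toy data, and the uniform background and pseudocount are assumptions; the sketch is not the model used in the cited studies.

```python
import math

BASES = "ACGT"


def build_weight_matrix(sites, pseudocount=1.0):
    """Build per-position log-odds scores from aligned sites.

    Scores are log2(frequency / 0.25), i.e. relative to a uniform
    background (an assumption of this sketch).
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for pos in range(length):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append({b: math.log2((counts[b] / total) / 0.25) for b in BASES})
    return matrix


def score_sequence(matrix, seq):
    """Additive score: the sum of independent per-base contributions."""
    if len(seq) != len(matrix):
        raise ValueError("sequence length must match the matrix length")
    return sum(column[base] for column, base in zip(matrix, seq))


if __name__ == "__main__":
    # Hypothetical -10 hexamer variants, for illustration only.
    toy_sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
    pwm = build_weight_matrix(toy_sites)
    print(score_sequence(pwm, "TATAAT"), score_sequence(pwm, "GGGCCC"))
```

Because the score is a plain sum, such a model cannot capture interactions between positions, which is one plausible reason for the false positives and false negatives noted above.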
Modeling T7 promoter activity for cell-free circuits: The use of bacterial RNAPs has been difficult because of their multimeric composition and low transcription efficiency in cell-free systems and artificial cells. As an alternative, the monomeric phage-derived RNAP, T7-RNAP, which is not regulated by transcription factors and binds to specific T7 promoters, can simplify the design and application of in vitro genetic circuits. To our knowledge, no sequence-based predictive model of T7 promoter strength has yet been published. Interestingly, T7 promoters with mutations in bases -11 to -8 were assayed using a split T7 RNAP, with the C-terminal and N-terminal fragments individually expressed. Additionally, a library of twenty-one T7 promoters was characterized in cell-free platforms using an in vitro transcription/translation system. There are reports suggesting that a sequence-based predictive model of T7 promoter strength could be developed and subsequently validated for in vitro control of gene expression levels [92, 93].
Control of translation initiation rate by modification of the RBS (ribosomal binding site). Translation initiation rates can be controlled to modulate Output protein accumulation in vitro. The three main steps of the translation process are initiation, elongation, and termination. In bacteria, once the 16S rRNA of the small (30S) ribosomal subunit interacts with the Shine-Dalgarno (SD) sequence, the initiation complex is completed by the binding of initiation factors and the large (50S) ribosomal subunit. Additional sequences upstream and downstream of the SD sequence determine the translation initiation rate. Together with the SD sequence, these sequences form the RBS. RBS strengths can be predicted using multiple tools, including RBS Calculator and RBS Designer, among others [96,97]. These tools compute the difference in free energy between the folded secondary structure of an RBS (mRNA not bound to ribosomes) and its unfolded state (bound to the ribosome). The characteristics of RBS models have been reviewed in detail. These tools have been successfully applied to fine-tune protein translation in natural cells. The use of RBS Calculator revealed that modification of the RBS sequence altered protein levels, whereas mRNA accumulation was not modified. UTR Designer and RBS Designer were used in vivo to adjust the translation rates of genes in metabolic operons, to construct a predictive library of RBS strengths, and to show accumulation of reporter genes in a light-inducible expression system, suggesting that RBS-based models can be used to predict the accumulation of target proteins in vivo [101,102].
Conversely, few publications evaluating RBS strengths in cell-free systems have been reported. One of them showed that the relative strengths of RBSs were similar in in vitro and in vivo systems, suggesting that RBS strengths in vitro could be estimated using existing tools developed for in vivo systems, probably because RBS models rely only on the RBS sequence and its secondary structures, in addition to interactions between the RBS and ribosomes.
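The free-energy comparison performed by such RBS tools can be illustrated with a minimal sketch in which the relative translation initiation rate scales as exp(-beta * dG_total). The value of `BETA`, the function names and the example energies are assumptions chosen for illustration, and the free energies themselves would have to come from an RNA folding program that is not shown here.

```python
import math

# Apparent Boltzmann-like factor in mol/kcal; an assumed, illustrative value.
BETA = 0.45


def delta_g_total(dg_bound, dg_folded):
    """Free-energy difference (kcal/mol) between the ribosome-bound,
    unfolded state and the folded, unbound mRNA state."""
    return dg_bound - dg_folded


def relative_initiation_rate(dg_total, beta=BETA):
    """Relative (unitless) translation initiation rate; more negative
    dg_total gives a higher predicted rate."""
    return math.exp(-beta * dg_total)


def fold_change(dg_total_a, dg_total_b, beta=BETA):
    """Predicted ratio of initiation rates of RBS a over RBS b."""
    return math.exp(-beta * (dg_total_a - dg_total_b))


if __name__ == "__main__":
    # Hypothetical energies for two RBS variants (kcal/mol).
    rbs_a = delta_g_total(dg_bound=-12.0, dg_folded=-7.0)  # -5.0
    rbs_b = delta_g_total(dg_bound=-9.0, dg_folded=-6.0)   # -3.0
    print(fold_change(rbs_a, rbs_b))
```

Only ratios between RBS variants are meaningful in this sketch, which matches the observation above that relative, rather than absolute, RBS strengths were compared between in vitro and in vivo systems.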
Control of gene output accumulation: Terminators are essential for the detachment of RNAP from DNA and the release of the synthesized RNA. In the absence of efficient terminators, RNAP continues transcribing along the DNA, diminishing the pool of RNAP available to initiate new rounds of transcription. Intrinsic terminators require no additional factors to be recognized by RNAP. In fact, the strengths of intrinsic terminators have been estimated from their DNA sequence alone in a number of biophysical models [103,104]. In this sense, a library containing more than 500 terminators was characterized in E. coli, and the measured strengths of the terminators were in line with the strengths predicted by a simple thermodynamic model. In addition, T7-RNAP terminators with almost maximal efficiency have been developed. These models were tested in vivo; however, the fact that intrinsic terminators do not require additional factors suggests that they could potentially be applied to cell-free systems. Output accumulation can also be controlled by other factors. Among them, translation efficiency can be altered by the target gene sequence, affecting the concentrations of synthesized proteins. In this sense, codons can be designed and optimized for a specific host or in vitro system to obtain the highest protein production. In addition, the activity of a promoter and an RBS depends on upstream and downstream sequences [108,95]. Thus, when designing cell-free systems, the use of insulator sequences that shield the regulatory sequences from the influence of surrounding sequences must be taken into account [109,110].
Metabolites as output and cell-free metabolic pathways. Metabolic pathways can be predicted and controlled. Metabolites with high economic value, including antibiotics, chiral compounds, and proteins, are termed biocommodities. Although biocommodities are usually produced using microorganisms with engineered metabolic pathways, the complexity of a biosynthetic pathway can be reduced by isolating it from the cellular metabolic network and engineering it specifically to produce the desired target at determined rates. Because synthesizing a biocommodity in vivo can be associated with toxicity, in vitro production of metabolites could solve this problem. In addition, theoretical calculations of product-to-biocatalyst weight ratios showed that in vitro systems achieve total turnover numbers (TTNW) several orders of magnitude higher than microbial-based production. Cell-free systems have been shown to efficiently produce metabolites and proteins [97,112]. The development of models that precisely predict the productivity of in vitro systems could improve the synthesis of biocommodities. Several tools are available for the design of cell-free metabolic pathways.
Databases for metabolic pathways. Public access databases such as KEGG, MetaCyc, ChEBI, and RHEA have proved useful for the design of metabolic pathways. BRENDA, a database containing molecular and biochemical data on enzymes, can be used to select the core pathway able to produce the metabolite of interest. Web servers such as From-Metabolite-To-Metabolite (FMM) and Metabolic Route Search and Design (MRSD) can also be used to design synthetic and unique metabolic pathways in cell-free systems. Metabolic Tinker can be used to identify and rank thermodynamically favorable pathways between two compounds, which may include novel, non-natural pathways. The XTMS platform can help to rank pathways based on enzymatic efficiency and maximum pathway yields. Flux balance analysis (FBA) is commonly used to calculate the relative contribution of each enzymatic step in a pathway when optimization of a particular objective function is required. The flux calculation is based on the stoichiometry of the metabolic pathway, and several computational tools are available to solve FBA problems, such as the COBRA toolbox for MATLAB and the open-source version COBRApy.
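For reference, the FBA problem described above is usually written as a linear program. The notation below is the generic textbook form and is not taken from the cited tools: S is the stoichiometric matrix (metabolites by reactions), v the vector of reaction fluxes, c the weights defining the objective function, and v_min and v_max the flux bounds.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S\,v = 0, \\
& v_{\min} \le v \le v_{\max}.
\end{aligned}
```

The constraint S v = 0 expresses the steady-state assumption that no internal metabolite accumulates, which is why the calculation depends only on pathway stoichiometry and bounds and not on enzyme kinetics.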
Potential applications of artificial cells
In drug delivery. One such case is the use of pegylated liposomal doxorubicin (Doxil): cumulative doses of Doxil in excess of 500 mg/m2 appeared to carry a considerably lower risk of the cardiomyopathy generally associated with free doxorubicin, as judged by serial left ventricular ejection fraction and clinical follow-up. In addition, liposomes are widely used in molecular biology and medicine as drug carriers. Thus, liposome-cell interaction through connexins was reported as a new method for direct cytosolic delivery of hydrophilic molecules. The biochemical research approach known as semi-synthetic minimal cells, which are liposome-based systems able to synthesize lipids at the liposome surface, consists of reconstituting membrane proteins within liposomes and allowing them to synthesize lipids. Current, more complex models, however, require a full reconstruction of the biochemical pathway, including the synthesis of functional membrane enzymes inside liposomes, followed by the local synthesis of lipids catalyzed by the enzymes synthesized in situ.
In biosensors. Synthetic riboswitches were used to control protein expression under fully defined conditions in vitro, in water-in-oil emulsions, and in vesicles. The system developed could serve as a foundation for the construction of cellular mimics that respond to particular selected molecules. Biochemical approaches to membrane receptors have for years been limited to the following methods: knockout or overexpression of membrane receptors by gene introduction and genome engineering, or extraction of membrane receptor-surfactant complexes from native cells and their introduction into model biomembranes. A novel method involving gene expression by cell-free in situ protein synthesis inside model biomembrane capsules was described. This method was verified by synthesizing olfactory receptors from the silkmoth Bombyx mori inside giant vesicles and finding that they were excited in the presence of their cognate pheromone ligand. Given that all cells sense and respond to their environment, artificial, nonliving cellular mimics could be engineered to activate or repress existing natural sensory pathways of living cells through chemical communication. The construction of such a system was described. The artificial cells expanded the senses of E. coli by translating a chemical message that E. coli cannot sense on its own into a molecule that activates a natural cellular response. Thus, artificial and natural cells were integrated to translate chemical messages that direct E. coli behavior. This methodology could open new opportunities for engineering cellular behavior without exploiting genetically modified organisms.
In biosynthesis. The synthesis and activity inside liposomes of two membrane proteins involved in the phospholipid biosynthesis pathway were shown. The activities of internally synthesized glycerol-3-phosphate acyltransferase (GPAT) and lysophosphatidic acid acyltransferase (LPAAT), encapsulated in liposomes using a totally reconstructed cell-free system (the PURE system), were confirmed by detecting the lysophosphatidic acid and phosphatidic acid produced, respectively. Through this procedure, the first phase of a design aimed at synthesizing phospholipid membrane from liposomes was implemented. Proteoliposomes were prepared directly by synthesizing membrane proteins with minimal protein synthesis factors isolated from E. coli (the PURE system) in the presence of liposomes. This was the first report showing that a cell-free-synthesized, water-insoluble membrane protein is directly integrated with a uniform orientation as a functional oligomer into liposome membranes, indicating that a simple proteoliposome preparation procedure should be a valuable approach for structural and functional studies of membrane proteins. Because the physical interaction between the cytoskeleton and the cell membrane is essential in defining the morphology of living organisms, a synthetic approach was used to polymerize bacterial MreB filaments inside phospholipid vesicles. When the proteins MreB and MreC are expressed inside the liposomes, the MreB cytoskeleton structure develops at the inner membrane, indicating that the fibrillation of MreB filaments can take place either in close proximity to a deformable lipid membrane or in the presence of an associated protein. These findings might be relevant for the self-assembly of cytoskeleton filaments toward the construction of artificial cell systems.
In bioenergy. Active inclusion bodies of recombinant polyphosphate kinase were obtained by simple washing of E. coli cells with non-ionic detergent and were then immobilized in agar/TiO2 beads. In this way, bioenergy beads charged with polyphosphate were obtained that act as a rechargeable supply of adenosine/nucleoside triphosphates (ATP/NTP), a practical tool for the synthesis of artificial receptors. In conclusion, artificial cells have been described as unique in vitro platforms for studying fundamental principles of biochemical pathways. Indeed, they have been used to measure differences in the expression and stochastic variation of gene circuits caused by encapsulation. To create predictive models of artificial cells, existing design tools for gene circuits could be integrated with models of liposomes. Whole-artificial-cell models could be used to predict the response of artificial cells to osmotic pressure and to understand plausible co-regulation of system dynamics by membranes and gene circuits. Computational models of artificial cells could unite chemical and biological theory, combining the defined and predictable nature of in vitro reactions with the robust and sensitive qualities of natural cells. To date, however, computational tools for modeling artificial cell systems have not been established.
The computational toolbox for cell-free synthetic biology could be developed using two sources of models.
Source 1: physical models of single cellular components can be created from first principles, which directs attention to tools that predict the structure and dynamics of particular components. The primary bottleneck to consistent high-resolution de novo structure prediction for small proteins appears to be conformational sampling. By combining improved low- and high-resolution conformational sampling methods, improved atomically detailed potential functions that capture the jigsaw puzzle-like packing of protein cores, and high-performance computing, high-resolution structure prediction can be achieved for protein domains of fewer than 85 residues. The design and evolution of a strategy to change the catalytic activity of an existing protein scaffold was reported. The approach involved the simultaneous incorporation and adjustment of functional elements through insertion, deletion, and substitution of several active site loops, followed by point mutations to fine-tune the activity. Thus, the enzyme glyoxalase II was redesigned to lose its original catalytic action and instead carry a functional beta-lactamase domain, which conferred antibiotic resistance on bacteria that carried the modified protein. The resulting enzyme, evolved metallo beta-lactamase 8, completely lost its original activity and instead catalyzed the hydrolysis of cefotaxime, increasing the resistance of E. coli to cefotaxime by 100-fold. The structural features of the chaperonin cage that are crucial for rapid folding of encapsulated proteins were explored. GroEL and GroES form a chaperonin nano-cage in which proteins of up to approximately 60 kDa fold in isolation. Small proteins (approximately 30 kDa) folded more rapidly as the size of the cage was gradually reduced, up to a point where restriction of space slowed folding dramatically. For larger proteins (approximately 40-50 kDa), either expanding or reducing the cage volume decelerated folding.
The authors suggest that, through the combination of these features, the chaperonin cage provides a physical environment optimized to catalyze the structural annealing of proteins with kinetically complex folding pathways. The creation of novel enzymes capable of catalyzing any desired chemical reaction has been a grand challenge for computational protein design. Two novel algorithms for enzyme design that employ hashing techniques allowed large numbers of protein scaffolds to be searched for optimal catalytic site placement. An in silico benchmark was described, based on the recapitulation of the active sites of native enzymes, which allows rapid evaluation and testing of enzyme design methodologies. These methods can be applied directly to the design of new enzymes, and the benchmark provides a powerful in silico test for guiding improvements in computational enzyme design. The structural dynamics of proteins were characterized at an atomic level of detail for two fundamental processes in protein dynamics: protein folding and conformational change. A 1-millisecond simulation of the folded protein BPTI revealed a small number of structurally distinct conformational states whose reversible interconversion was slower than local relaxations within those states by a factor of 1000. A great challenge in molecular biology has been to understand the process by which proteins fold into their characteristic three-dimensional structures. In line with this, the results of atomic-level molecular dynamics simulations, over periods ranging between 100 μs and 1 ms, revealed a set of common principles underlying the folding of structurally diverse proteins. Early in the folding process, the protein backbone adopts a native-like topology while certain secondary structure elements and a small number of non-local contacts form.
In most cases, folding follows a single dominant route in which elements of the native structure appear in an order highly correlated with their tendency to form in the unfolded state. In the simulations conducted, the proteins, representing all three major structural classes, spontaneously and repeatedly folded to their experimentally determined native structures. Computational protein design is becoming a powerful tool for tailoring enzymes to specific biotechnological applications. When applied to existing enzymes, computational redesign makes it possible to obtain orders-of-magnitude improvements in catalytic activity towards a new target substrate. Computational methods also allow the design of completely new active sites that catalyze reactions unknown in biological systems. Compared with established protein engineering methods such as directed evolution and structure-based mutagenesis, computational design allows much larger jumps in sequence space, for example by introducing more than a dozen mutations in a single step or by introducing loops that provide new functional interactions. Recent advances in the computational design toolbox include new backbone redesign methods and the use of molecular dynamics simulations to improve the prediction of the catalytic activity of designed variants, which further enhance the use of computational tools in enzyme engineering.
Source 2: the advent of systems biology created a wide range of mathematical models for predicting the system dynamics of natural cells. Gene expression in bacteria depends not only on specific regulatory mechanisms but also on bacterial growth, through growth-dependent global effects such as the abundance of RNA polymerases and ribosomes. The observed growth-rate dependence of constitutive gene expression can be explained by a simple model using the measured growth-rate dependence of these cellular parameters. More complex growth dependencies for genetic circuits involving activators, repressors, and feedback control were analyzed and verified experimentally with synthetic circuits. This mechanism can promote the acquisition of important physiological functions such as antibiotic resistance and tolerance. Incorporating kinetics and regulation into stoichiometric models gives rise to mass action stoichiometric simulation (MASS) models: dynamic network models can be constructed in a scalable manner using metabolomic data mapped onto stoichiometric models. Enzymes and their various functional states are represented explicitly as compounds, or nodes, in a stoichiometric network. The feasible construction of MASS models represents a practical means to increase the size, scope, and predictive capabilities of dynamic network models in cell and molecular biology. A strategy was reported for the accurate prediction of metabolic fluxes by flux balance analysis (FBA) combined with systematic and condition-independent constraints that restrict the achievable flux ranges of grouped reactions by genomic context and flux-converging pattern analyses. Analyses of three types of genomic context (conserved genomic neighborhood, gene fusion events, and co-occurrence of genes across multiple organisms) were performed to propose groups of fluxes that are expected to be on or off simultaneously.
The flux ranges of these grouped reactions were constrained by flux-converging pattern analysis. FBA of the E. coli genome-scale metabolic model was carried out under several different genotypic and environmental conditions, resulting in flux values that were in good agreement with experimentally measured fluxes. This strategy might therefore be useful for accurately predicting the intracellular fluxes of large metabolic networks, which are difficult to determine experimentally. The rate of cell proliferation and the level of gene expression in bacteria are intimately entwined. Elucidating these relations is important both for understanding the physiological functions of endogenous genetic circuits and for designing robust synthetic systems. A study revealed intrinsic constraints governing the allocation of resources toward protein synthesis and other aspects of cell growth. A theory incorporating these constraints can precisely predict how cell proliferation and gene expression affect one another, quantitatively accounting for the effect of translation-inhibiting antibiotics on gene expression and the effect of unnecessary protein expression on cell growth.
These tools have been used to describe diverse biological functions, including somitogenesis, T-cell antigen discrimination, and heterogeneous vesicle formation, among others. An example of auto-inhibition with transcriptional delay is a simple mechanism proposed for the zebrafish somitogenesis oscillator. In zebrafish, two linked oscillating genes, her1 and her7, which code for inhibitory gene regulatory proteins, were especially implicated in the genesis of the oscillations, while Notch signaling appeared necessary for the synchronization of adjacent cells. Mathematical simulation showed that direct auto-repression of her1 and her7 by their own protein products provided a mechanism for the intracellular oscillator, with a period determined by the transcriptional and translational delays. Although such mechanisms are simple, mathematics is needed to understand them. No existing model had accounted for absolute distinction between closely related T cell receptor (TCR) ligands while also preserving the other canonical features of T-cell responses. The unexpectedly highly amplified and digital nature of extracellular signal-regulated kinase (ERK) activation in T cells was reported. Based on this observation and on evidence that competing positive- and negative-feedback loops contribute to TCR ligand discrimination, a new mathematical model of proximal TCR-dependent signaling was constructed. The model made clear that competition between a digital positive feedback based on ERK activity and an analog negative feedback involving SH2 domain-containing tyrosine phosphatase (SHP-1) was critical for defining a sharp ligand-discrimination threshold while preserving a rapid and sensitive response.
The combination of these findings with the experiments performed revealed that ligand discrimination by T cells is controlled by the dynamics of competing feedback loops that regulate a high-gain digital amplifier, which is itself modulated during differentiation by alterations in the intracellular concentrations of key enzymes. The generation of non-identical compartments in vesicular transport systems can be explained by mathematical modeling, which shows that a minimal system, in which the basic variables are cytosolic coats for vesicle budding and membrane-bound soluble N-ethylmaleimide-sensitive factor attachment protein receptors (SNAREs) for vesicle fusion, is sufficient to generate stable, non-identical compartments. The stable steady state results from a balance between autocatalytic SNARE accumulation in a compartment and the distribution of SNAREs between compartments by vesicle budding. The resulting nonhomogeneous SNARE distribution generates coat-specific vesicle fluxes that determine the size of compartments. The distinction between forward and reverse modeling was pointed out, with a particular focus on the former. Rather than the mathematical procedures behind different varieties of models, the focus was on their logical structure, in terms of assumptions and conclusions. A model is a logical machine for deducing the latter from the former. If models are based on fundamental physical laws, then it may be reasonable to treat the model as ‘predictive’, in the sense that it is not subject to falsification and we can rely on its conclusions. However, at the molecular level, models are more often derived from phenomenology and conjecture. In this case, the model is a test of its assumptions; it must be falsifiable, and it yields biological insights.
Because these tools describe interactions between many biological components, together with the emergent dynamics that arise from the complex relationships between them, the question is whether they can be integrated into the modeling of complex cell-free systems. Drawing on experimentally validated computational tools created for both in vivo and in vitro systems, and with the aim of building biological components, synthetic biologists suggest bridging the gap between the understanding of complex biological networks and the main biochemical processes by comparing modeling algorithms for both systems. Lewis et al. (2014) have recently proposed a framework for synthetic biologists to build novel artificial cellular systems and to identify underserved research areas for computational model development (Figure 1). The computational tools described by this research group establish a mathematical comparison between in vivo and in vitro biological phenomena. As the field of biology becomes increasingly quantitative, in vitro reactions remain a powerful tool for biologists, and the plasticity of cell-free systems for testing model predictions under minimal conditions is of particular relevance. This research group envisages that studies of cell-free and in vivo synthetic systems will reveal cryptic non-genetic factors, network structures, and spatial organization of cellular components that may modulate the robustness of synthetic biological systems.
A broad range of models is available to synthetic biologists, and some areas of potential growth have been identified for researchers interested in developing tools for cell-free systems.
1-Deterministic models. These models usually consist of differential equations that predict the kinetics of a biological network from its initial conditions and the past dynamics of the system. They have been used to reproduce synthetic gene networks, including inverters, switches, band-pass filters, multi-cellular networks, and oscillators, and have also been applied to simulate the behavior of tumor-invading bacteria, prokaryotic circuits capable of artificial analog computation, and a transcriptional oscillator [149-151]. These models make use of Michaelis-Menten equations to describe each chemical reaction.
For the modeling of in vivo systems, a baseline level of expression is typically included to model the leaky activity of promoters. By contrast, cell-free systems have well-defined parameters, easily controlled inputs, and fewer unknown interactions. Thus, cell-free systems may be simulated more precisely than in vivo reactions using deterministic models. These cell-free systems can perform many of the same functions as natural organisms, with circuits including oscillators, switches and logic elements [152,38,153]. The construction of computational models of in vitro systems can also provide insights into the effects of network architecture on the dynamic behavior of genetic circuits. Earlier work showed that biological pathways can achieve oscillatory behavior via bistable, hysteretic loops, and demonstrated in vitro that these mechanisms could be used in living systems to control the transition to the mitotic phase in embryogenesis. Subsequent research into the modeling of synthetic in vitro transcriptional oscillators was used to determine the optimum system parameters required for continuous circuit behavior. Later, the same model was applied to simulate the behavior of an in vitro oscillator after compartmentalization in emulsion droplets and was found to represent accurately the trend observed in individual encapsulated circuits. Models of in vitro systems are also useful for exploring the impact of biological phenomena that are absent in reconstituted systems. For example, molecular crowding has been included in models of gene expression to explain some of the desired properties of biological systems. It was demonstrated that molecular crowding, induced either by dextran as a crowding agent or by coacervation of encapsulated circuits, greatly increases the expression rate and total protein production of in vitro systems [65,66].
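To make the idea of a deterministic model concrete, the sketch below integrates a two-equation model of constitutive expression (mRNA and protein) with a forward-Euler scheme. The equations, rate constants and function name are assumptions for illustration and do not reproduce any of the published circuit models cited above.

```python
def simulate_expression(k_tx, k_tl, d_m, d_p, t_end, dt=0.01, m0=0.0, p0=0.0):
    """Forward-Euler integration of a minimal expression model:

        dm/dt = k_tx - d_m * m        (transcription, mRNA decay)
        dp/dt = k_tl * m - d_p * p    (translation, protein decay)

    All rates are in arbitrary, assumed units.
    Returns a list of (time, mRNA, protein) tuples.
    """
    if dt <= 0 or t_end < 0:
        raise ValueError("dt must be > 0 and t_end must be >= 0")
    steps = int(round(t_end / dt))
    m, p = m0, p0
    trajectory = [(0.0, m, p)]
    for i in range(1, steps + 1):
        dm = k_tx - d_m * m
        dp = k_tl * m - d_p * p
        m += dt * dm
        p += dt * dp
        trajectory.append((i * dt, m, p))
    return trajectory


if __name__ == "__main__":
    traj = simulate_expression(k_tx=2.0, k_tl=1.0, d_m=0.2, d_p=0.05, t_end=400.0)
    t, m, p = traj[-1]
    # Analytical steady state: m* = k_tx / d_m, p* = k_tl * m* / d_p
    print(t, m, p)
```

In a cell-free setting the degradation terms might be replaced by resource depletion, and in vivo a leak term could be added to `k_tx`; both are left out here to keep the sketch short.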
2-Stochastic modeling. To study the effect of random fluctuations, stochastic models of cellular processes can be formulated from the master equation. In in vivo systems, noise arises from intrinsic and extrinsic factors. Extrinsic noise is the variation caused by the heterogeneous distribution of reactants within a system, whereas intrinsic noise is the variation caused by the discrete nature of small-scale chemical reactions. Both classes of noise have a profound impact on biological systems: during replication, through the variation observed at the small reaction volumes within a cell, and through bursts of translation caused by limited transcriptional activity [156-158]. Stochastic models have been applied to understand the sporulation dynamics of B. subtilis, the robustness of a genetic circuit in response to divergent environmental conditions, exoprotease levels in bacterial populations, and the control of bacterial population composition with a gene circuit [159-162]. Although the minimal nature of in vitro systems might be expected to simplify the development of computational models, this same minimalism means that in vitro systems lack the intrinsic mechanisms of natural cells that could promote robust behavior. The absence of these mechanisms could increase the sensitivity of in vitro systems to non-genetic factors, including partial degradation products, stochastic variation at femtoliter volumes, and molecular crowding [37,156,66]. Additionally, in vitro systems lack cellular infrastructure, sub-cellular compartments, transport proteins, and a replication cycle, which complicates the application of computational tools developed for natural cells to in vitro systems and calls for the development of stochastic models to predict and control noise in cell-free systems.
In cell-free expression, the process of encapsulation can be simulated using stochastic models. It was reported that, during the compartmentalization of the PURE system in small liposomes, the distribution of reactants between compartments does not follow a Poisson distribution, contrary to what had been previously described. On the other hand, for in vitro systems encapsulated in larger liposomes, the resulting reactant concentrations were predicted by a stochastic model following the Gillespie algorithm. Another study of stochastic variation in an encapsulated in vitro system detailed the behavior of a compartmentalized transcriptional oscillator. Although the variable performance of the circuit within an emulsion might be expected to result from intrinsic noise acting stochastically at small volumes, the model of the reaction demonstrated that intrinsic noise was insufficient to describe the variability exhibited by the system; instead, the dominant cause of the deviation from the deterministic model was more probably extrinsic noise caused by the heterogeneous distribution of reactants within the emulsion. This discrepancy from the deterministic model was also reported during replication, when cytoplasmic components are not equally distributed between daughter cells [157,165,152]. The significant effect of extrinsic noise on this minimal system suggests that reactant distribution is an important factor in encapsulated in vitro reactions, one that may be overlooked when considering the sources of stochastic variation within in vivo systems [157,165,152]. Molecular crowding increases expression levels in vitro, and the study of molecular crowding revealed how molecular distribution can affect stochastic variation in vitro. Stochastic models of in vitro systems have also shown decreased variation of gene expression rates under molecular crowding conditions [166,65,66].
3-Exploratory models. This type of model is used to guide the design of biological circuits.
Imitating strategies from the engineering disciplines, the field has moved toward automated biological design, in which known modules are combined into more complicated architectures. First, an automated design algorithm registers a library of biochemical parts with defined kinetic parameters and interactions described as ordinary differential equations. Next, the algorithm selects certain parts and arranges them into motifs [169-173]. Exploratory models can also be used to analyze the effect of intrinsic noise, extrinsic noise, and variation of kinetic parameters on synthetic genetic machinery [174,160]. These exploratory models have been used to design in vivo pathways, such as a Boolean network of transcriptional switches implemented in yeast, a multiplexer circuit in E. coli, and an inducible bi-stable system of fluorescent reporters in mammalian cells [173,175,170]. Transcriptional networks in vivo and in vitro have the same circuit architecture and basic components. The parameters used by these automatic genetic design programs are actually determined in vitro, which should make the design of in vitro circuits more accurate than that of in vivo circuits. On the other hand, genetic design programs optimized for in vivo conditions may not account for the chemical conditions experienced by in vitro expression systems. It was reported that this type of model for in vitro pathways could considerably speed up the assembly of cell-free circuits and provide platforms for testing hypotheses about how complex processes such as self-repair and proofreading influence the dynamical behavior of synthetic circuits. In addition, these automatic in vitro network assemblers could also form the fundamental tools for creating an integrated model of artificial cells.
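The two steps described above (registering a parts library with kinetic parameters, then selecting and arranging parts into motifs) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the part names R1 to R3, their parameters, the steady-state repressor function, and the scoring rule are all hypothetical and do not come from any of the cited design tools.

```python
from itertools import permutations

# Hypothetical parts library: each repressor-controlled promoter is described by
# a maximal output (beta), a repression threshold (K), a Hill coefficient (n)
# and a basal leak. Values are invented for illustration.
PARTS = {
    "R1": dict(beta=10.0, K=1.0, n=2, leak=0.1),
    "R2": dict(beta=5.0, K=0.5, n=4, leak=0.05),
    "R3": dict(beta=20.0, K=5.0, n=1, leak=0.5),
}


def inverter(x, beta, K, n, leak):
    """Steady-state output of a repressible promoter given repressor level x."""
    return leak + beta / (1.0 + (x / K) ** n)


def cascade(x, names):
    """Feed the output of each part into the next one, in order."""
    for name in names:
        x = inverter(x, **PARTS[name])
    return x


def rank_designs(low=0.01, high=100.0, depth=2):
    """Enumerate every ordered arrangement of `depth` distinct parts and rank
    them by the ratio of the output at high input to the output at low input."""
    scored = []
    for names in permutations(PARTS, depth):
        out_low, out_high = cascade(low, names), cascade(high, names)
        scored.append((out_high / out_low, names))
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    for ratio, names in rank_designs():
        print(" -> ".join(names), "dynamic range:", round(ratio, 2))
```

Exhaustive enumeration is used here only because the library is tiny; for realistic libraries the search space grows combinatorially, and a full design tool would simulate the dynamics rather than score a steady-state ratio.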
4-Molecular dynamics models. These models are categorized by their level of detail as all-atom (AA) or coarse-grained (CG). The degree of detail addressed by each algorithm is determined by its force field, a set of mathematical functions and parameters that describes the interactions between the molecules that make up the model. Different force fields, such as CHARMM, GROMOS, AMBER, and MARTINI, are described and compared in detailed reviews [176-178]. The choice between an AA and a CG model depends on the context of the research study. When detailed atom-atom interactions are not required, CG models are appropriate; the computational cost must also be taken into account. However, as computational hardware and software continue to progress, it is becoming possible to use AA models to describe dynamics over longer time ranges.
All-atom models. AA models are useful tools in lipid membrane simulation, where every atom of the solute and solvent in the system is explicitly simulated. Thus, when applied to the simulation of lipid bilayers, these models can provide fine details at the molecular level. AA models are often limited to small-scale simulations because of their computational cost [176,178]. This type of model has been applied to simulate membrane defects induced by an electric field and by pore-forming agents [180-182]. An AA model illustrated that pore formation and closure could be induced by an ionic charge imbalance.
Coarse-grained models. CG models are simpler and contain fewer details than AA models. These models represent groups of atoms as “beads”, which reduces the resolution of the simulation and decreases the computer resources required relative to AA models [184,176]. As a result, CG models are preferable for simulating large-scale dynamics in which atomic details may not be critical, for example lipid phase behaviors such as phase separation and phase transition [185,186].
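The idea of replacing groups of atoms with beads can be illustrated with a short Python sketch that maps atomic coordinates onto beads placed at the center of mass of each group. The atom list, the grouping, and the function names are hypothetical; real coarse-grained force fields define their own mapping rules and bead interaction parameters, which this sketch does not attempt to reproduce.

```python
def center_of_mass(atoms):
    """Mass-weighted mean position of a list of (mass, (x, y, z)) atoms."""
    total_mass = sum(mass for mass, _ in atoms)
    return tuple(
        sum(mass * pos[axis] for mass, pos in atoms) / total_mass
        for axis in range(3)
    )


def coarse_grain(atoms, mapping):
    """Map an all-atom structure onto beads.

    atoms   : list of (mass, (x, y, z)) tuples
    mapping : dict of bead name -> list of atom indices assigned to that bead
    Returns a dict of bead name -> bead position (center of mass of its atoms).
    """
    return {
        bead: center_of_mass([atoms[i] for i in indices])
        for bead, indices in mapping.items()
    }


if __name__ == "__main__":
    # Invented four-atom fragment (masses in atomic mass units, positions in nm).
    atoms = [
        (12.0, (0.00, 0.00, 0.00)),
        (12.0, (0.15, 0.00, 0.00)),
        (12.0, (0.30, 0.00, 0.00)),
        (16.0, (0.45, 0.00, 0.00)),
    ]
    mapping = {"tail": [0, 1], "head": [2, 3]}  # assumed grouping
    print(coarse_grain(atoms, mapping))
```

Four atoms become two beads here, so the number of interacting particles is halved; the same reduction applied to a full bilayer and its solvent is what makes the longer simulations mentioned above affordable, at the price of lost atomic detail.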
Phase changes can alter the mechanical properties of membranes, such as fluidity and rigidity. Thus, phase behavior may need to be considered when designing the Shell to achieve certain mechanical properties. CG models have also been used to study interactions between lipid bilayers and other molecules. In a study of the effect of nanoparticles on the fluid-gel transformation of lipid bilayers, nanoparticles were shown to induce local disorder in a lipid bilayer and to delay its transformation from the fluid to the gel state. Other computational studies have shown that amphiphilic nanoparticles and nanotubes interact with lipid membranes to form controllable pores and channels [187,188]. In some cases, CG models were first used to perform long time-scale simulations, which were then switched to AA models by mapping “beads” back to single atoms. A large number of simulation tools focus on the dynamics of the lipid membrane itself. So far, models integrating the Shell and the Processor/Output modules have not been established. The main obstacle to this integration is the difficulty of connecting the physical concepts used in membrane modeling with the chemical dynamics used in transcription/translation modeling. It is worth mentioning that the models described here consider only the atomistic scale of membrane dynamics, whereas integrated simulation will be necessary for predicting the dynamics of artificial cells. Beyond the atomistic scale, mesoscopic models (about 0.1-10 μm), in which individual molecules are coarse-grained into single fluid volumes, are potential options for the simulation of lipid bilayers. Some studies have also combined AA and CG models to perform long yet finely resolved simulations [190,178].
Hybrid models. In all the models mentioned, atomistic-scale information is obtained and then “transformed” into lower-resolution representations to achieve simulations at larger time and/or length scales.
A framework that attempts to combine physical and chemical methods to simulate cellular functions has also been developed.
Interestingly, experts predict that computational modeling of the interactions between lipid membranes and the transcription/translation machinery will provide unique insights into the robustness of gene expression and enhance the capacity to control artificial cells.
Parameters used in models. The difficulty of modeling in vivo systems stems from the context-dependency of reaction parameters. Although a wide diversity of equations describes the behavior of synthetic biological systems, the parameters of these equations are generally unknown. Databases for obtaining enzymatic reaction constants include KEGG, BRENDA, SABIO-RK, and ExPASy. BioNumbers has also collected measurements of biological systems and has been used in the modeling of a yeast-bacteria ecosystem, in a predictor of anti-microbial protein efficacy, and in a computational representation of distributive metabolic networks [196-198]. The kinetic constants of biological molecules used in modeling in vivo systems are often measured in vitro; however, the measurement conditions may not reflect the pH or molecular crowding experienced by those molecules in natural cells. In contrast, kinetic constants quantified in vitro can be applied directly to cell-free reactions, yielding models with high precision and predictive power.
The engineering of complexity and the refactoring of cell capabilities were recently reviewed in detail, on the grounds that synthetic biology is at a critical moment: the initial fervor over its main achievements is giving way to a deeper understanding of the complexity of biological systems, which is needed to make significant progress in applying design principles to living organisms. A recent review considered the role of synthetic biology in supporting biosensor technology, reflecting on the features that make it a useful tool for designing and constructing engineered biological systems for sensing applications and reporting examples from the literature. Another review describes the current therapeutic delivery tools, the limitations that hamper their use in human applications, and the biological tools and strategies at the vanguard of synthetic biology, discussing their potential to advance the specificity, efficiency, and safety of the current generation of cell and gene therapies, including how they can be used to confer curative effects beyond those of conventional therapeutics. In addition, recent discoveries in the understanding of extracellular electron transfer pathways and the creation of customized and novel exoelectrogens for biotechnological applications were summarized. Engineering efforts to increase current production in native exoelectrogens, as well as efforts to create new exoelectrogens, were described. These approaches will continue to expand, and genetically modified organisms will continue to improve the outlook for microbial electrochemical technologies as genetic tools develop. A new perspective on the combinatorial biosynthesis of natural products, which could reinvigorate drug discovery by using synthetic biology in combination with synthetic chemistry, was also recently described.
Lately, approaches that use computational modeling of synthetic biology perturbations to analyze endogenous biological circuits have been developed, with a particular focus on signaling and metabolic pathways. A bottom-up approach was reported in which ordinary differential equations were constructed to model the core interactions of a pathway of interest. Methods for modeling synthetic perturbations that can be used to investigate properties of the natural circuit, as well as experimental methods for constructing synthetic perturbations to test the computational predictions, have been discussed in detail. In particular, a case study of the p53 tumor-suppressor pathway was presented, illustrating the process of modeling the core network, designing informative synthetic perturbations in silico, and testing the predictions in vivo. In addition, in response to the demand for accurate quantification of the expression of genes of interest in synthetic and systems biotechnology, a quantitative method based on flow cytometry and a super-folder green fluorescent protein was developed for the first time to measure expression at single-cell resolution in Streptomyces. This work presents a quantitative strategy and universal synthetic modular regulatory elements, which will facilitate the functional optimization of gene clusters and the drug discovery process in these organisms. Finally, alternative strategies are being sought as antibiotic therapies become obsolete due to bacterial resistance. Mathematical models and simulations already guide the development of complex technologies such as aircraft, bridges, and communication and transportation systems. In the same way, models that span multiple molecular and cellular scales can guide the development of new antibiotic technologies.
In systems biology, developing mechanistic models has become an integral aspect of the field, as has the need to differentiate between alternative models. “Parameterizing” mathematical models has been widely perceived as a challenge, which has spurred the development of statistical and optimization routines for parameter inference. However, the focus is now increasingly shifting to problems that require synthetic biologists to choose from among a set of different models to determine which one offers the best description of a given biological system. In particular, approaches that are both practical and built on solid statistical principles are favored for application in systems biology. Finally, computer simulation allows researchers to accelerate the pace at which scientific questions are addressed and to build a common framework for designing biological networks. In vitro reactions remain a powerful tool for experimental biologists, and as the field of biology becomes ever more quantitative, it is important to take advantage of the plasticity of cell-free systems to test model predictions under simplified and minimal conditions. Experts envisage that studies of cell-free and in vivo synthetic systems will reveal cryptic nongenetic factors, network structures, and spatial organization of cellular components capable of modulating the robustness of synthetic biological systems.
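The model-selection problem mentioned above, choosing which of several candidate models best describes a data set, can be illustrated with a small Python sketch that compares two models using the Akaike information criterion (AIC). AIC is used here only as one familiar, statistically grounded criterion; it is an assumption of this sketch and not necessarily the approach taken in the cited work. The data points and function names are invented for illustration.

```python
import math


def fit_constant(xs, ys):
    """Least-squares fit of y = c. Returns (model function, number of parameters)."""
    mean = sum(ys) / len(ys)
    return (lambda x: mean), 1


def fit_linear(xs, ys):
    """Least-squares fit of y = a + b*x. Returns (model function, number of parameters)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return (lambda x: intercept + slope * x), 2


def aic(xs, ys, fit):
    """AIC for Gaussian residuals, up to an additive constant: n*ln(RSS/n) + 2k."""
    model, k = fit(xs, ys)
    rss = sum((y - model(x)) ** 2 for x, y in zip(xs, ys))
    n = len(xs)
    return n * math.log(rss / n) + 2 * k


if __name__ == "__main__":
    # Invented measurements, e.g. reporter signal versus inducer dose.
    xs = [0, 1, 2, 3, 4, 5, 6, 7]
    ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0, 12.9, 15.2]
    scores = {"constant": aic(xs, ys, fit_constant), "linear": aic(xs, ys, fit_linear)}
    print(scores, "preferred:", min(scores, key=scores.get))
```

The model with the lower AIC is preferred; the 2k term penalizes additional parameters, so a more complex model is chosen only when it improves the fit enough to justify them.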
Interestingly, mathematical models of biological systems often take the form of chemical reaction networks. The relevance of stochasticity and the methods used to simulate stochastic reaction networks have recently been reviewed. It is worth mentioning that the master equation is a complete model of randomly evolving molecular populations. In this sense, a closure scheme solution has recently been presented for the master equation of chemical reaction networks. Thus, a wide range of experimental observations of biomolecular interactions might be mathematically conceptualized. The authors anticipate that models based on this closure scheme might assist in rationally designing synthetic biological systems.
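As a concrete example of simulating a stochastic reaction network, the sketch below applies the Gillespie stochastic simulation algorithm to the simplest network governed by a master equation: a molecule produced at a constant rate and degraded in proportion to its copy number. The rate constants, function name, and fixed random seed are illustrative assumptions; the closure-scheme method mentioned above is a different technique and is not implemented here.

```python
import random


def gillespie_birth_death(k=10.0, gamma=1.0, t_end=2000.0, seed=0):
    """Gillespie simulation of production (rate k) and degradation (rate gamma * n).

    Returns the time-averaged copy number over [0, t_end]. For this network the
    stationary distribution of the master equation is Poisson with mean k / gamma.
    """
    rng = random.Random(seed)
    t, n = 0.0, 0
    weighted_sum = 0.0  # integral of n(t) dt
    while True:
        a_birth = k             # propensity of production
        a_death = gamma * n     # propensity of degradation
        a_total = a_birth + a_death
        dt = rng.expovariate(a_total)  # waiting time to the next reaction
        if t + dt > t_end:
            weighted_sum += n * (t_end - t)
            break
        weighted_sum += n * dt
        t += dt
        # Choose which reaction fires, with probability proportional to its propensity.
        if rng.random() * a_total < a_birth:
            n += 1
        else:
            n -= 1
    return weighted_sum / t_end


if __name__ == "__main__":
    print("time-averaged copy number:", gillespie_birth_death())
    print("deterministic steady state k/gamma:", 10.0 / 1.0)
```

Each run produces one random trajectory, so the time average only approximates k/gamma; repeating the simulation with different seeds, or at smaller copy numbers, shows the fluctuations that a deterministic model cannot capture.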
VGD is a member of the research career of the National Council of Scientific and Technological Research (CONICET) of Argentina and works in the Area of Biochemistry of Proteins and Glycobiology of Parasites in the Research Department of the National Institute of Parasitology “Dr M. Fatala Chaben”, ANLIS-Malbrán, Health Department, Argentina.