Institut Jacques-Monod, CNRS, Universités Paris 6 et Paris, Paris cedex 5, France
Received date: September 12, 2013; Accepted date: February 11, 2014; Published date: February 15, 2014
Citation: Ricard J (2014) Emergence of Information and the Origins of Life: A Tentative Physical Model. Curr Synthetic Sys Biol 2:109. doi: 10.4172/2332-0737.1000109
Copyright: © 2013 Ricard J. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Life and living systems require the existence of information. It is no doubt difficult to understand how this information could have spontaneously emerged in the first living cells. In fact, the spontaneous random association of molecules cannot give birth to information, exactly as the random association of letters has a poor probability of generating a word. The idea developed in this paper is that the information required for the building up of the “primordial cell” does not originate from the association of molecules xi, with probability of occurrence p(xi), but from the association of molecules xi possessing conditional probabilities p(xi|yj,yk,...). As will be shown in the paper, such a situation implies that molecules with conditional probabilities p(xi|yj,yk,...) do not associate randomly but follow an order defined by the nature of these conditional probabilities. As a conditional information, h(xi|yj,yk,...), is associated with the corresponding conditional probability of occurrence, it follows that a sequence of conditional probabilities possesses a global information. Moreover, the system can spontaneously generate information if h(xi|yj,yk,...) > h(xi). Hence the global information could have been generated through the “coupling” of many of these individual mathematical uncertainties. The problem discussed in the present paper is not the transfer of information from a source to a destination in a communication channel, which has been extensively studied, but precisely its converse: the emergence of new information in a system.
Origins of life; RNA; DNA; Gibbs-Boltzmann statistics; Molecular biology
Life, as we know it today, requires the existence of information. Many biological macromolecules, ribonucleic acids and proteins, are the expression of this information stored in deoxyribonucleic acids. Such a situation raises difficult problems when considered in the context of the origins of life on earth. As a matter of fact, living organisms require the existence of information and, conversely, in the context of the origin of life, information requires the existence of living systems. In order to avoid this vicious circle, one has to imagine an alternative hypothesis, for instance that the information required to explain the origin of the first living systems originates from purely physical-chemical reactions. The aim of the present paper is to discuss the hypothesis where biological information is generated through purely chemical reactions.
If a material entity, say a molecule xi, has a probability of occurrence p(xi), its information, h(xi), is defined as
This expression reduces to
if p(xi) is small. The lower this probability, the larger the information brought about by the occurrence of the event. If two events, xi and yj, are mutually independent, the probability of occurrence of the couple xi, yj is
$$p(x_i, y_j) = p(x_i)\,p(y_j)$$
If xi and yj are not independent, for instance because xi interacts with yj or conversely, one has instead
$$p(x_i, y_j) = p(x_i)\,p(y_j \mid x_i) = p(y_j)\,p(x_i \mid y_j)$$
In this expression, p(yj|xi) and p(xi|yj) are conditional probabilities.
The information corresponding to such a situation is then
$$h(x_i, y_j) = h(x_i) + h(y_j \mid x_i) = h(y_j) + h(x_i \mid y_j)$$
As in classical communication theory, one can define a new function i(xi : yj), which may be called the elementary mutual information. One has
$$i(x_i : y_j) = h(x_i) + h(y_j) - h(x_i, y_j)$$
Substituting h(xi, yj) by its own expression (4), one finds
$$i(x_i : y_j) = h(x_i) - h(x_i \mid y_j) = h(y_j) - h(y_j \mid x_i)$$
Hence, this simple system will generate information if
$$h(x_i \mid y_j) > h(x_i)$$
Under these conditions one has i(xi : yj) < 0. If, alternatively,
$$h(x_i \mid y_j) < h(x_i)$$
then one has i(xi : yj) > 0. If i(xi : yj) < 0, there is emergence of information in the system. If, conversely, i(xi : yj) > 0, there is conduction of information within the same system. It is the latter situation that prevails in classical communication theory [1-6].
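The sign test on the elementary mutual information can be illustrated numerically. The sketch below is only an illustration: it assumes that the information of an event is h = -ln p and that i(xi : yj) = h(xi) - h(xi|yj), as in classical communication theory. The function names and the probabilities are hypothetical.

```python
import math


def information(p: float) -> float:
    """Information of an event of probability p, assumed here to be -ln p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("a probability must lie in (0, 1]")
    return -math.log(p)


def elementary_mutual_information(p_x: float, p_x_given_y: float) -> float:
    """i(x:y) = h(x) - h(x|y), the classical pointwise definition (assumed)."""
    return information(p_x) - information(p_x_given_y)


def regime(i_xy: float, tol: float = 1e-12) -> str:
    """Classify the sign of i(x:y) with the vocabulary used in the text."""
    if i_xy < -tol:
        return "emergence"    # h(x|y) > h(x)
    if i_xy > tol:
        return "conduction"   # h(x|y) < h(x)
    return "independent"      # h(x|y) = h(x)


# Hypothetical probabilities, chosen only to show both signs.
p_x = 0.2
for p_x_given_y in (0.05, 0.2, 0.6):
    i_xy = elementary_mutual_information(p_x, p_x_given_y)
    print(f"p(x|y) = {p_x_given_y:.2f}  i(x:y) = {i_xy:+.3f}  {regime(i_xy)}")
```

With this convention, a negative value simply means that the interaction makes xi less probable in the presence of yj than in its absence.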
Let us consider the elementary process of information transfer
it implies that information is transferred from x to y within the xy “complex”. The condition that allows the communication channel to be a perfect one is that
equivalent to (13)
Under these conditions there cannot be emergence, but only transfer, of information. This is the classical situation that has been studied in detail by many authors [1-6]. Now let us assume that x and y interact in such a way that
then expression (13) becomes
with δ > 0, and it no longer corresponds to the expression of a perfect communication channel. It describes what occurs in a process of emergence of information. Hence it is implicitly postulated in current communication theory [1-7] that one always has
Even though this assumption is perfectly correct for the communication of a message in a communication channel, the situation can be completely different in the context of emergence of information through molecular interactions. This is precisely what is going to be discussed in this paper.
Relation (1) of a previous Section implies that
Under these conditions one has then
and the product of the right-hand side member (equation 15) is
Expression (13) can be rewritten as
or as (21)
As h(x) < 1, it follows that if
the system generates its own information as molecules x and y interact. Under these conditions, as previously pointed out, δ > 0. We find here the situation already described in Section 1. Alternatively, if
molecules x and y do not tend to associate. Relationship (22) implies that the emergence of information originates from an interaction between x and y.
This reasoning can be generalized to more complex situations, for instance
If xi and yj interact, this equation can be rewritten as
Equation (21) can be generalized to different values of xi and yj. One can see that the sign of Δ depends on the sign of
This expression is analogous to the so-called mutual information of classical information theory and is to be compared to expressions (22) and (23). It follows that if
I(X:Y) > 0 (27)
the system consumes and propagates information, which is the situation expected to occur in a communication channel. Alternatively, if
I(X:Y) < 0 (28)
the system generates information. This situation is due to the possible interaction occurring between xi and yj.
Classical Gibbs-Boltzmann statistics [8,9] allows one to understand the physical nature of the emergence process. Following the Gibbs-Boltzmann theory, the y molecules are distributed among different energy levels according to the classical relationship
$$p(y_j) = \exp\left(-\frac{E_j - E_0}{k_B T}\right)$$
where p(yj) is the probability of occurrence of y molecules on the jth energy level, Ej and E0 are the jth and 0th energy levels, and kB and T are the Boltzmann constant and the absolute temperature, respectively. It follows from this relationship that h(yj) has an important physical meaning that it does not possess in classical communication theory. One has
$$h(y_j) = \frac{E_j - E_0}{k_B T}$$
and it appears that the information of a molecule is directly related to its energy level. It follows from this reasoning that the conditional information, h(xi|yj), has a similar meaning, i.e.
If the interaction between xi and yj is such that
and the interaction between xi and yj generates the “spontaneous” emergence of information (Figure 1)
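The link between energy level and information can be checked with a few numbers. The following sketch rests on two assumptions that are not confirmed by the text as it stands: that the probability is the Boltzmann factor relative to the 0th level, p(yj) = exp(-(Ej - E0)/(kB T)), and that h = -ln p. The temperature and the energy levels are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def occupation_probability(e_j: float, e_0: float, temperature: float) -> float:
    """Assumed form p(y_j) = exp(-(E_j - E_0) / (k_B T)), relative to level 0."""
    return math.exp(-(e_j - e_0) / (K_B * temperature))


def information_from_energy(e_j: float, e_0: float, temperature: float) -> float:
    """Assumed h(y_j) = -ln p(y_j) = (E_j - E_0) / (k_B T), in natural units."""
    return (e_j - e_0) / (K_B * temperature)


T = 300.0        # hypothetical temperature, in kelvin
unit = K_B * T   # one k_B*T of energy, in joules

# Hypothetical energy levels expressed in units of k_B*T above level 0.
for level in (0.0, 1.0, 2.0, 5.0):
    e_j = level * unit
    p = occupation_probability(e_j, 0.0, T)
    h = information_from_energy(e_j, 0.0, T)
    print(f"E_j - E_0 = {level:.0f} k_B T   p = {p:.4f}   h = {h:.2f}")
```

Under these assumptions the information grows linearly with the energy gap, so any interaction that shifts the effective energy level of a molecule also shifts its information.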
The ideas developed in Section 2 offer a general picture of the organization of chemical and biochemical networks, whether they are primitive or not. In the present paper, we are interested in how physical interactions between the elements of a network generate information. If we consider, for instance, the joint probability of occurrence of such a network,
it appears to possess both kinds of information, for it takes account of both the node probabilities and the node interactions. In fact, node interactions define the succession of nodes. It then follows that the information generated through mutual interactions of nodes can be expressed from the difference
The information content expressed by this difference may be defined by the differences between each information and the corresponding conditional information. One has then
One can also notice that if x1, x2, …. interact one can possibly have
It follows from these relationships and from the differences that appear in equations (36) that the production of information by the system is associated with positive values of i(x1: x2: x3: x4). This situation will be obtained if relationships (38) are met. One can conclude from these results that the organization of a network corresponding to the joint probability p(x1, x2, x3,….) relies upon two kinds of factors, i.e. the number of nodes and the way these nodes are connected. This leads us to define the information stored by networks as
The terms h(xi) are the contributions of the nodes to the global information of the system, without taking into account the interactions that may exist between them. The term h(x1, x2, x3,….) expresses how the nodes, and their interactions, contribute to this information. It is then evident that i(x1: x2:...) possesses negative values if the nodes xi interact and if relationships (38) apply.
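The decomposition into node terms and an interaction term can be made concrete with a small computation. This is a sketch under stated assumptions: h = -ln p, the joint probability is written as a chain of conditional probabilities, and the sign convention is i = Σ h(xi) - h(x1, x2, ...). The four-node network and all its probabilities are hypothetical.

```python
import math


def information(p: float) -> float:
    """Information of an event of probability p, assumed here to be -ln p."""
    return -math.log(p)


def joint_information(conditionals: list[float]) -> float:
    """h(x1, x2, ...) as the sum of the conditional informations.

    `conditionals` holds p(x1), p(x2|x1), p(x3|x1,x2), ... so that their
    product is the joint probability of the network (chain rule).
    """
    return sum(information(p) for p in conditionals)


def network_information(marginals: list[float], conditionals: list[float]) -> float:
    """i(x1:x2:...) = sum_i h(x_i) - h(x1, x2, ...), assumed sign convention."""
    node_terms = sum(information(p) for p in marginals)
    return node_terms - joint_information(conditionals)


# Hypothetical four-node network: each interaction halves the probability
# of the next node, so every conditional information exceeds the node one.
marginals = [0.3, 0.4, 0.5, 0.2]        # p(x1), p(x2), p(x3), p(x4)
conditionals = [0.3, 0.2, 0.25, 0.1]    # p(x1), p(x2|x1), p(x3|x1,x2), p(x4|x1,x2,x3)

i_network = network_information(marginals, conditionals)
print(f"i(x1:x2:x3:x4) = {i_network:+.3f}")
```

With this convention the result is negative whenever the conditional informations exceed the node informations; the opposite convention would only flip the sign, not the magnitude.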
Let us consider, for instance, the ideal processes shown in Figure 2. In Figure 2a it is assumed that the intermediate C interacts with A and that D interacts with B. Both C and D play the part of inhibitors. In Figure 2b, C and E are inhibitors of the process. The probability of occurrence of the first network is
Similarly, the process shown in Figure 2b has a probability of occurrence defined by
One can write from equation (40)
It then appears that the system generates information if
A somewhat similar situation is expected to occur for equation (44) and Figure 3. One has
Figure 3: Emergence and storage of information in the two models of Figure 2. (a) Two feedback processes exerted through h(xA|xC) and h(xB|xD). If h(xA|xC) > h(xA) and h(xB|xD) > h(xB), there is emergence of information. (b) The two feedback processes are exerted through h(xA|xC) and h(xC|xE). If h(xA|xC) > h(xA) and h(xC|xE) > h(xC), there is emergence of information.
As previously, emergence of information will be obtained if
The information that may exist in a macromolecule made up of a succession of smaller molecules relies upon the sequence of the elements present in this macromolecule. This is precisely what occurs with “classical” informational biomolecules such as DNA, RNA and proteins. It follows that one cannot expect the random association of different monomers to generate an informational polymer. Hence, one could wonder about the mechanism responsible for the emergence of an “order” in the succession of the elements of such a sequence. In fact, the “position” of an element in the sequence relies upon some of the other elements of this sequence. For instance, the position of xi in the sequence depends upon that of xj and xk. As will be shown below, it is the nature of the conditional information, or of the conditional probability, that contributes to defining the sequence of the multimolecular system that is being synthesized.
Expressions such as h(x1, x2, x3,….), above can possibly be considered a specific information that corresponds to a real, multimolecular, structure. For instance, the two sequences of events presented in Figures 2 and 3 can be expressed by joint probabilities p(A,B,C,D), p(A,B,C,D,E) and their corresponding information
Expressions (48a) and (48b) describe specific sequences of events that define the global information of the corresponding system. Moreover, in this perspective, an expression such as h(xA, xB ,…), can possibly define the information of a multimolecular system.
According to molecular biology, RNA and DNA play the part of “templates” that control the mutual interactions and respective binding of amino acids according to a certain order, which defines the information of the corresponding polypeptide chain. In fact, the spontaneous random association of monomers cannot explain the generation of information. In the case of protein synthesis, information is transferred from DNA to proteins thanks to specific interactions between tRNAs and amino acids. The basic idea developed in the present paper is that information in primordial systems is generated thanks to the properties of conditional probabilities. Thus, for instance, if we consider a population of molecular entities A,B,C,D,E,F defined by their conditional probabilities of occurrence, these entities will form a complex possessing a definite information because its elements possess a definite sequence, namely A,B,C,D,E,F. Moreover, one can conclude that the system thus formed has a “circular” structure (Figures 4 and 5). This situation can be explained if the joint probability that defines the system, p(A,B,C,D,E,F), is equal to the product
Figure 4: The system h(xA,xB,xC,xD,xE) defines the spatial localization of conditional information as a closed geometric structure. This situation is obtained if every element of the system is dependent upon the other elements, as shown in (52). Such a situation is expected to occur in the case of the probability of occurrence of a system defined by equation (51). Information is generated through the interaction between the elements of the system.
It is the very nature of the molecular interactions involved in these conditional probabilities that defines the sequence A,B,C,D,E,F.
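How a set of conditional dependencies can impose a sequence, and possibly a closure, may be sketched as follows. The dependency pattern used here (each entity conditioned on exactly one other entity) is an assumption made for illustration only, and the function name and the dictionaries are hypothetical.

```python
def sequence_from_dependencies(depends_on: dict[str, str], start: str) -> tuple[list[str], bool]:
    """Recover the order of the elements from 'x is conditioned on y' links.

    `depends_on` maps each element to the element it is conditioned on.
    Returns the sequence read from `start` and a flag telling whether the
    chain closes back on `start` (a "circular" structure).
    """
    # Invert the links: for each element, which element does it condition?
    successor: dict[str, str] = {}
    for child, parent in depends_on.items():
        if parent in successor:
            raise ValueError(f"{parent!r} conditions more than one element")
        successor[parent] = child

    sequence = [start]
    seen = {start}
    current = start
    while current in successor:
        following = successor[current]
        if following in seen:
            # The chain returns to an element already placed.
            return sequence, following == start
        sequence.append(following)
        seen.add(following)
        current = following
    return sequence, False


# Hypothetical closed system: B depends on A, C on B, ..., and A on F.
closed_system = {"B": "A", "C": "B", "D": "C", "E": "D", "F": "E", "A": "F"}
print(sequence_from_dependencies(closed_system, "A"))

# Same entities without the link from F back to A: an open chain.
open_system = {"B": "A", "C": "B", "D": "C", "E": "D", "F": "E"}
print(sequence_from_dependencies(open_system, "A"))
```

In this toy setting the order A,B,C,D,E,F is not chosen by the caller: it is read off the dependency links alone, and the closure appears only when the last element conditions the first.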
The corresponding information of the system can then be expressed as
and it is clear that both expressions (49) and (50) represent the global information of a sequence of elements, namely conditional probabilities and conditional informations, respectively.
This reasoning can be applied to the information of the whole system
It follows that
One can realize that these expressions have a definite geometric representation (Figures 4 and 5). This representation displays a “closure”, as required in living systems, which display an “inside” and an “outside”. It is remarkable that Figures 4 and 5 give a geometric representation of the distribution of information within a multimolecular complex (Figure 4) and of how the interactions within the complex affect the information of its elements (Figure 5). Moreover, this system could attract and bind other “potential biomolecules” in a definite order, thus transferring its information to this new macromolecule. Hence it is tempting to assume that nonbiological systems can spontaneously synthesize molecular edifices similar to those shown in Figures 4 and 5. One could speculate that both of them could possess the information required for the building up of the first “living” systems.
The classical solution of the problem of the origins of life raises a number of difficult questions [10-14]. These difficulties are present in many theories, in particular in the theory of the existence of a primitive RNA world in which the first living organisms on earth were solely made up of RNAs. One of the difficulties of this classical vision is that, according to the literature, no one so far has been able to observe the replication of RNAs in the absence of enzymes.
Any process of emergence of a living system implies that the information carried by a macromolecule, or a pluri-molecular system, is already present before the first living organism appears. Today information, in the form of DNA molecules, has an obvious biological origin, so that we need life to define the information of biological systems and we need information to define life. In the context of the problem of the origin of life, there is a kind of circularity in defining living systems through their information and in defining information through living systems.
There have been several attempts at explaining emergence of life from purely chemical events. It is highly probable however that the building up of a “living” system requires the existence of information. We are then faced with the idea that the first living systems were the products of the expression of some information that was not biological in its essence.
There exists today an information theory which is in fact the theory of the communication of a message. The communication of a message is based on the equation
$$I(X : Y) = H(X) - H(X \mid Y) \qquad (54)$$
which is in fact a generalization of equation (26) of the present paper. In expression (54) above, I(X : Y) is the so-called mutual information transferred in a communication channel from a source to a destination. It is then clear that, in this case, I(X : Y) has to be positive, otherwise no information could ever be transferred from a source to a destination. This implies in turn that H(X) is of necessity larger than H(X|Y). If I(X : Y) were negative, this would imply that the system generates its own information. This is impossible in the case of a communication channel, but it is quite possible in the case of a biochemical system involving interacting molecules.
In order for such a system to generate information one has to assume there exists some interaction between xi and yj in such a way that
Under these conditions, the system, which does not play the part of a communication channel, generates its own information in the form of a macromolecular complex that can associate other molecules in a definite order, thus giving birth to a primitive “living” organism. This order is, in fact, defined and imposed by the very nature of the elements of conditional information that constitute the system. It is because the position of xi is defined by the values of yj, which are precisely associated with this xi unit, that the sequence of all the xi possesses an information. As shown above, if the conditional events associate, this association cannot be random but is of necessity ordered. It is, in fact, the conditional character of these events that imposes their sequence as well as the possible “closed character” of the sequence. In the frame of the present theory, i(xi : yj) should adopt negative values. It is then not surprising that physical interactions between xi and yj could generate information.
According to these theoretical considerations, novel interactions between xi and yj take place and the system becomes more and more complex. This increase of complexity, which has a physical origin, follows a number of requirements defined by a mathematical model. It is clear, in such a system, that both the emergence of information and the spatial organization of the system are predicted by equations of purely physical phenomena.
The morphological shape of the system is also predicted by the nature of the equation that describes the process of emergence. In such systems, both the emergence of shape, of a closure of the system for instance, and information are controlled by purely physical events.