Recurrent Solutions to Dynamics Inverse Problems: A Validation of MP Regression

This paper focuses on a new perspective in solving dynamics inverse problems, by considering recurrent equations based on Metabolic P (MP) grammars. MP grammars are a particular type of multiset rewriting grammars, introduced in 2004 for modeling metabolic systems, which express dynamics in terms of finite difference equations. Regression algorithms based on MP grammars have been introduced since 2008 for algorithmically solving dynamics inverse problems in the context of biological phenomena. This paper shows that, when MP regression is applied to time series of circular functions (where time is replaced by the rotation angle), the dynamics that is found turns out to coincide with recurrent equations derivable from the classical analytical definitions of these functions. This validates MP regression as a general methodology to discover the deep logics underlying observed dynamics. At the end of the paper some applications are also discussed, which exploit the regression capabilities of the MP framework for the analysis of periodical signals and for the implementation of sequential circuits providing periodical oscillators.


Introduction
Let us start by considering the following general Dynamics Inverse Problem (DIP). Given a time series X = (X[i] ∈ R^n | i ∈ N_t) of vectors of finite dimension n over the set R of reals, indexed over a subset N_t of natural numbers (time points), find a state function δ: R^n → R^n such that:

X[i] = δ^i(X[0]) for every i ∈ N_t

where δ^i is the i-th iteration of δ, with δ^0(X) = X and δ^{i+1}(X) = δ(δ^i(X)) for every i ∈ N and X ∈ R^n.
In other words, we pose the problem of finding an autonomous discrete dynamical system over R^n, with initial state X[0] and a dynamical function δ, that is able to provide a given time series as the trajectory originated from a given initial point. When such a solution is found (possibly within an approximation threshold), we can say that a given trajectory in R^n is represented in terms of a suitable next-state function defined in the same space. Dynamics inverse problems were at the beginning of modern science, aimed at discovering the motion laws of planets around the sun. In general, a typical problem of mechanics is the determination of motion laws: from the observed motion to the underlying equations, deduced from the knowledge of the forces acting on bodies. The approach we will outline here is similar, but the forces as "causes of motion changes" are not assumed. Rather, we are interested in inferring a possible (approximate) internal logic regulating how (instead of why) changes of variables are cooperatively organized in a given system. This is of course a solution less precise and less explicative than the classical approach (usually based on ordinary differential equations). However, very often, in very complex systems with poor information about the causes acting in a system, it is the only possibility that can be realistically investigated. The dynamics inverse problem was investigated in the context of MP theory, developed in the last ten years (see, for example, [1][2][3][4][5][6]). In this paper, when we refer to MP regression, we mean the (basic) LGSS MP regression algorithm, integrating algebraic and statistical regression methods in the MP framework [7][8][9][10], which proved completely satisfactory in all the considered cases.
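In code, the DIP statement amounts to checking that iterating a candidate next-state function from X[0] reproduces the observed series. The sketch below is purely illustrative (the function and series names are ours): it verifies a candidate δ against a time series within a tolerance.

```python
def solves_dip(series, delta, tol=1e-9):
    """Check that delta generates `series` as the trajectory from series[0]:
    series[i] must equal the i-th iterate of delta applied to series[0]."""
    state = series[0]
    for observed in series:
        if any(abs(s - o) > tol for s, o in zip(state, observed)):
            return False
        state = delta(state)  # next iterate: delta^{i+1}(X) = delta(delta^i(X))
    return True

# Toy example: delta(X) = 2X solves the DIP for the series (1, 2, 4, 8, ...).
series = [(float(2 ** i),) for i in range(10)]
print(solves_dip(series, lambda X: (2.0 * X[0],)))  # True
print(solves_dip(series, lambda X: (3.0 * X[0],)))  # False
```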
Dynamics inverse problems are crucial in many applicative contexts, where solving a DIP means discovering a logic relating the variables of a phenomenon, by passing from an external temporal manifestation to an internal state causation [11][12][13]. Here, we present an MP solution of a mathematical problem related to circular functions: which are the MP recurrent equations discovered by the MP regression from time series of sine and cosine? We will show that the solutions, expressed by MP grammars for sine and cosine, found by means of the LGSS algorithm, correspond to formulae deduced from elementary trigonometric identities or Taylor series. In a sense, this is evidence of the validity of the LGSS algorithm in inferring internal regularities underlying a given dynamics.

An MP grammar G is given by a structure G = (X, R, Φ) [10], where:
• X is a finite set of real variables;
• R is a finite set of rules; each rule r ∈ R is expressed by α_r → β_r, with α_r, β_r multisets over X (functions assigning a multiplicity to every variable);
• Φ is the set of regulators, or flux functions, φ_r : R^n → R for every r ∈ R, where R^n is the set of possible states of X, that is, column vectors of R^n, which we continue ambiguously to denote by X (a state assigns to every variable a real value, in a given order).

Any regulator φ_r associates a flux value u to every state of R^n. A flux u of a rule r establishes an updating of the current state of the variables, by decreasing by u • α_r(x) the value of any variable x occurring in α_r, and by increasing by u • β_r(x) the value of any variable x occurring in β_r. The term grammar is due to two reasons. First, rules α_r → β_r become rewriting rules of Chomsky grammars, considered in Formal Language Theory [14], when α_r and β_r are strings (multisets are strings where the occurrence order is not relevant). Secondly, an MP grammar generates, from its initial state, a sequence of states. Analogously, a Chomsky grammar generates a set of strings from an initial string.
The attribute MP comes from the initial application context suggesting MP grammars [1], which extend Paun's P systems (multiset rewriting rules distributed in compartments) [15][16][17]. Applications in modeling biological systems have been developed in recent years [3,10,13,[18][19][20][21][22].
MP grammars have an intrinsic versatility in describing oscillatory phenomena [13,14]. The schema of MP grammars given in Table 1, called bicatalyticus [10], has an input rule r1 and an output rule r3 (incrementing the variable x and decrementing the variable y, respectively). Both rules are regulated by the same variable that they change (a sort of autocatalysis), while the transformation rule r2, from x to y, is regulated by both variables (bicatalysis). An MP grammar of this type provides a simple model for predator-prey dynamics, first modeled in differential terms by Lotka and Volterra [23]. The model assumes a simple schema ruling the growth of the two populations x, y (preys and predators): preys grow by eating nutrients taken from the environment (according to some reproduction factor) and die by predation, while predators grow by eating preys and die naturally (according to some death factor). When predators increase, then preys are more abundantly eaten and therefore they decrease. But the decrease of preys results in less food for the predators, which start to decrease (by providing a consequent increase of preys). This means that the increase of predators produces, after a while, their decrease (and symmetrically, a corresponding inverse oscillation happens for preys). This oscillatory mechanism of the sizes of the two populations is displayed in Figure 1 by the dynamics of the MP grammar of Table 1 with the regulators given in (2). The stoichiometric matrix of the grammar in Table 1 is constructed by the 3 column vectors of integers R1, R2, R3 expressing the differences between the coefficients of the left and right occurrences of the variables (for example, a rule 3x → x + y would give the column (−2, 1), because 1x − 3x = −2x and y − 0y = 1y).
The dynamics of an MP grammar is given by the recurrent equation:

X[i+1] = X[i] ⊕ (A × U(X[i]))

where ⊕ is the vector sum, A is the stoichiometric matrix, and U(X) is the column vector of fluxes (φ_r(X))_{r ∈ R}. Any MP grammar has an equivalent grammar (providing the same dynamics) where rules have an input or output form (with the empty multiset ∅ on the right or on the left of the rule). In these grammars the reciprocal influence of variables is realized by means of regulators. Moreover, any first-order recurrent system of equations (where the value of a variable at step i+1 may depend only on the variables at step i) identifies an MP grammar in input/output form, and vice versa [10].
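As a concrete illustration, the recurrence X[i+1] = X[i] ⊕ (A × U(X[i])) can be iterated directly with the bicatalyticus stoichiometry of Table 1. Since the regulators (2) are not reproduced here, the flux forms and constants below are our own illustrative assumptions, following only the qualitative description above (autocatalytic r1 and r3, bicatalytic r2); they are not the regulators used for Figure 1.

```python
import numpy as np

# Stoichiometric matrix of the bicatalyticus schema of Table 1
# (columns R1, R2, R3 for r1: input to x, r2: x -> y, r3: output of y;
# rows correspond to the variables x and y).
A = np.array([[1.0, -1.0,  0.0],
              [0.0,  1.0, -1.0]])

def U(X, k1=0.05, k2=0.0002, k3=0.05):
    """Hypothetical regulators: r1 and r3 are regulated by the variable they
    change, r2 by both variables. The constants are illustrative only."""
    x, y = X
    return np.array([k1 * x, k2 * x * y, k3 * y])

X = np.array([300.0, 250.0])      # illustrative initial state (preys, predators)
xs = [X[0]]
for _ in range(500):
    X = X + A @ U(X)              # X[i+1] = X[i] (+) A x U(X[i])
    xs.append(X[0])
# xs oscillates around the equilibrium value k3/k2 = 250 (prey dynamics).
```

With these (assumed) regulators the prey variable oscillates around 250, reproducing qualitatively the predator-prey behavior of Figure 1.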

The LGSS algorithm of MP regression
Let us consider a given DIP, and let us fix a set of m rules over n variables; then the Log-Gain Stoichiometric Stepwise algorithm (LGSS) [7][8][9][10] yields an MP grammar, with the given rules and the initial state X[0] of the time series X, generating X (within an approximation threshold). To this end, we choose a set of d primitive functions g1, …, gd (of at most n variables), called regressors, by means of which regulators are expressed, as linear combinations of some of them. The selection of the regressors corresponding to each regulator is realized by means of suitable statistical tests based on Fisher's F-distribution, within a strategy of stepwise regression [25,26]. The determination of the coefficients of the chosen regressors is computed by combining the method of least square evaluation with a procedure of stoichiometric expansion, realized by means of suitable algebraic operations on matrices. The name LGSS is due to the use of the log-gain principle [3] for assigning scores in the choice of regressors. In fact, this is a crucial point of the procedure. More technical details on this algorithm can be found in [7][8][9][10].
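The selection step can be sketched as a forward stepwise least-squares procedure: at each round, the candidate regressor with the largest partial F statistic enters the model, until no candidate passes an entry threshold. The following is only our simplified sketch of the stepwise idea (no log-gain scoring and no stoichiometric expansion, which LGSS also performs), with illustrative names and data.

```python
import numpy as np

def stepwise_least_squares(G, y, f_in=4.0):
    """Forward stepwise regression: G is an (N x d) matrix whose columns are
    regressor evaluations, y the observed variations. A regressor enters only
    if its partial F statistic exceeds the threshold f_in."""
    N, d = G.shape
    selected = []
    rss = float(y @ y)                  # residual sum of squares, empty model
    while len(selected) < d and rss > 1e-20:
        best_f, best_j, best_rss = -np.inf, None, None
        for j in set(range(d)) - set(selected):
            cols = selected + [j]
            coef, *_ = np.linalg.lstsq(G[:, cols], y, rcond=None)
            r = y - G[:, cols] @ coef
            new_rss = float(r @ r)
            # Partial F statistic for adding regressor j to the current model
            f = np.inf if new_rss == 0 else (rss - new_rss) / (new_rss / (N - len(cols)))
            if f > best_f:
                best_f, best_j, best_rss = f, j, new_rss
        if best_f < f_in:               # no candidate passes the entry test
            break
        selected.append(best_j)
        rss = best_rss
    coef, *_ = np.linalg.lstsq(G[:, selected], y, rcond=None)
    return selected, coef

# Illustrative data: y depends only on the regressors x and x*w.
x = np.linspace(0.1, 2.0, 200)
w = np.linspace(0.5, 1.5, 200)
G = np.column_stack([x, x**2, x**3, x * w])
y = 2.0 * x - 0.5 * x * w
selected, coef = stepwise_least_squares(G, y)
```

The procedure stops with a small subset of regressors whose linear combination fits y, which is the shape of result LGSS produces for each regulator.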

An MP Grammar for Sine and Cosine
Sine and cosine functions can be defined as the values of the x and y coordinates, respectively, of a point moving on a circle of unitary radius centered on the origin of the xy plane. Circular functions are as old as the Archimedean investigation about the ratio π between the circle length and its diameter. As illustrated in [27], the Archimedean sine of an angle α is the chord under α, that is, 2 sin(α/2), and Archimedes' investigation about the rectification of the circle relies on a kind of bisection formulae (from the chord of an angle to the chord of the half angle) that were the basis for the approximate evaluation of π. Many classical results of trigonometry, such as the addition formulae, are due to Arabian and Indian mathematicians (Bhaskara II, 12th century). The analytical representation of sine and cosine was one of Euler's jewels [28]. Now we are interested in knowing whether a simple MP grammar exists that generates sine and cosine tables with a very good approximation. To this end, we start from the MP grammar schema of Table 1, which we know to be a good schema for providing an MP oscillator, and then we use LGSS with the regressor dictionary D = {x^k y^j | 0 ≤ k ≤ 4, 0 ≤ j ≤ 4} for inferring the right regulator formulae providing a cosine/sine oscillator (sine plays the role of predator and cosine the role of prey). When the time series of sine and cosine, with sampling at 10^-3 rad, are given as input to the LGSS algorithm, in order to find the best regulators of the MP grammar of Table 1 (with regressor dictionary D), we get as result the MP grammar (4) (x indicates cosine, y indicates sine), where:

k1 = 0.000999499833375, k2 = 0.000999999833333, k3 = 0.001000499833291

(the coefficient estimates are truncated at the 15th decimal digit, according to the accuracy of the computer architecture used during the computation). MP grammar (4) provides a very precise sine/cosine oscillator, with a maximum absolute error of the order of 10^-14 (Figure 3).
Let us search for a recurrent definition of sine and cosine by using the trigonometric addition formulae:

cos(a + h) = cos(a) cos(h) − sin(a) sin(h)
sin(a + h) = sin(a) cos(h) + cos(a) sin(h)

that is:

cos((i+1)h) = cos(ih) cos(h) − sin(ih) sin(h)
sin((i+1)h) = sin(ih) cos(h) + cos(ih) sin(h).

Let us approximate the value of sin(h) with the value h of the angle. Therefore, the value of cos(h) is:

cos(h) = √(1 − sin²(h)) ≈ √(1 − h²).

If we substitute these values for sin(h) and cos(h) in equations (6), we obtain the recurrent equations (7), which, finally, give the MP grammar (8) (x, y are the variables for sine and cosine, respectively).

Proposition 1: The MP grammar (8) coincides with the MP grammar (4) given by LGSS, up to the seventh decimal digit (the corresponding coefficients of the two grammars are the same within an approximation of 10^-7).
Proof: Let us modify the stoichiometry of the grammar (8) by reversing the second rule, obtaining the grammar (9). Now, let us add a new rule (we write it in second position) with a null flux, which does not alter the dynamics, and then let us move the term h • y from the first rule to the second rule and the term h • x from the third rule to the second rule. This change introduces an error in the step variations of the variables, with respect to those of the original grammar (9), because the flux h • y, which was moved from r1 to r2, now acts also on the variable y (which is increased by it), and the flux h • x, which was moved from r3 to r2, now acts also on the variable x (which is decreased by it). Therefore, in order to compensate these extra fluxes, we add the term h • x in the flux of r1 and the term h • y in the flux of r3, obtaining in the end the MP grammar (12), equivalent to (9). We show that this MP grammar is dynamically equivalent to the MP grammar (4). In fact, by construction, it has the same stoichiometry as the grammar (4). Moreover, if we set the step h equal to 10^-3, by substituting this value of h in equation (12) we get the grammar (13), whose coefficients coincide with the coefficients of the grammar (4) rounded at the seventh decimal digit: that is, we get for the three rules the constants k1, k2, k3 of (4) (rounded at the seventh decimal digit). MP grammar (13) permits a very accurate computation of the cosine/sine dynamics. Its accuracy surely increases if we use a smaller step, say 0.000001. However, the grammar (4) provides, with the same step (0.001), a more accurate calculation, which is comparable to the precision accuracy of the computer (precision of the order of 10^-14). This means that LGSS, during its calculation, is able to compute the regulator formulae with a precision that is bounded only by the precision accuracy of the computer which runs the calculations.
We will show, in the next proposition, that what LGSS gives at step 0.001 corresponds exactly to what can be derived by means of a recurrent definition of cosine and sine based on Taylor series:

f(a + h) = Σ_{k≥0} f^(k)(a) h^k / k!

In fact, if we apply the formula above to cosine and sine we obtain:

cos(a + h) = cos(a) Σ_{k≥0} (−1)^k h^{2k}/(2k)! − sin(a) Σ_{k≥0} (−1)^k h^{2k+1}/(2k+1)!
sin(a + h) = sin(a) Σ_{k≥0} (−1)^k h^{2k}/(2k)! + cos(a) Σ_{k≥0} (−1)^k h^{2k+1}/(2k+1)!

which, finally, leads to the following recurrent formulae:

cos((i+1)h) = cos(ih) + (cos(h) − 1) cos(ih) − sin(h) sin(ih)
sin((i+1)h) = sin(ih) + sin(h) cos(ih) + (cos(h) − 1) sin(ih)

that, for the sake of simplicity, can be rewritten as follows:

x[i+1] = x[i] + A • x[i] − B • y[i]
y[i+1] = y[i] + B • x[i] + A • y[i]     (16)

where A and B are truncations of cos(h) − 1 and sin(h), respectively. In Table 2 the values of the coefficients A and B of equations (16) are reported for different orders of the Taylor series, considering a step h = 10^-3, which is the step used for computing the MP grammar (4). This means that, by substituting in the recurrent equations (16) the values of A and B taken from Table 2, we can obtain a cosine/sine oscillator which approximates the curves of cosine and sine with sampling 10^-3, with an approximation increasing with the order of the terms considered in the Taylor series. The best approximations, with respect to the available computer precision, are computed by using the values in the last row of Table 2.
The recurrent equations (16) can be rewritten as the MP grammar (17). The following proposition shows that this recurrent definition of sine and cosine coincides with the MP grammar (4), found by means of the LGSS algorithm.

Proposition 2: The MP grammar (17), with the coefficients A, B of the fourth line of Table 2, is dynamically equivalent to the MP grammar (4) (the two grammars define the same dynamics).
Proof: Let us write the MP grammar (4) by changing its stoichiometry, but keeping the same variations of the variables at any step. First, we change the direction of r3, obtaining the grammar (20); this last grammar leads to the conclusion that the MP grammar (17) is equivalent to the grammar (4) when:

A = k1 − k2 = k2 − k3, B = k2.     (21)

If we use the estimates of k1, k2 and k3 computed by LGSS for the MP grammar (4), we obtain:

A = k1 − k2 = k2 − k3 = −4.99999958 • 10^-7
B = k2 = 9.99999833333 • 10^-4

which correspond to the coefficients A and B of the 4th order provided in Table 2.
The proposition above shows that LGSS provides formulae and coefficient estimates that are mathematically well grounded and that provide the maximum power of approximation, according to the precision accuracy reachable by the machine performing the computation.
LGSS computed coefficients corresponding to Taylor terms of the 4th order, because this order provides the precision accuracy of 15 decimal digits. However, now that we are aware of this, we can better explain the approximation loss of grammar (13) with respect to grammar (4). In fact, if we reconsider the evaluation (21) with the values of (13), we obtain:

A = k1 − k2 = k2 − k3 = −5 • 10^-7
B = k2 = 0.001

which correspond to the coefficients A and B of the 2nd order provided in Table 2. This means that the approximation power of grammar (4) is increased, with respect to grammar (13), by two orders of the Taylor series.
If we apply LGSS directly to the stoichiometry of grammar (17), we obtain the coefficients of the fourth line of Table 2. The MP grammar (22) has been computed by LGSS, considering the stoichiometry of grammar (17) and a sampling step of 0.1. Even if the sampling step of this grammar is larger than the one used for calculating grammar (4), this system provides a cosine/sine oscillator that is more precise. In fact, the maximum approximation error in the first two oscillations is of the order of 10^-15 (Figure 4). It is interesting to see that, thanks to the fact that this time the considered sampling step is larger, the coefficient estimates computed by LGSS consider more terms of the Taylor series. In particular, the coefficients of grammar (22) correspond to a Taylor series of a higher order.
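The core estimation step can be illustrated in miniature: given sampled cosine/sine series, the coefficients A and B of the recurrence (16) can be recovered by ordinary least squares on the step variations. This is our sketch, not the LGSS implementation (which also performs regressor selection and stoichiometric expansion); it recovers A = cos(h) − 1 and B = sin(h) essentially to machine precision.

```python
import math
import numpy as np

h, N = 0.1, 1000
t = np.arange(N + 1) * h
x, y = np.cos(t), np.sin(t)          # sampled cosine and sine series
dx, dy = np.diff(x), np.diff(y)      # step variations of the two variables

# Recurrence (16): dx = A*x - B*y and dy = B*x + A*y.
# Stack both relations into a single least-squares system in (A, B).
G = np.block([[x[:-1, None], -y[:-1, None]],
              [y[:-1, None],  x[:-1, None]]])
rhs = np.concatenate([dx, dy])
(A_est, B_est), *_ = np.linalg.lstsq(G, rhs, rcond=None)

print(A_est - (math.cos(h) - 1))     # essentially zero
print(B_est - math.sin(h))           # essentially zero
```

Since the sampled data satisfy the recurrence exactly, the regression recovers the coefficients up to floating-point roundoff, mirroring the machine-precision estimates reported for grammar (22).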

Applications
The LGSS regression described in the section on the MP grammar for sine and cosine demonstrated to be very reliable even when the discretization step of our analysis is realized at a very macroscopic level, for example by considering sine and cosine at angle increments of 2 rad. In such a case, the values computed by the MP grammar deduced by LGSS were determined with the best possible precision and with a surprising computational speed (Figure 5). In fact, the greater the angle increment, the higher the order of the terms of the Taylor series used by the grammar given by LGSS. Moreover, the greater the angle increment, the smaller the number of steps necessary for generating the sine and cosine values (with a decrease of the error propagation along the iteration). The charts given in Figure 6 refer to a set of 400 MP models providing sine/cosine dynamics with sampling steps ranging from 0.01 rad to 4 rad. For each model, the mean square error (MSE) and the order of the Taylor series reached by the coefficient estimates computed by LGSS are given. Thanks to the high performance of LGSS, the entire set of models has been computed and simulated in less than a minute by means of a standard laptop with a dual core CPU and 4 GB of memory. An explanation of this very good performance of MP grammars is their intrinsic capability of combining a sort of native built-in integro-differential strategy. In fact, in a differential representation of a dynamics, when solutions are found, analytically or by numerical integration, the behavior of the variables at a given discrete time sampling can be obtained by integrating the differential solution in the (macroscopic) time interval of the time step. The MP solutions given by LGSS provide this result directly, by means of a very precise evaluation of the regulators responsible for the observed dynamical variations between the two extremes of the step interval.
The results presented till now apply not only to the standard circular functions, but also to any harmonics cos(n • i) and sin(n • i), with n ∈ R. In fact, this follows from equations (14), (15) and (16) by replacing the step h with n • h. This means that the MP grammar (17) has the capability of generating sine and cosine curves regardless of their frequency, and that we can use LGSS to compute the right values for A and B. Now, since we know from Fourier analysis that any periodical signal can be represented by a combination of simple sine waves with different frequencies, this fact leads to important applications in using LGSS for generating recurrent equations that exhibit periodical signals. Moreover, since the MP grammar (17) has linear regulators, it is easy to translate it into a combinatorial circuit (Figure 7). Such a circuit can then be incorporated in a sequential one, like that represented in Figure 8, which provides an implementation, in terms of hardware, of the MP system inferred by LGSS. This fact indicates that it is possible to define a methodology, based on LGSS regression, that permits to implement sequential circuits of periodical oscillators. A paper investigating this direction is in preparation.
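The harmonic generalization can be verified directly: taking the coefficients at step n • h, the same recurrence produces cos(n • t) and sin(n • t) at the sampled points. A short sketch under this assumption:

```python
import math

def harmonic_samples(n, h, steps):
    """Generate (cos(n*i*h), sin(n*i*h)) for i = 0..steps with the
    recurrence of grammar (17), taking A = cos(n*h) - 1, B = sin(n*h)."""
    A, B = math.cos(n * h) - 1.0, math.sin(n * h)
    x, y = 1.0, 0.0
    out = [(x, y)]
    for _ in range(steps):
        x, y = x + A * x - B * y, y + B * x + A * y
        out.append((x, y))
    return out

# Third harmonic sampled at 0.01 rad:
samples = harmonic_samples(3, 0.01, 1000)
max_err = max(abs(x - math.cos(3 * i * 0.01)) + abs(y - math.sin(3 * i * 0.01))
              for i, (x, y) in enumerate(samples))
```

With exact A and B the iteration is an exact rotation, so the deviation from the true harmonic stays at the level of accumulated floating-point roundoff, regardless of the frequency n.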

Conclusions
The results presented in this paper focus on a new perspective of considering recurrent equations. Usually, their investigation is aimed at finding analytical methods to solve them or at determining properties of the discrete dynamics that they express [29]. On the contrary, here we do not cope with their solutions, because their intrinsic algorithmic (iterative) nature provides a direct computation of their dynamics: it is enough to put them in the canonical form X[n+1] = F(X[n]) and iteratively apply the right side of the equation, starting from an initial value (every equational recurrence can be expressed in a first-order recurrent form). Of course, in this way the dynamics at step n can be computed only after computing it in all the steps preceding n, but, in most cases, this is not a real limitation if the computation is performed automatically. Using MP grammars, recurrent equations are constructed by assigning regulators to MP rules, that is, by combining the action of a number of regulators that encode abstract forms of mutual interactions among variables. In this perspective, regulators replace the forces that in classical mechanics are the causes of the observed motions. In fact, regulators may be related to a big number of unknown forces, very difficult to individuate and to discriminate; therefore, they express abstract entities of a rational and compact reconstruction of the internal logic underlying an observed dynamics. When the complexity and the indetermination of systems do not allow us other ways of analysis, this could be an important chance for the comprehension of phenomena. In this paper we deduce a sort of proof of concept of the ability of MP regression to discover the deep logic underlying a phenomenon. Namely, LGSS implicitly applied trigonometric formulae and Taylor series, with the best possible accuracy and a powerful computational efficiency. We would like to stress that LGSS automatically calculates not only the values of the coefficients, but also the form of the linear combinations providing the regulators.

Figure 5: An example of MP dynamics providing sine/cosine dynamics at a very big sampling step (sampling of 2 rad). For increasing the readability of the dynamics, a sine/cosine oscillatory pattern has been added to the background by dashed lines. Even if the considered sampling step is very big, the dynamics is very precise (the maximum approximation error in the first two oscillations is of the order of 10^-16; the coefficients of the corresponding grammar, computed by LGSS, correspond to a Taylor series of 23rd degree).
This provides a big advantage with respect to other formalisms, such as GMA-systems [30], which reach a similar power of approximation, for circular functions, only because the form of the differential equations on which they are based is constrained to fit the form of Taylor series. Conversely, the flexibility of LGSS allows us to find the best regulators approximating the internal logic of the observed dynamics, without imposing a preliminary form. In the papers [10-13,21,22], where MP theory was already successfully applied to biomedical problems, very good approximations are obtained even when the regulators have a form that is different from that corresponding to Taylor series. Finally, besides the wide range of applications of the MP approach to the analysis of complex phenomena, in the applications section some new applications are introduced that exploit the regression capabilities of LGSS for the analysis of periodical signals. In particular, at the end of that section a new application is suggested, which focuses on the definition of a methodology, based on LGSS regression, for implementing sequential circuits of periodical oscillators.

Figure 6: Mean square error (MSE) and order of the Taylor series reached by the LGSS estimates for a set of 400 MP models providing sine/cosine dynamics with sampling step ranging from 0.01 rad to 4 rad.

Figure 8: Sequential circuit incorporating the combinatorial circuit of Figure 7 for obtaining a fully functional sine/cosine oscillator. Dashed lines represent single-bit lines; components depicted in blue identify n-bit memories.