## Abstract

This paper discusses a theoretical method for the “reverse engineering” of networks based solely on steady-state (and quasi-steady-state) data.

## Introduction

A central concern of systems biology is that of elucidating the mechanisms and overall architecture underlying the observed behaviour of biomolecular networks. In this context, the “reverse engineering problem” concerns itself with the unravelling of the web of interactions among the components of protein and gene regulatory networks, so as to map out the direct or *local* interactions among components. These direct interactions capture the topology of the functional network.

An intrinsic difficulty in capturing these direct interactions, at least in intact cells, is that any perturbation to a particular gene or signalling component, using tools such as traditional genetic experiments, RNAi (RNA interference), hormones, growth factors, or pharmacological interventions, may rapidly propagate throughout the network, thus causing *global* changes which cannot be easily distinguished from direct effects. Thus, a major goal in reverse engineering is to use these observed global responses, such as steady-state changes in concentrations of active proteins, mRNA levels, or transcription rates, in order to infer the local interactions between individual nodes.

One approach to solving this global-to-local problem is the “unravelling,” or “Modular Response Analysis” (MRA) method proposed in [1] (see also [2]) and further elaborated upon in [3–6] (see [7,8] for reviews). The MRA experimental design compares the steady states which result after performing independent perturbations with each “modular component” of a network. These perturbations might be genetic or biochemical. For example, in eukaryotes they might be achieved through the down-regulation of mRNA, and therefore protein, levels by means of RNAi, as done in [9]. That work employed MRA in order to quantify positive and negative feedback effects in the Raf/MEK (MAPK/ERK kinase)/ERK (extracellular-signal-regulated kinase) MAPK (mitogen-activated protein kinase) network in rat adrenal pheochromocytoma (PC-12) cells; using the algorithms from [3] and [4], the authors of [9] uncovered connectivity differences depending on whether the cells are stimulated with epidermal growth factor (EGF) or instead with neuronal growth factor (NGF).

This paper presents a mathematical synopsis of the MRA method.

### Problem formulation

We assume that there are *n* quantities *x*_{i}(*t*) that can be in principle measured, such as the levels of activity of selected proteins, or the transcription rates of certain genes. Later, we discuss how to weaken this assumption, so as to be able to include systems for which some variables are hidden from measurement. These quantities are thought of as state variables in a dynamical system, and are collected into a time-dependent vector *x*(*t*)=(*x*_{1}(*t*),…, *x*_{n}(*t*)). The dynamical system is described by a system of differential equations:

*ẋ*_{i} = *f*_{i}(*x*_{1},…, *x*_{n}, *p*_{1},…, *p*_{m}), *i*=1,…, *n*

or, in more convenient vector form,

*ẋ* = *f*(*x*, *p*)

(dot indicates time derivative, and arguments *t* are omitted when clear). We will assume that *m*≥*n*. The *p*_{i}’s are parameters, collected into a vector *p*=(*p*_{1},…, *p*_{m}). These parameters can be manipulated, but, once changed, they remain constant for the duration of the experiment. An example would be that in which the variables *x*_{i} correspond to the levels of protein products corresponding to *n* genes in a network, and the parameters reflect translation rates, controlled by RNAi. Another example would be that in which the parameters represent total levels of proteins, whose half-lives are long compared with the time scale of the processes (such as phosphorylation modifications of these proteins in a signalling pathway) described by the variables *x*_{i}. Yet another example would be one in which the parameters represent concentrations of enzymes that control the reactions, and whose turnover is slow.

The ultimate goal is to obtain, for each pair of variables *x*_{i} and *x*_{j}, the relative signs and magnitudes of the partial derivatives ∂*f*_{i}/∂*x*_{j}, which quantify the *direct* effects of each variable *x*_{j} upon each variable *x*_{i}. These partial derivatives are the entries of the Jacobian matrix of *f* with respect to *x* (and *f* is assumed to be continuously differentiable). See Figure 1.

The entries ∂*f*_{i}/∂*x*_{j} of the Jacobian *F* are functions of *x* and *p*. The steady-state version of MRA attempts to estimate this Jacobian when *x*=*x̄* is an “unperturbed” steady state attained when the vector of parameters has an “unperturbed” value *p*=*p̄*. The steady-state condition means that *f*(*x̄*, *p̄*)=0. Ideally, one would want to find the matrix *F*, since this matrix completely describes the influence of each variable *x*_{j} upon the rate of change of each other variable *x*_{i}. Unfortunately, such an objective is impossible to achieve from only steady-state data, because, for any parameter vector *p* and associated steady state *x*, *f*(*x*, *p*)=0 implies that Λ*f*(*x*, *p*)=0, for any diagonal matrix Λ=diag(λ_{1},…, λ_{n}). In other words, the best that one could hope for is for steady-state data to uniquely determine each of the rows of *F* *only up to a scalar multiple*. For example, if we impose the realistic condition that *F*_{ii}≠0 for every *i* (these diagonal Jacobian terms typically represent degradation and/or dilution effects, and are in fact negative), one could hope to have enough data to estimate the ratios *F*_{ij}/*F*_{ii} for each *i*≠*j*. Note that the *i*th row *F*_{i} of *F* is the same as the gradient ∇*f*_{i} of the *i*th coordinate *f*_{i} of *f*, evaluated at steady states.

The critical assumption for MRA, and indeed the main point of [1,3,10], is that, while one may not know the detailed form of the vector field *f*, often one does know which parameters *p*_{j} directly affect which variables *x*_{i}. For example, *x*_{i} may be the level of activity of a particular protein, and *p*_{i} might be the total amount (active plus inactive) of that particular protein; in that case, we know that *p*_{i} only directly affects *x*_{i}, and only indirectly affects the remaining variables.

The steady-state MRA experimental design consists of the following steps:

1. measure a steady state *x̄* corresponding to the unperturbed vector of parameters *p̄*;
2. separately perform a perturbation to each entry of *p̄*, and measure a new steady state.

The “perturbations” are assumed to be small, in the sense that the theoretical analysis will be based on the computation of derivatives. Under mild technical conditions, this means that a perturbed steady state can be found near *x̄*. Note that there are *m*+1 experiments, and *n* numbers (coordinates of the corresponding steady state) are measured in each. In practice, of course, this protocol is repeated several times, so as to average out noise and obtain error estimates, as we discuss later. For our theoretical analysis, however, we assume ideal, noise-free measurements, and so we may assume that each perturbation is done only once.

Using these data (and assuming that a certain independence condition, which we review later, is satisfied), it is possible to calculate, at least in the ideal noise-free case, the Jacobian of *f*, evaluated at (*x̄*, *p̄*), except for the unavoidable scalar multiplicative factor uncertainty on each row.

The obtained results typically look as shown in Figure 2, which is reproduced with permission from [9]. The authors of that paper used MRA in order to infer positive and negative feedback effects in the Raf/MEK/ERK MAPK network in PC-12 cells, employing perturbations in which total mRNA, and thus protein, levels are down-regulated by means of RNAi. The numbers in the arrows in Figure 2 have been normalized to −1’s in the diagonal of the Jacobian. An example of the raw data for the algorithm is provided by Figure 3, which shows the level of active (doubly phosphorylated) ERK1/2 when PC-12 cells have been stimulated by EGF and NGF. (The Figure shows only responses in the unperturbed case. Similar plots, not shown, can be derived from the data for the perturbation experiments given in [9].) The response to NGF stimulation allows the application of the steady-state MRA method, and leads to the results shown in the right-most panel in Figure 2.

However, the plots in Figure 3 indicate that, in certain problems, steady-state data cannot be expected to provide enough information, even for only finding the Jacobian rows up to multiplicative factors. Such a situation occurs when the system *adapts* to perturbations. In Figure 3, notice that the steady-state response to EGF stimulation is (near) zero (this holds for perturbed parameters as well, not shown). Thus, measuring steady-state level of activity of ERK1/2 after parameter perturbations, in the EGF-stimulated cells, will not provide non-trivial information. One needs more than steady-state data. A variant of MRA, which allows for the use of general non-steady-state, time-series data was developed in [3]. However, that method requires one to compute second-order time derivatives, and hence is especially hard to apply when time measurements are spaced far apart and/or are noisy. In addition, as shown for 5 min and 15 min NGF stimulation by the middle and right-most panels in Figure 2, the relative strengths of functional interactions may change over time, so that a time-varying Jacobian may not be very informative from a biological standpoint. An appealing intermediate possibility is to use quasi-steady-state data, meaning that one employs data collected at times when a variable has been observed to attain a local maximum (peak of activity) or a local minimum. Indeed, this is the approach taken in [9], which, for EGF stimulation, measured network responses at the time of peak ERK activity (approximately 5 min), and not at steady state. The left-most and middle panels in Figure 2 represent, respectively, the networks reconstructed in [9] when using quasi-steady-state data (at approximately 5 min) for EGF and NGF stimulation.

We next describe both the steady-state and quasi-steady-state reconstruction problems.

## Mathematical details

We assume given a parameter vector *p̄* and state *x̄* such that *f*(*x̄*, *p̄*)=0 and so that the following generic condition holds for the Jacobian of *f*:

det (∂*f*/∂*x*)(*x̄*, *p̄*) ≠ 0.

Therefore, we may apply the implicit function theorem and conclude the existence of a mapping φ, defined on a neighbourhood of *p̄*, with the property that, for each row *i*,

*f*_{i}(φ(*p*), *p*) = 0 (1)

and φ(*p̄*)=*x̄* (and, in fact, *x*=φ(*p*) is the unique state *x* near *x̄* such that *f*(*x*, *p*)=0).

We next discuss how one reconstructs the gradient ∇*f*_{i}(*x̄*, *p̄*), up to a constant multiple. (The index *i* is fixed from now on, and the procedure must be repeated for each row *f*_{i}.) We do this under the assumption that it is possible to apply *n*−1 independent parameter perturbations. Mathematically, the assumption is that there are *n*−1 indices *j*_{1}, *j*_{2},…, *j*_{n−1} with the following two properties:

(a) *f*_{i} does not depend directly on any such *p*_{j}: ∂*f*_{i}/∂*p*_{j}≡0, for *j*∈{*j*_{1}, *j*_{2},…, *j*_{n−1}}, and

(b) the vectors *v*_{j}=(∂φ/∂*p*_{j})(*p̄*), for these *j*’s, are linearly independent.

Assumption (a) is structural, key to the method, and non-trivial, whereas assumption (b) is a weaker genericity assumption. We then have, taking total derivatives in eqn (1) with respect to each such *p*_{j}, and using that ∂*f*_{i}/∂*p*_{j}≡0 by assumption (a):

∇*f*_{i}(*x̄*, *p̄*) · *v*_{j} = 0, for *j*∈{*j*_{1},…, *j*_{n−1}}.

Thus, the vector ∇*f*_{i}(*x̄*, *p̄*) which we wish to estimate, and which we will denote simply as *F*_{i}, is known to be orthogonal to the (*n*−1)-dimensional subspace spanned by {*v*_{j_{1}},…, *v*_{j_{n−1}}}. Therefore, it is uniquely determined, up to multiplication by a nonzero scalar. The row vector *F*_{i} satisfies

*F*_{i}Σ = 0, (2)

where Σ is defined as the *n*×(*n*−1) matrix whose columns are the *v*_{j}’s. Generically, we assume that there is no degeneracy, and the rank of Σ is *n*−1. Thus, *F*_{i} can be computed by using Gaussian elimination, as any vector which is orthogonal to the span of the columns of Σ. Another way to phrase this is to say that *F*_{i} is in the (one-dimensional) left nullspace of the matrix Σ, or that (if nonzero) the transpose of this gradient can be found as any vector in the nullspace of the transpose of Σ.
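To make this step concrete, here is a minimal sketch (Python/NumPy; not from the paper) in which the sensitivity matrix Σ is manufactured for a hypothetical 3-node network from a known gradient, and *F*_{i} is then recovered as a left null vector of Σ via the SVD. The vector `F_true` and the random data are invented purely for the check.

```python
import numpy as np

# Hypothetical "true" gradient F_i (unknown to the algorithm; used
# only to manufacture consistent, noise-free sensitivity data).
F_true = np.array([-1.0, 2.0, 0.5])

# Build Sigma: an n x (n-1) matrix whose columns v_j are orthogonal to
# F_true, mimicking ideal perturbation responses.
rng = np.random.default_rng(0)
raw = rng.standard_normal((3, 2))
# Project each column onto the orthogonal complement of F_true.
P = np.eye(3) - np.outer(F_true, F_true) / (F_true @ F_true)
Sigma = P @ raw

def left_null_vector(S):
    """Return a vector spanning the left nullspace of S: the left
    singular vector for the smallest singular value."""
    U, s, Vt = np.linalg.svd(S)
    return U[:, -1]

F_est = left_null_vector(Sigma)
# Fix the unavoidable scale ambiguity by matching the first entry.
F_est = F_est * (F_true[0] / F_est[0])
```

Gaussian elimination would serve equally well in this noise-free setting; the SVD formulation is convenient because it carries over directly to noisy data.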

_{i}### Handling noise

We next briefly discuss how to modify the algorithm to account for repeated but noisy measurements. In principle, such noise may be due to combinations of internal sources, such as stochasticity in gene expression, external sources affecting the process being studied, and measurement errors. Our discussion is tailored to measurement noise, although it may apply, in an approximate way, to internal noise; however, the effect of internal noise on MRA has not been studied in any detail.

In practice, one would use not merely the results of *n*−1 perturbation experiments, but many repetitions, collecting the data into a matrix Σ^{#} whose columns are derived from the different experiments. We will think of each column of Σ^{#} as having the form *v*+*e*, where *v* is a vector (∂φ/∂*p*_{j})(*p̄*), for some parameter *p*_{j} on which *f*_{i} does not directly depend, and where *e* is an “error” vector. In matrix notation, Σ^{#}=Σ+*E*, where *E* denotes an error matrix. Note that eqn (2) implies that Σ has rank *n*−1. On the other hand, because of noise in measurements, Σ^{#} will have full rank *n*, which means that there is no nonzero solution *F*_{i} to eqn (2) with the data matrix Σ^{#} used in place of the (unknown) Σ. So, we proceed as follows. Assuming that the noise is not too large relative to the signal, the experimental matrix Σ^{#} should be close to the ideal (noise-free) matrix Σ. The best least-squares estimate of Σ, in the sense of minimization of the norm of *E*, is obtained (Eckart-Young matrix approximation theorem [11]) by a singular value decomposition (SVD) of Σ^{#}, as follows. An SVD of Σ^{#} is a factorization Σ^{#}=*UMV*^{T}, where *U* and *V* are (square) orthogonal matrices and *M* is a matrix (of the same size as Σ^{#}) with non-negative numbers on the diagonal and zeros off the diagonal. The diagonal elements σ_{1}≥σ_{2}≥…≥σ_{n} of *M* are the singular values of Σ^{#}; in the ideal case σ_{1},…,σ_{n−1} are all nonzero and σ_{n}=0, but in general σ_{n} will be nonzero, and it represents the magnitude of the noise. The columns of *U* and of *V* are, respectively, the left and right singular vectors of Σ^{#}. Then (see details in e.g. [12], Appendix A, or in [13,14]), the matrix Σ of rank *n*−1 for which ‖*E*‖ is minimized is Σ=*UM*_{n−1}*V*^{T}, where *M*_{n−1} is the matrix obtained from *M* by setting the smallest singular value σ_{n} to zero.
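The truncation step can be sketched as follows (Python/NumPy; illustrative only, with an invented rank-deficient matrix plus synthetic noise standing in for real perturbation data):

```python
import numpy as np

rng = np.random.default_rng(1)

# Ideal data matrix Sigma of rank n-1 (n = 4 nodes, with repeated
# experiments collected as 6 columns), plus a small noise matrix E.
n = 4
B = rng.standard_normal((n, n - 1))
C = rng.standard_normal((n - 1, 6))
Sigma_ideal = B @ C                  # rank n-1 by construction
E = 0.01 * rng.standard_normal(Sigma_ideal.shape)
Sigma_noisy = Sigma_ideal + E        # generically of full rank n

# Best rank-(n-1) approximation (Eckart-Young): zero out the smallest
# singular value in the SVD of the noisy matrix.
U, s, Vt = np.linalg.svd(Sigma_noisy, full_matrices=False)
s_trunc = s.copy()
s_trunc[-1] = 0.0
Sigma_hat = U @ np.diag(s_trunc) @ Vt

rank_hat = np.linalg.matrix_rank(Sigma_hat)
# The spectral-norm error of the approximation equals the discarded
# singular value sigma_n.
err = np.linalg.norm(Sigma_noisy - Sigma_hat, 2)
```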

We now replace eqn (2) by *F*_{i}Σ=0 with the estimate Σ=*UM*_{n−1}*V*^{T}, which, because *V* is nonsingular, is the same as *F*_{i}*UM*_{n−1}=0. Under the generic assumption that σ_{1},…,σ_{n−1} are nonzero, this means that *F*_{i}*U*=α*e*_{n}^{T}, where α is a scalar and *e*_{n}^{T}=(0, 0,…, 0, 1). We then conclude that, up to a constant multiple, *F*_{i}^{T}=*Ue*_{n} is the left singular vector of Σ^{#} corresponding to the smallest singular value σ_{n}.

This procedure can also be interpreted as follows (see [4] for details). If we normalize *F*_{i} to have its *i*th entry equal to “−1” (in other words, we normalize the diagonal of the Jacobian to −1’s), then the equation *F*_{i}Σ^{#}=0 can also be written as “*Az*=*b*”, where *z* represents the unknown *n*−1 remaining entries of *F*_{i}, *b*^{T} is the *i*th row of Σ^{#}, and *A*^{T} is the matrix in which this row has been removed from Σ^{#}. The estimation method outlined above is the “total least squares” or “errors in variables” procedure [13,14]. Statistically, the method is justified if the elements of the noise matrix *E* are independent and identically distributed normal (Gaussian) random variables. If these entries are normal and independent but have different variances, then one must modify the above procedure to add an appropriate weighting, but in the general non-Gaussian case nonlinear SVD techniques are required; see for instance [15–17].
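A compact numerical sketch of this total least squares estimate (Python/NumPy; the gradient `F_true`, the index `i`, and the noise level are invented for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
i = 0   # row of the Jacobian being estimated

# "True" gradient, normalized so its i-th entry is -1 (unknown to the
# algorithm; used only to generate data and check the answer).
F_true = np.array([-1.0, 0.8, -0.4])

# Ideal perturbation-response columns orthogonal to F_true, with
# repeated experiments (5 columns) plus additive measurement noise.
P = np.eye(n) - np.outer(F_true, F_true) / (F_true @ F_true)
Sigma_noisy = (P @ rng.standard_normal((n, 5))
               + 0.005 * rng.standard_normal((n, 5)))

# Total-least-squares estimate: the left singular vector of the noisy
# data matrix associated with its smallest singular value.
U, s, Vt = np.linalg.svd(Sigma_noisy)
F_est = U[:, -1]
F_est = F_est / (-F_est[i])    # normalize the i-th entry to -1

rel_err = np.linalg.norm(F_est - F_true) / np.linalg.norm(F_true)
```

With a small noise level, the recovered row agrees with the true one to within a few per cent; as the noise grows, the gap between the two smallest singular values shrinks and the estimate degrades.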

### Modular approach

Modularity is an important feature of biochemical networks [18,19]. As its name implies, one of the main advantages of the steady-state MRA method is that only “communicating intermediates” in-between “modules” need to be measured. When applying MRA in a modular fashion, only perturbation data on these communicating intermediates are collected. The connectivity strength between a pair of intermediates (such as activated levels of MAPK cascade proteins) can be derived even if this apparent connectivity is not due to a direct biochemical interaction. An obvious advantage of the modular approach is that it can be applied regardless of the degree of complexity of the network analysed, as “hidden” variables (such as non-activated forms of a MAPK protein, for example) only affect connectivity in an indirect fashion. Thus, functional interactions among communicating variables can be deduced without requiring detailed knowledge of all the components involved.

Let us suppose that the entire network consists of an interconnection of *n* subsystems or “modules”, each of which is described by a set of differential equations such as:

*ẋ*_{j} = *g*_{j}(*x*_{1},…, *x*_{n}, **y**_{j}, *p*_{1},…, *p*_{m})
**ẏ**_{j} = *G*_{j}(*x*_{1},…, *x*_{n}, **y**_{j}, *p*_{1},…, *p*_{m}),

where the variables *x*_{j} represent “communicating” or “connecting” intermediates of module *j* that transmit information to other modules, whereas the *vector variables* **y**_{j} represent chemical species that interact within module *j*. Each vector **y**_{j} has dimension ℓ_{j}. The integers ℓ_{j}, *j*=1,…, *n*, are in general different for each of the *n* modules, and each represents one less than the number of chemical species in the *j*th module. Observe that, for each *j*, the rate of change of the communicating variable depends only on the communicating variables and on the variables **y**_{j} in its own block, but does not directly depend on the internal variables of other blocks. In that sense, we think of the variables **y**_{j} as “hidden” (except from the communicating variable in the same block).

We will assume, for each fixed module, that the Jacobian of *G*_{j} with respect to the vector variable **y**_{j}, evaluated at the steady state corresponding to *p̄* (assumed to exist, as before), is nonsingular. The Implicit Mapping Theorem then implies that one may, in a neighbourhood of this steady state, solve

*G*_{j}(*x*_{1},…, *x*_{n}, **y**_{j}, *p*_{1},…, *p*_{m}) = 0

for the vector variable **y**_{j}, as a function of *x*_{1},…, *x*_{n}, *p*_{1},…, *p*_{m}, the solution being given locally by a function

**y**_{j} = *M*_{j}(*x*_{1},…, *x*_{n}, *p*_{1},…, *p*_{m}).

Then, those steady states obtained by small perturbations of *p̄* are precisely the same as the steady states of the new system

*ẋ*_{j} = *h*_{j}(*x*_{1},…, *x*_{n}, *p*_{1},…, *p*_{m}), *j*=1,…, *n*,

where we define *h*_{j}(*x*_{1},…, *x*_{n}, *p*_{1},…, *p*_{m}) as:

*h*_{j} = *g*_{j}(*x*_{1},…, *x*_{n}, *M*_{j}(*x*_{1},…, *x*_{n}, *p*_{1},…, *p*_{m}), *p*_{1},…, *p*_{m}).

From here, the analysis then proceeds as before, using the *h*_{j}’s instead of the *f*_{j}’s. A generalization to the case of more than one communicating intermediate in a module, namely a vector (*x*_{j,1},…, *x*_{j,k_{j}}), is straightforward.
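For a toy illustration of this reduction, consider the following hypothetical linear network (Python/NumPy; the modules, rates, and parameter values are invented): each of two modules has one communicating variable and one hidden variable, the hidden variables are eliminated by solving *G*_{j}=0 in closed form, and the full and reduced systems are checked to share the same steady-state values of the communicating variables.

```python
import numpy as np

p1, p2 = 1.0, 0.5

# Full hypothetical network: communicating variables x1, x2 and one
# hidden variable per module.
#   x1' = -x1 + y1           y1' = -2*y1 + x2 + p1   (module 1)
#   x2' = -x2 + y2 + p2      y2' = -y2 + x1          (module 2)
# Steady state: solve the linear system in z = (x1, y1, x2, y2).
A_full = np.array([
    [-1.0,  1.0,  0.0,  0.0],   # x1 equation
    [ 0.0, -2.0,  1.0,  0.0],   # y1 equation
    [ 0.0,  0.0, -1.0,  1.0],   # x2 equation
    [ 1.0,  0.0,  0.0, -1.0],   # y2 equation
])
b_full = np.array([0.0, -p1, -p2, 0.0])
x1, y1, x2, y2 = np.linalg.solve(A_full, b_full)

# Reduced system: eliminate the hidden variables by solving G_j = 0,
# which gives y1 = (x2 + p1)/2 and y2 = x1; substituting yields
#   h1 = -x1 + (x2 + p1)/2
#   h2 = -x2 + x1 + p2
A_red = np.array([
    [-1.0,  0.5],
    [ 1.0, -1.0],
])
b_red = np.array([-p1 / 2.0, -p2])
x1_red, x2_red = np.linalg.solve(A_red, b_red)
```

Both computations give the same communicating steady state, even though the reduced system never mentions the hidden variables.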

### Using quasi-steady-state data

Recalling the discussion in the Introduction regarding the need for a method based on quasi-steady-state data, we next consider the following scenario. For any fixed variable, let us say the *i*th component *x*_{i} of *x*, we consider some time instant *t̄*_{i} at which *ẋ*_{i}(*t*) is zero. Under the same independence hypothesis as in the steady-state case, plus the non-degeneracy assumption that the second time derivative *ẍ*_{i}(*t̄*_{i}) is not zero (so that we have a true local minimum or local maximum, and not an inflection point), we show here that the MRA approach applies in exactly the same manner as in the steady-state case. Specifically, the *i*th row of the Jacobian of *f*, evaluated at the vector (*x̄*, *p̄*), is recovered up to a constant multiple, where *x̄*=*x*(*t̄*_{i}) is the full state *x* at time *t̄*_{i}. The main difference with the steady-state case is that different rows of the Jacobian of *f* are estimated at different pairs (*x̄*, *p̄*), since the considered times *t̄*_{i} at which each individual *ẋ*_{i}(*t*) vanishes are in general different for different indices *i*, and so the state *x̄* is different for different *i*’s.

We fix an index *i*∈{1,…, *n*}, and an initial condition *x*(0), and assume that the solution *x*(*t*) with this initial condition and a given parameter vector *p̄* has the property that, for some time *t̄*=*t̄*_{i}, we have both *ẋ*_{i}(*t̄*)=0 and *ẍ*_{i}(*t̄*)≠0. At the instant *t*=*t̄*_{i}, *x*_{i} achieves a local minimum or a local maximum as a function of *t*. We describe the reconstruction of the *i*th row of the Jacobian of *f*, that is, the gradient ∇*f*_{i}, where *f*_{i} is the *i*th coordinate of *f*, evaluated at *x*=*x̄* and *p*=*p̄*, where *x̄*=*x*(*t̄*).

To emphasize the dependence of the solution on the parameters [the initial condition *x*(0) will remain fixed], we will denote the solution of the differential equation *ẋ*=*f*(*x*, *p*) by *x*(*t*, *p*). The function *x*(*t*, *p*) is jointly continuously differentiable in *t* and *p*, if the vector field *f* is continuously differentiable (see e.g. [12], Appendix C). Note that, with this notation, the left-hand side of the differential equation can also be written as ∂*x*/∂*t*, and that *x*(*t̄*, *p̄*)=*x̄*.

We introduce the following function:

α(*t*, *p*) = (∂*x*_{i}/∂*t*)(*t*, *p*) = *f*_{i}(*x*(*t*, *p*), *p*).

Note that α(*t̄*, *p̄*)=0. Also,

(∂α/∂*t*)(*t*, *p*) = (∂²*x*_{i}/∂*t*²)(*t*, *p*).

The assumption that *ẍ*_{i}(*t̄*)≠0 when *p*=*p̄* means that (∂α/∂*t*)(*t̄*, *p̄*)≠0. Therefore, we may apply the implicit function theorem and conclude the existence of a mapping τ, defined on a neighbourhood of *p̄*, with the property that

α(τ(*p*), *p*) = 0

and τ(*p̄*)=*t̄* [and, in fact, *t*=τ(*p*) is the unique value of *t* near *t̄* such that (∂*x*_{i}/∂*t*)(*t*, *p*)=α(*t*, *p*)=0]. Finally, we define, also in a neighbourhood of *p̄*, the differentiable function

φ(*p*) = *x*(τ(*p*), *p*),

and note that φ(*p̄*)=*x̄*. Observe that, from the definition of α, we have that eqn (1) holds, exactly as in the steady-state case. From here, the reconstruction of ∇*f*_{i}(*x̄*, *p̄*) up to a constant multiple proceeds as in the steady-state case, again under the assumption that it is possible to apply *n*−1 independent parameter perturbations.

A noise analysis similar to that in the steady-state case can be done here. However, there are now many more potential sources of numerical and experimental error, since measurements at different times are involved. In addition, internal (thermal) noise may introduce additional error, since, in the quasi-steady-state case, the state probability distributions (solutions of the Chemical Master Equation) have not converged to steady state.

### Implementation: numerical approximation by finite differences

Of course, the sensitivities represented by the vectors *v*_{j} (entries of the matrix Σ, or Σ^{#} in the noisy case) cannot be directly obtained from typical experimental data. However, approximating the vectors *v*_{j} by finite differences, one has that ∇*f*_{i}(*x̄*, *p̄*) is approximately orthogonal to these differences as well. Explicitly, suppose that we approximate *v*_{j}=(∂φ/∂*p*_{j})(*p̄*) by:

*v*_{j} ≈ [φ(*p̄*+*h* *e*_{j}) − φ(*p̄*)]/*h*,

where *h* is small and where *e*_{j} is the vector having a one in the *j*th position and zeros elsewhere. Then, ∇*f*_{i}(*x̄*, *p̄*) is (approximately) orthogonal to the differences

φ(*p̄*+*h* *e*_{j}) − φ(*p̄*),

which form a set of *n*−1 linearly independent vectors (if *h* is small). A simple matrix inversion (after fixing an arbitrary value for one of the entries) allows the computation of ∇*f*_{i}(*x̄*, *p̄*). Observe that division by the potentially small number *h* is not required in performing these operations. In fact, no knowledge whatsoever about parameter values is needed by the algorithm. In the quasi-steady-state case, note that φ(*p̄*+*h* *e*_{j}) is the state *x*(*t*) at the time *t* at which *the particular coordinate x*_{i} achieves a local extremum value, if the parameters have been perturbed to *p*=*p̄*+*h* *e*_{j}. To be more precise, *t* is the unique time close to *t̄* such that *ẋ*_{i}(*t*)=0 when the parameter vector *p*=*p̄*+*h* *e*_{j} is being used. Theoretically, we must have *p*≈*p̄*, so *h* must be very small, but, in practice, quite large perturbations of *p* also work well.
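The quasi-steady-state recipe can be sketched end-to-end on a hypothetical two-variable system (Python/NumPy; the model, parameter values, and step sizes are invented, and the paper's example is not used here): simulate, locate the peak of *x*, repeat with a perturbed parameter that does not enter *f*_{1} directly, and check that the finite-difference vector is nearly orthogonal to the true gradient ∇*f*_{1}.

```python
import numpy as np

# Hypothetical two-variable system (invented for illustration):
#   x' = 1/(1 + y**2) - x     (f1: does not depend directly on p)
#   y' = p*x - y              (f2: the parameter p enters here only)
def f(state, p):
    x, y = state
    return np.array([1.0 / (1.0 + y**2) - x, p * x - y])

def simulate(p, dt=1e-3, t_end=6.0):
    """Fixed-step RK4 integration from the origin; returns trajectory."""
    steps = int(t_end / dt)
    traj = np.empty((steps + 1, 2))
    traj[0] = (0.0, 0.0)
    for k in range(steps):
        s = traj[k]
        k1 = f(s, p)
        k2 = f(s + 0.5 * dt * k1, p)
        k3 = f(s + 0.5 * dt * k2, p)
        k4 = f(s + dt * k3, p)
        traj[k + 1] = s + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return traj

def peak_state(p):
    """Full state at the time when the first coordinate x peaks."""
    traj = simulate(p)
    return traj[np.argmax(traj[:, 0])]

p_bar, h = 2.0, 0.1
phi_bar = peak_state(p_bar)                 # phi(p_bar)
diff = peak_state(p_bar + h) - phi_bar      # phi(p_bar + h*e_j) - phi(p_bar)

# True gradient of f1 at the unperturbed peak (unknown to the method).
x_bar, y_bar = phi_bar
grad_f1 = np.array([-1.0, -2.0 * y_bar / (1.0 + y_bar**2) ** 2])

# MRA prediction: grad_f1 is (approximately) orthogonal to diff.
cos_angle = abs(grad_f1 @ diff) / (
    np.linalg.norm(grad_f1) * np.linalg.norm(diff)
)
```

Note that `diff` is used without dividing by `h`, exactly as remarked above.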

## A simple numerical example

We illustrate the calculations by working out a very simple example, the following system (writing *x* instead of *x*_{1} and *y* instead of *x*_{2}, and *p* and *q* instead of *p*_{1} and *p*_{2}):
We take the initial state (0, 0) and reference parameters *p̄*=2 and *q̄*=1. These equations might model, for example, a simplified dynamics of a two-gene network, in which the first gene enhances the expression of the second gene, which in turn represses the rate of expression of the first one, there are constitutive rates of production of each gene, and both protein products decay at rate 3 s^{−1}. (For simplicity, we do not introduce separate variables for mRNA and protein concentrations.) Parameter *p* represents a promoter strength, and we assume that there is a way to perturb it (perhaps by duplication or sequence change). Parameter *q* affects the degradation of the first gene product.

Figure 4 shows (solid lines) the solution [*x*(*t*), *y*(*t*)] with initial state (0, 0), *p*=*p̄*=2 and *q*=*q̄*=1. By time *t*=2.5, the system has settled approximately to steady state, (*x*(2.5), *y*(2.5)) ≈ (1.45, 1.3). At this point, the true Jacobian, to be estimated, is approximately:
In order to estimate the first row of the Jacobian, which does not depend on the parameter *p*, we next perturb to *p* = 4 (keeping *q* = 1; see Figure 4). We now have (*x*(2.5), *y*(2.5))≈(1.16, 1.88). Thus we are looking for a vector orthogonal to (1.16, 1.88) − (1.45, 1.3), which provides an estimated first row of approximately (−3, −1.5). (We normalized the first entry to −3 merely in order to compare our result with the true gradient; the algorithm does not know the value “−3”. In practice, one might know that the first entry of the vector is negative, reflecting degradation or dilution effects, so the algorithm will give the correct sign for the second term, as well as its magnitude relative to the rate of degradation or dilution.) Observe that we have retrieved the correct sign for the second element, and the relative error is less than 20%, even though we have used a 100% perturbation. To estimate the second row of the Jacobian, one perturbs *q*, and the results are even better. For example, a 50% perturbation to *q* = 1.5 (keeping *p* = 2; see Figure 4) gives (*x*(2.5), *y*(2.5)) ≈ (1.09, 1.05), which leads to perfect recovery (to two decimal digits) of the second row.
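Since the example's defining equations are not reproduced above, the following sketch (Python/NumPy) repeats the same steady-state protocol on a different, hypothetical two-gene system in which, by construction, *q* enters only the first equation and *p* only the second; both Jacobian rows are then recovered from steady-state shifts, exactly as described for the paper's example.

```python
import numpy as np

# Hypothetical two-gene system (invented; the parameters play the same
# roles as in the text: q enters only f1, p enters only f2):
#   x' = q/(1 + y**2) - x
#   y' = p*x - y
# True Jacobian at the unperturbed steady state (0.5, 1.0) is
# [[-1, -0.5], [2, -1]].
def f(state, p, q):
    x, y = state
    return np.array([q / (1.0 + y**2) - x, p * x - y])

def steady_state(p, q, dt=1e-2, t_end=40.0):
    """Forward-Euler integration from the origin until convergence."""
    s = np.zeros(2)
    for _ in range(int(t_end / dt)):
        s = s + dt * f(s, p, q)
    return s

p_bar, q_bar = 2.0, 1.0
s0 = steady_state(p_bar, q_bar)          # approximately (0.5, 1.0)

# First row: f1 does not depend on p, so perturb p and take a vector
# orthogonal (in 2D) to the resulting steady-state shift.
d1 = steady_state(p_bar + 0.02, q_bar) - s0
row1 = np.array([-d1[1], d1[0]])
row1 = row1 / (-row1[0])                 # normalize first entry to -1

# Second row: f2 does not depend on q, so perturb q instead.
d2 = steady_state(p_bar, q_bar + 0.01) - s0
row2 = np.array([-d2[1], d2[0]])
row2 = row2 / (-row2[1])                 # normalize second entry to -1
```

With these small (1%) perturbations, `row1` comes out close to the true (−1, −0.5) and `row2` close to the true (2, −1); larger perturbations still give correct signs but, as in the paper's example, with growing relative error.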

Next we illustrate the quasi-steady-state version of the algorithm, using the same example. We study the equation for the first variable, which is the one that exhibits peaking behaviour. The solid lines in Figure 5 (and also in Figures 6 and 7) again show plots of the solution coordinates *x*(*t*) and *y*(*t*), but now zooming in on an initial interval where the peaks occur.

Let us pose the following problem: not knowing the above equations, estimate the relative strength of the second gene’s effect on the rate of expression of the first one. The only data to be used are the levels of both gene products [*x*(*t*) and *y*(*t*)] at the time when *x*(*t*) achieves its local maximum. We do assume known the fact that the parameter *p* affects *directly* only the rate of expression of the second gene, not the first. Observe that the maximum of *x* is attained at *t* ≈ 0.5275, and the values there are (approximately) *x*(*t*) = 1.6553 and *y*(*t*) = 1.0138. The gradient ∇*f*_{1} of *f*, evaluated at the unperturbed peak state (1.6553, 1.0138), has the true (but unknown to the algorithm) value:
Next, we perform the “experiment” in which *p* is up-perturbed by 25%. With the new parameter *p* = 2.5, we obtain plots as shown by the dotted lines in Figure 5. Now the maximum of *x* is attained at *t* ≈ 0.4268, and the values there are *x*(*t*) = 1.407 and *y*(*t*) = 1.3695. Letting δ = (1.407, 1.3695) − (1.6553, 1.0138), the unknown (to the algorithm) gradient ∇*f*_{1} is known to be (approximately) orthogonal to δ. Any vector perpendicular to δ must be a multiple of (−3, −2.3455). The relative error in our estimate is less than 5%.

Even larger perturbations may be performed. For example, a 50% perturbation from *p̄*=2 to *p*=3, provides the dashed lines in Figure 6. Now the maximum for *x* is attained at *t* ≈ 0.4658, and there *x*(*t*) = 1.5103 and *y*(*t*)=1.2073. The estimated gradient is now (−3, −2.2476), which gives a relative error of less than 9%. Finally, a 100% perturbation to *p*=4 provides the dashed lines in Figure 7. Now the maximum for *x* is attained at *t*≈0.4268, and there *x*(*t*)=1.4071 and *y*(*t*)=1.3695. The estimated gradient is now (−3, −2.0936), which gives a relative error of about 15%.

## Conclusions

The MRA method provides a tool for reconstructing network topology, and relative input strengths, from steady-state and quasi-steady-state data. Much future work remains to be done, especially concerning the analysis of noise.

## Summary

• Local interactions can be reverse engineered from global responses, so long as enough independent perturbations can be performed.

• The same method can be applied in a modular fashion.

• Measurement errors are dealt with by a total least squares procedure.

## Acknowledgments

This research was supported in part by NSF (National Science Foundation) grants DMS-0504557 and DMS-0614371, and an AFOSR (Air Force Office of Scientific Research) grant.

- © The Authors Journal compilation © 2008 Biochemical Society