Real-time in silico experiments on gene regulatory networks and surgery simulation on handheld devices

Simulation of all phenomena taking place in a surgical procedure is a formidable task that involves, when possible, the use of supercomputing facilities over long time periods. However, decision taking in the operating room needs for fast methods that provide an accurate response in real time. To this end, Model Order Reduction (MOR) techniques have emerged recently in the field of Computational Surgery to help alleviate this burden. In this paper, we review the basics of classical MOR and explain how a technique recently developed by the authors and coined as Proper Generalized Decomposition could make real-time feedback available with the use of simple devices like smartphones or tablets. Examples are given on the performance of the technique for problems at different scales of the surgical procedure, form gene regulatory networks to macroscopic soft tissue deformation and cutting.


Background
Some 15 years ago, Satava [1] proposed a taxonomy of virtual anatomy consisting of five different generations.The first generation is composed by systems representing accurately the geometry of the organs at a macroscopic level.The second generation would include an accurate description of the physical dynamics of the body.While it is still hard, more than a decade after, to find a real-time surgical simulator that incorporates accurate, state-of-the-art models for soft tissues at a continuum level, this taxonomy included three more generations.From the third to the fifth one, these virtual descriptions of the patient should include, respectively, accurate descriptions of physiology, microscopic anatomy (at a neurovascular level, for instance), and, finally, biochemical systems.
While many successful models exist for all these different levels of description, see for instance [2][3][4][5][6][7][8] among many others, they have not been yet fully incorporated into virtual reality simulators due to the impressive computing requirements that they involve.

Difficulties at a macroscopic level
These difficulties are of very different nature.If we talk, for instance, of systems devoted to train future surgeons' gestures such as cutting and suturing, the main difficulty comes from the highly non-linear constitutive equations of soft tissues, very often modeled as (possibly visco-) hyperelastic media [9].These non-linear equations must be solved under real-time constraints that reach 1 kHz of feedback response if we think of haptic devices, or 25 Hz if we need for visual feedback only.Currently, very few surgical training simulators at this level incorporate accurate models for tissue deformation.Among these, we can cite the works by Ourselin and Taylor [10] based on the use of explicit finite elements implemented on hardware (Graphic Processing Units, GPU).But in general, explicit algorithms lack of robustness for very long times of simulation.
Recently, a growing interest has been paid to investigate Model Order Reduction (MOR) techniques in this framework.MOR comprises a variety of techniques known under different names (Proper Orthogonal Decomposition, POD; Principal Component Analysis, PCA; Karhunen-Loeve transform; among others) and ubiquitous in almost every branch of applied sciences and engineering.After the pioneering works of Karhunen, Loeve and Lorenz [11][12][13], MOR techniques have been applied and re-discovered under different frameworks many times [14][15][16].
In essence, POD-based model order reduction is based (with the notable exception of [15,17]) on an a posteriori statistical treatment of existing solutions to complex problems that is used to construct an efficient (i.e., with very few degrees of freedom) basis to simulate problems slightly different to the original ones.While standard finite element techniques employ a basis of local, piece-wise polynomials to approximate the solution of a given problem (a very efficient choice when no information is at hand on the form of that solution), POD-based techniques employ global basis, specific for each particular problem.This basis is determined after constructing the correlation matrix of the results obtained by solving similar problems to the one at hand (the so-called snapshots of the system).These snapshots could be obtained, for instance, by simulating different points of contact between surgical tools and organ, as in [18,19].These snapshots allow to extract the basis to simulate, in a Galerkin framework, for instance, situations different to the original ones (a new point of contact, not considered initially, for instance).In [20], POD is employed to augment the range of stability of explicit finite element methods, which is another known property of the technique.
This approach presents, however, some major drawbacks.When dealing with highly non-linear tissues (which is very often the case), resulting equations must be consequently linearized.Employing standard Newton-Raphson schemes to solve these equations leads to the need of re-computing the tangent stiffness matrix of the system, which is a very time-consuming operation that eliminates many of the advantages of POD and renders the method useless.To overcome this, several techniques have been proposed.For instance, POD with interpolation [18], the so-called Empirical Interpolation Method [21] or its discrete counterpart [22] allow to overcome this difficulty.A different approach has been followed in some of our previous works [23,24], in which a Taylor series expansion is applied to the variables of interest (here, the displacement field) in order to obtain a sequence of problems, one for each order of the expansion, but all with the same tangent stiffness matrix.

Beating the curse of dimensionality
A completely different source of complexity arises in problems whose solution is defined in spaces of a high number of dimensions.For instance, in [25] and references therein, the very interesting problem of modeling and simulating vein graft is studied that combines the need for simulation not only at a macroscopic level but also at a gene regulatory network level.It is hypothesized that blood shearing forces modulates a specific gene regulatory network determining the adaptive response of the vein wall.
Simulating the behavior of gene regulatory networks is a formidable task for several reasons.At this level of description, only a few molecules (maybe dozens or hundreds) of each species involved in the regulation process is present, and this eliminates the possibility of considering the process as deterministic, as is done very often in most chemical applications.In this situation, the continuum approach itself is questioned, as justified clearly in the excellent review by Turner et al. [26] and references therein.Here, the concept of concentration of the species does not make sense [6,27].On the contrary, under some weak hypothesis (well-stirred mixture, fixed volume, and temperature), the system can be considered as Markovian and can be consequently modeled by the so-called Chemical Master Equation (CME), [28], which is in fact no more than a set of ordinary differential equations stating the conservation of the probability density function P in time: where P (z, t|z 0 , t 0 ) represents the probability of being at a state in which there are a number of molecules of each species stored in the vector z at time t when we started from a state z 0 at time t 0 .a j represents the propensity (i.e., the probability) of reaction j to occur, while v j represents the change in the number of molecules of each species if reaction j takes place.This change is given, of course, by the stoichiometry of the reaction at hand.What is challenging, however, in this set of equations is that they are defined in a state space which possess as many dimensions as the number of different species involved in the regulatory network.Under this framework, if we consider N different species, present at a number n of copies, the number of different possible states of the system is n N .This number can take the astronomical value of 10 6,000 if we consider some types of proteins, for instance, [28].This phenomenon is known as the curse of dimensionality in many branches of science.For instance, Nobel prize winner R. B. Laughlin said, when talking about this problem [29], that 'No computer existing, or that will ever exist, can break this barrier because it is a catastrophe of dimension'.
To overcome this difficulty, most of the authors employ Monte Carlo-like algorithms (the so-called stochastic simulation algorithm, SSA [28,30,31]).But Monte Carlo techniques need for as many as possible individual realizations of the problem that compromise its simple application in inverse identification, leading to excessive time-consuming simulations, together with great variance in the results.

Proper generalized decomposition at a glance
Dealing with the problem of the curse of dimensionality in a very different context, the authors presented in a previous work a technique that is now known under the name of Proper Generalized Decomposition (PGD) [32,33].Essentially, to avoid the exponentially growing complexity of the problem with the number of state space dimensions, the method approximates the variable of interest, say u, as a finite sum of separable functions: The reason for this particular choice motivated the method itself that is conceived as a greedy algorithm that computes one sum at a time and one product at a time, within a fixed point, alternating directions algorithm.This leads to a sequence of one-dimensional (low-dimensional, in general) problems, one for each function F i j that can be solved using your favorite technique (finite elements, finite volumes, finite differences, colocation, . ..).
If M nodes are used to discretize each coordinate, the total number of PGD unknowns is N × M × D instead of the M D degrees of freedom involved in standard mesh-based discretizations.Moreover, all numerical experiments carried out to date with the PGD show that the number of terms N required to obtain an accurate solution is not a function of the problem dimension D, but it rather depends on the regularity of the exact solution.The PGD thus avoids the exponential complexity with respect to the problem dimension.
The PGD technique can thus be seen as both a MOR technique, if we keep the number N of modes to a minimum and as an efficient weapon against the curse of dimensionality, since it proceeds by solving a sequence of one-dimensional problems of negligible computational cost.Note that by letting N grow, we will finally arrive at a solution of the same accuracy of the finite element one, for instance, once the number of terms in the basis, N is the same as the number of nodes of the finite element mesh.In many applications studied to date (see [34,35] and references therein), N is found to be as small as a few tens for usual symmetric differential operators, and the approximation converges towards the solution associated with the complete tensor product of the approximation bases considered in each spatial dimension (see [36] for a formal proof in the case of elliptic problems).
This was also the main motivation of the so-called radial loading approximation within the LArge Time Increment (LATIN) method by Ladeveze [37].It can be seen as a particular case of PGD approximation in which space-time separated representation is employed to solve non-linear structural mechanics problems.
On the other hand, PGD methods can be also seen as an efficient tool for highdimensional problems.This twofold characteristic of the method makes it specially appealing for the numerical solution of the type of problems mentioned in the 'Background' section.

Parametric problems as a tool for real-time simulation: an off-line/on-line strategy
As mentioned before, PGD can be seen both as a model order reduction technique and as an efficient solver for high-dimensional problems.But what actually interests us in the field of computational surgery is its ability to solve parametric problems in an unprecedented way.Indeed, in [38], a strategy was developed that sets out parametric problems in the form of high-dimensional ones, for which PGD is specially efficient.In words, if we seek for the solution of a parametric problem u (x, t, p 1 , . . ., p m ), the approach we follow is to consider the dependence of the solution on the parameters as if they were additional, non-physical coordinates where the solution takes place, just as new additional coordinates: and therefore look for a solution as a finite sum of separable functions of space, (possibly) time and parameters p 1 , . . ., p m .This possibility opens the door to establish a strategy based on two steps.First, a general, multi-dimensional solution to the parametric problem is computed off-line.This phase may lead to the need of supercomputing facilities, but the solution is computed once for life and can be efficiently stored in the form of a set of separated functions, as stated before.This phase thus leads to a sort of meta-model or response surface for the problem.It is noteworthy to mention that this response surface is obtained without the need of any prior computer experiment.It is computed on the fly and stored for life.It provides the solution to the problem for a combination of parameter values, taken from within a prescribed value interval.This response surface is efficiently stored as a file of nodal values for all the involved separated functions that are just multiplied in real time straightforwardly.
After this meta-model is obtained, a second phase of the method is executed on-line.In this phase, the meta-model is evaluated, not solved for, at very efficient feedback rates.Figure 1 sketches the basics of the developed method.This approach has reported to provide with feedback rates on the order of kilohertz running on a simple laptop.These results will be deeply analyzed in the following section.

Results and discussion
In order to show how the proposed methodology works, we consider here two distinct examples at two different scales.On one hand, we show how the technique works for the simulation of gene regulatory networks, even in the lack of knowledge about some parameters in the reactions.Secondly, we analyze how the multi-dimensional methodology proposed so far can be efficiently applied to the simulation of macro-scale problems such as liver palpation.

A PGD approach to gene regulatory network simulation
The PGD approach to the problem of efficiently simulating gene regulatory networks begins by assuming that the probability of being at a particular state z at time t can be approximated as a finite sum of separable functions, i.e.
where, as mentioned before, the variables z i represent the number of molecules of species i present at a given time instant.This particular choice of the form of the basis functions allows for an important reduction in the number of degrees of freedom of the problem, N × n nod × (D + 1) instead of (n nod ) D , where D is the number of dimensions of the state space and n nod the number of degrees of freedom of each one-dimensional grid established for each spatial dimension.For this to be useful, one has to assume that the probability is negligible outside some interval and therefore substitute the infinite domain by a subdomain [0, . . ., m − 1] D , m being the chosen limit number of molecules for any species in the simulation.A similar assumption is behind other methods in the literature, such as the Finite State Projection algorithm, for instance, [28].
Another important point to be highlighted is the presence of a function depending solely on time, F j t (t).This means that the algorithm is not incremental.Instead, it solves for the whole time history of the chemical species at each iteration of the method.If one then assumes that n terms of the sum given by Equation ( 4) are already known, and look for the n + 1-th term, by substituting Equation (5) into the CME, Eq. ( 1) gives a non-linear problem in F n+1 1 , . . ., F n+1 D , F n+1 t that is solved by means of a fixed point, alternating directions algorithm, see [39].
To show how this technique works, consider one of the simplest and most studied examples of gene regulatory networks, that of λ-phage.When a bacteriophage λ infects a cell, it either stays dormant or it reproduces until the cell dies.The resulting behavior depends crucially on two competing proteins that inhibit mutually each other, see a schematic representation in Figure 2. The so-called toggle switch is composed of a two-gene co-repressive network.For this case, the governing CME has the form [7] with A = A 1 + A 2 , two operators, one for each reaction in the system.The form of these operators, following [7] is and A 2 equivalent with z 1 and z 2 interchanged.We computed the solution for δ = 0.05, α = 1.0, γ = 1.0 and β = 0.4.The simulation started from a non-physiological state in which both proteins showed a very high probability around z 1 = z 2 = 15.Despite this initial state, after t = 100 s (Figure 3a), one has a case where both average values of both proteins and small levels of the one protein combined with higher level of the other protein are quite likely, and this remains the case for the stationary distribution as well [7], Figure 3b.
But what should be highlighted about this technique is not only its ability to solve gene regulatory network simulations in a reasonable amount of time, which is extended easily to problems with some 20 different species involved, see [39], for instance.Very often, there is an important lack of experimental data concerning constants of the reactions (propensities), or simply we want to adopt the meta-modeling strategy introduced before.In that case, it is very convenient to set up the problem in parametric form and to convert it in a multi-dimensional one.Considering parameters as new state space dimensions opens the door to designing in silico experiments in which one readily (in real time) observes the behavior of the system under different conditions.The transient solution for a particular value of the propensity can then be computed by restricting the general solution to each particular value of this extra-coordinate.Even if the dimensionality of the problems increases even more than that demanded by the CME itself, this does not constitute major difficulty for PGD techniques that have easily solved problems in dimension 100 and more [35].
To illustrate this feature, we have simulated, for the ease of exposition, a cascade of only two terms.The operator related to a cascade is of the form A = A 1 + A 2 , with A 1 of the same form of the previous example and operator A 2 takes the form where se 2 is the second standard basis of R 2 .In order to check the proposed technique, and for the ease of illustration, we have considered a cascade of only two terms, with the parameter δ as an unknown.Note that the solution (obtained in one execution of the program), see Figure 4, provides the solution for different values of δ that reproduces the ones in the literature [7].These examples run (off-line phase) on some minutes in a laptop, while they can be evaluated (on-line phase, parametric phase) at kilohertz rates with no special hardware requirements.These examples show how an efficient simulation of gene regulatory networks can be incorporated into surgical simulators even within the operating room.In the new section, we show how a similar parametric, multi-dimensional strategy can be developed for macroscopic descriptions of surgery.

Simulation of liver palpation
As a representative example of the performance of the proposed technique at a macroscopic, continuum level, we have chosen a classical example of liver palpation during hepatic endoscopic resections [3].For a detailed description of how resection could be simulated under MOR settings, we refer the interested reader to our former work [19].For the sake of simplicity, we focus on the simulation of the interaction of surgical tools and organ, without the presence of cuts.
The problem of determining the response of an organ to the load transmitted by the contact with a surgical tool could be formulated as to determine the displacement at any point of the model, u(x, y, z), for any load position s and for any force vector orientation and module, t, thus rendering a problem defined in the physical space (R 3 ), plus a six-dimensional state space (R 6 ).Following the previous developments, we propose an iterative scheme that works by finding the n + 1-th term of the separated approximation in the form: where we have assumed, for the sake of simplicity on the exposition, that the load is unitary and directed along the z axis (thus, no dependence on t is considered here).The term u j refers to the j-th component of the displacement vector, j = 1, 2, 3 and where R(x) and S(s) are the sought functions that improve the approximation.Again, this iterative scheme is solved by introducing approximation given by Equation ( 8) into the weak form of the problem.This renders a non-linear problem on R and S that is solved, in our implementation, by using a fixed point, alternating directions algorithm.At each direction of the fixed point algorithm, we face again a non-linear problem, due to the non-linear constitutive equations of the liver tissue.In [40,41], two distinct approaches have been pursued, namely an explicit one and a combination of PGD and asymptotic expansions on the variables of interest.The interested reader is committed to read these references for more details on the implementation.
Although the literature on the mechanical properties of the liver is not very detailed, we have assumed a Kirchhoff-Saint Venant material with Young's modulus of 160 kPa, and a Poisson coefficient of 0.48, thus nearly incompressible [2].Model's solution was composed by a total of N = 167 functional pairs X k j (x) • Y k j (s) (see Equation ( 8)).The third component (thus j = 3) of the first six modes X k 3 (x) is depicted in Figure 5.The same is done in Figure 6 for functions Y, although in this case they are defined only on the boundary of the domain.

Performance of the technique
Both problems introduced before can be solved off-line in standard computing facilities in reasonable amounts of time.In our case, the solution to the problem of liver palpation, for instance, was solved in a workstation equipped with two Nehalem cores at 2.33 Ghz, 24 Gb RAM and 64 bits.The simulations took some 20 h to complete.The solution provided by the method agrees well with reference FE solutions obtained employing full-Newton-Raphson iterative schemes.But, notably, the computed solution can be stored in a so compact form that the on-line evaluation of the parametric solution (meta-model) is possible on handheld devices such as smartphones and tablets.For instance, for Android-operated devices, an application has been developed (we call it iPGD and is freely downloadable from [42]) that runs the model on a Motorola Xoom tablet running Android 3.0 without problems (only the surface of the model is represented for simplicity, given the limitations of the Android OS), see Figure 7.The 25-Hz feedback rate necessary for continuous visual perception is achieved without problems.
For more sophisticated requirements, such as those dictated by haptic peripherals, a simple laptop (in our case a MacBook pro running MAC OSX 10.7.4,equipped with 4-Gb RAM and an Intel core i7 processor at 2.66 GHz) is enough to achieve this performance.Even performances higher than 500 Hz have been reported for some implementations [43] Additional file 1.

Conclusions
In this paper, we have reviewed a new methodology for Model Order Reduction in the context of computational surgery proposed by the authors in a series of previous papers.This new methodology, coined as Proper Generalized Decomposition, improves existing techniques in various ways.Firstly, it enables to incorporate state-of-the-art constitutive models for soft tissues in systems requiring real-time performance (even reaching feedback rates in the order of 1 kHz).This performance can be achieved by employing an off-line/on-line strategy in which a multi-dimensional surface response or meta-model is computed, without the need of previous computer experiments.This meta-model is then evaluated or particularized at real-time rates very efficiently.This is possible due to the special form of the approximation of the solution, in the form of a finite sum of separable functions.Thus, this meta-model is stored in memory as a file containing a series of vector, with great savings of memory.
Another fundamental issue regarding this method is that it solves very efficiently highdimensional problems.Gene regulatory networks modeled in a stochastic differential equation framework are a paradigm of such a high-dimensional problem.Parameters, such as unknown properties of the system, could also be considered advantageously as new state space dimensions of the problem, thus rendering an ever higher-dimensional problem, but that is still solved efficiently by the proposed technique.
The result is an appealing technique that allows to solve at unprecedented feedback rates state-of-the-art models for multiscale computational surgery.After an academic validation, our current effort of research is directed towards clinical validation of this approach.

Figure 2 Figure 3
Figure 2 Schematic mechanism of the toggle switch.Schematic mechanism of the toggle switch [6].The constitutive P L promoter drives the expression of the lacI gene, which produces the lac repressor tetramer.The lac repressor tetramer binds the lac operator sites adjacent to the Ptrc − 2 promoter, thereby blocking transcription of cI.The constitutive Ptrc − 2 promoter drives the expression of the cI gene, which produces the λ-repressor dimer.The λ-repressor dimer cooperatively binds to the operator sites native to the P L promoter, which prevents transcription of lacI.

Figure 4
Figure 4 Solution of the cascade problem with unknown parameter δ.Solution for the cascade problem with unknown propensities.Probability distribution function (top row) and marginal probability distribution function of each species (bottom row) at time t = 0 s, t = 30 s, and t = 600 s (approximately steady state).The left column presents the results for a value δ = 0.01, while the central one is for δ = 0.025 and the left one for δ = 0.045.Note that all the results are obtained in one execution of the program.The four-dimensional hypercube containing the solution space, whose dimensions are the concentrations of the two proteins, the value of δ and time, is then cut by the hyperplanes defined by the different values of δ and time to give these plots.

Figure 5
Figure 5 Spatial modes of the liver solution.Six first functions X k 3 (x), k = 1, . . .6, for the simulation of the liver.

Figure 6
Figure 6 Load-dependent modes of the liver solution.Six first functions Y k 3 (s), k = 1, . . .6, for the simulation of the liver.Note that, in this case, functions Y k (s) are defined on the boundary of the liver only.

Figure 7
Figure 7 Appearance of the proposed method running in a tablet under Android.An example of the implementation of the iPGD application for the liver problem in a Motorola Xoom tablet.