Introduction
I had made some attempts to analyse the dynamics of co-existence in some ODE models of imaginary autocatalysts here. A first attempt at a generative metabolism was made here, and submitted as a short paper to AlifeX. The former approach is beset by the fact that the range of randomly generated autocatalytic chemical network ecologies is vast, and a case by case dynamical systems analysis would be exceedingly time-consuming. Also, it says nothing about the structural stability of chemical organizations, e.g their resiliance to invaders, and to internally generated cheaters. A more tolerable approach may be to attempt to identify the stability of various chemical ecological motifs within the framework of a generative chemistry community assembly type model. An even more tolerable approach would be to see whether a chemical generative system can undertake MEP (maximimization of entropy production) effectively, or in some sense self-organize to "improve its conditions for existence". Notice how this is related to the Gaia Hypothesis. An excellent bibliography is available from James Dyke's website at Sussex.
If we can demonstrate MEP or some Gaian property [such as stability to a range of external forcers, total extent of re-cycling matter, stability to random culls], in a chemical system phenomologically, we need know nothing at first about the dynamics that actually allows this to happen. But whatever they are, we will have discovered that these dynamics must be interesting, and then we can begin to examine the details of the chemical organization. Thus, I hope to detect interesting chemical organizations in my model as Lovelock intended to detect life on other planets, by finding disequilibrium of chemical species, e.g. Mars and Venus show equilibrium; CO2 being very common, whereas Earth shows massive disequilibrium; O2 and CH4 existing at orders of magnitude higher concentration than would be expected. Not only do I hope to find the disequilibrium, but perhaps self-regulation of the chemical system against external forcing, just as for example in the maintenance of O2 concentration, and temperature by the biosphere.
Below I will review two seperate research programs that I hope will inspire my further experiments in chemical evolution.
1 Research on Gaia Theory and the Maximization of Entropy Production.
2. Ecological Community Iteration Models.
- Community iteration models of food-webs. E.g. (Pawer et al 2005)
Gaia Homeostasis/Optimization and its Mechanisms?
Gaia theory is a loosely connected set the hypotheses trying to account for the fact that the biosphere recursively generates functional constraints that benefit itself, such that its chances of survival are increased in some sense, or that its extent is increased in some sense. One such explanandum is the homeostasis of earth temperature despite the slowly increasing solor luminosity. In trying to formalize this hypotheiss Kleidon (2002) argues here, that "life has a strong tendency to affect Earth in a way which enhances the overall benefit (defined as carbon uptake)." He claims that life is associated with increased gross primary production which is defined as the "long-term mean global gross uptake of carbon by the biota"? Evidence that it is so associated is that moisture circulation and hence GPP is enhanced by vegetation, in simulation. and that Life increases the rate of phospherous recycling (cycling ratio, Volk 1998) by 48 times compared to the rate achieved by abiotic rock weathering, and hence allows higher GPP. The former cannot easily be explained by natural selection since the trait (ability to increase moisture circulation) confers an advantage to those plants that posses the trait, and to those that do not. Selection of an trait occurs only if it confers a differential fitness beenfit to that organism which posesses it (Kirchner, 2002)). The latter may however be explained by natural selection at the individual level, since the cause of global GPP increasing could be due to the fact the entities that are better able to undergo natural selection are those that increase GPP. However, this does not rule out that there is some other mechanism tending to increase GPP. Such a mechanism may be particualrly 'useful' before natural selection, or even to explain the origin of natural selection.
Some versions of Gaia do require natural selection of lower level units, others claim to require only random generation of lower level units under some plausible constraints. We consider some various proposals below.
Daisyworld was a demonstration of a 'mechanism' for explaining the 'Homeostatic Gaia' hypotheiss. So I will begin by modeling Daisyworld, and discussing the objections to its validity as an explanation of how biosphere self-regulation could arise and exist, e.g. that it may assume unreasonable correspondences between features that benefit the organism and features than benefit the biosphere as a whole ( Kirchner, 2002). What features of daisyworld are essential for homeostasis of a biospheric variable and which of these are stable in the light of underlying natural selection of the entities responsible for control?
Gaia Homeostasis by Rein Control
1. The simplest working daisyworld model shows that homeostasis of a biospheric variable can be by Rein control. Inman Harvey (2005) has argued that 'rein control' can account for self-regulation in Daisyworld. Inman proposed that it is sufficient to have a random collection of entities whose response functions (i.e. the extent of the effect on an environmental variable which "we" wish to control within a range) are hat functions of the environmental variable itself, and these hat functions are distributed with their maxima along the range along which we wish to control that environmental variable. (It is sufficient for the hat functions of black and white daisies to be overlapping, i.e. they can have the same preferred temperature and Inman's mechanism still works). The direction of the effect on the environmental variable can be randomly assigned. Some entities may heat the environment up, some may cool it down; in the context of the daisyworld model this means you can randomly assign black and white colors (albido's) to daisies; as long as at least one response function exists that shifts the control variable in the opposite direction to the direction of perturbation, as the extent of that response increases, i.e. entities must be able to pull the control variable back towards its original value as it increases the extent of response. Organism growth rates can be considered hat functions of various environmental variables and these hat functions will be distributed along a range. Without proposing natural selection at the level of the biosphere, i.e. assuming only a random set of effects of each organism on the environmental variable itself, the above mechanism might explain regulation of that environmental variable. But if you add in natural selection, things get more complex; then entities that are active (i.e. have steady state high responses) are more likely to produce new species, and hence to shift the viable range towards the status-environmental-quo, and perhaps to withdraw from mediating self-regulation.
Even if the Gaia mechanism boils down to 'merely' a mechanism that explains the origin and stability of cooperation amongst replicators undergoing natural selection, it cannot appeal to the many group selection models that allow such cooperation to evolve due to between group level selection, since the Earth cannot experience such selection. Without all the benefits of group selection, the Gaian mechanism must be stable against all the same problems that the evolution of cooperation faces, in systems where group selection is possible. This is a tall order, and what makes the possibility of an alternative self-organizing mechanism for order so appealing.
Ignoring stability against natural selection for the moment, Inman's mechanism appears to explain some homeostasis in gaia without the need for natural selection to create the mechanism in the first place. This is perhaps half a step in the right direction, and is tailoured to some past concerns.
[How might entropy come in here? In reality it is species (growth) that is best related to entropy production, then this system has the effect of maximizing the rate of entropy production by the system, since it maximizes the extent to which organism response functions are active over a range of perturbations. However, the rein control mechanism does not explain the means of creation of this entropy maximizing system. It still requires natural selection or some designer to have created these organisms with hat functions in the first place, even if the creator did not care about the direction of response. So, some selection or design or further self-organization is required at the lower level, to construct entities that grow at some optimum, and have some sort of effect. These low level entities, the components of the rein control explanation, require energy. I seek an understanding of the means of origination of these lower level entities. Can rein control arise spontaneously? How can hat functions arise in chemical systems? How in chemical systems can the loose coupling (in both directions of influence) to a control variable arise, and can such processes explain the movement of a chemical system away from equilibrium, and towards self-regulation? ]
I have modeled Inman's Daisyworld here, and I will use this model as a basis to assess the validity of other claims about Daisyworld.
Inman's Assumptions.
i. There are two types of replicators (daisies), black and white, not in competition for resources.
ii. The steady state population denisty (Db and Dw, [range 0,1]) of daisies depends on their local temperature (Tb and Tw), as a hat function (piecewise linear). It is reasonably assumed that the daisies reach this steady state value very rapidly for any fixed bed temperature.
iii. Black daisies have low albedo, i.e. heat themselves up in proportion to their abundence, and thereby heat the planet up. Vice versa for white daisies.
iv. The temperature of each bed, Tb and Tw, is influenced by i. external forcing of temperature, ii. daisy response that modifies external forcing either reducing or increasing its extent, iii. leakage resulting in equilibration of temperature between daisy beds. Let S be the temperature of the sun, T the temperature of one of the beds, and α, the albedo of the bed. Then the heat flow from the sun to the bed is given by (1 - α )(S - T), i.e. if albedo is 1, there is complete reflectivity and no heat is absorbed. Heat is also dissipated from the bed at rate T into space. The daisies cover a proportion of the bed D determined by the hat function H(T), [D = H(T)] and have an influence on the temperature that is linear to their proportion, modified by a constant u, (+ve for black daisies, and -'ve for white daisies). The resulting equation for dT/dt is..

T does not change (is at steady state) when the following condition is met..
![]()
The number of daisies D must by definition be given by the hat function as a function of temperature H(T), and so we can plot this on a graph of D against T. This gives the nullcline if you like for D, it is the value at which D does not change for a given value of T. Now we want to have a plot of the nullcline for T, which is given by the above equation. These are the values of steady state D and T where T does not change. Where these null-clines cross then there is no change in either D or T and these are the fixed points. We see that there is an imperfect pitchfork bifurcation as S is increased. with the attractor at zero D splitting into two attractors when the line meets the hat at a tangent (cusp point) and then moves through the hat. Let us look at the stability of these attractors.

The attractor at r is stable, whilst the attractor at p is also stable. The attractor in the middle is unstable.The next page considers the case where u is small compared to the initial zone of viability.

As pointed out by Jon Rowe, there is a minumum value of u, relative to the initial zone of viability, below which there is no benefit from feedback, i.e below which the zone of viability is not increased (but not decreased either). If this rein control is to have efficacy in real settings, then u must be greater than R(2-α)/2 . For u > R(2-α)/2 you can see that the zone of viability is increased. The species can survive at external forcing S less than would be normally endurable, (remember u is +'ve here, we are considering black daisies that heat up the world), since daisies heat up the world.
Inman then simulated the system with two daisybeds, using the techinques described below. Thanks to Inman for providing me with his code to play with. There are some apparent differences between the above model and the simulated model. First, there is no explicit u, and no leakage into space. This introduces some differences.
Experiments
1. Control Using only One Kind of Daisy.
Consider the control case for just one daisy bed. We do this by switching the albido of the black daisies to 0.5, the same as the bare earth upon which they rest. We expect there to be homeostasis only in one direction (mediated be white daisy rein control). For the parameters, Preferred Temp (of both B and W daisies!) = 100.0, Width of Witches Hat = 10.0, Surface Albido = 0.5, Black Albido = 0.5, White Albido = 0, Delta = 0.01, L = 0.0, Db_initial = Dw_initial = 0.05. We model two quite isolated beds; the black bed having no capacity for species growth to influence the environmental variable, T, whatsoever. These are two completely independent systems.

Several points of note:
1. Black daisies (proportion * 100 shown in blue) have a range of viability over 40 units of S, whereas the white daisies have a slightly larger range of viability at higher temperatures, since its bed reduces its bed temperature by decreasing light absorption as temperature increases.
2. The black daisies exert no control over the temperature of their bed, which increases linearly. The white bed on the other hand is controlled at a temperature less than the optimal for white daisy growth. This is because at lower temperatures white daisies actually detrimentally effect their capacity to grow by preventing their bed from heating up. It just so happens that this disadvantage becomes an advantage at the other extreme of temperature.
3. The black daisies on the other hand can grow to their full potential, i.e. to even occupy the whole bed when the bed temperature equals their preferred temperature. Why don't the white daisies ever manage to bring their bed temperature to their optimal temperature so that they can also obtain Dw = 1? It is only when the initial condition of W is above the fixed point q, that fixed point r can be reached, otherwise fixed point p is reached. In-fact due to rein control there are two basins of attraction seperated by the unstable fixed point that is created at the imperfect pitchfork bifurcation occuring at the point of intersection with the hat function's peak. This means that the steady state value of W depends on initial conditions. This is best visualized by considering not only the steady state resulting from merely one initial value of W, but by plotting the steady state arising from many different initial parameters. One way to vizualize this is to record values of [W*] for many different starting conditions at each value of S. This is shown below.

Superimposed on the above graph is a plot of the steady states of [W*] obtained for different starting conditions from W = 0, to W = 1. When W is capable of starting > r, that it is capable of having an infinite zone of viability, as long as temperature is above the a minimum value, since complete control of any temperature increasing perturbation is achievable because W can always increase! It more realistic to assume lower Dw and Db as initial conditions, and less extreme albido's.
2. Two Daisy Beds with Maximum Conductance.
When there is maximum leakage resulting in perfect equilibrium of environement between the daisies, Inman found no homeostasis. Using the parameters, Preferred Temp (of both B and W daisies!) = 100.0, Width of Witches Hat = 10.0, Surface Albido = 0.5, Black Albido = 0, White Albido = 0, Delta = 0.01, L = 1.0, Db_initial = Dw_initial = 0.05, you can see this below.

With maximum leakage, even though the species can alter the aldebo of their beds, this does not result in an increased viable range. This is because since both species have the same hat function, the net albedo must be 0.5.
3. Two Daisy Beds with Maximum Conductance.
When there is no leakage, i.e. completely indpendent daisybeds, there is a linear sum of the behaviours observed in the single bed. Using the parameters, Preferred Temp (of both B and W daisies!) = 100.0, Width of Witches Hat = 10.0, Surface Albido = 0.5, Black Albido = 0, White Albido = 0, Delta = 0.01, L = 0.0, Db_initial = Dw_initial = 0.05, you can see this below.

Note that the black daisies actually have a slightly higher initial peak!? Why is this? It is useful to consider a larger initial starting concentration of daisies, since this means that the steady state values of positive daisy concentration are possible over a wider range of luminosities (S). See below for the same experiment as above but with initial conditions, Dw = 0.5 and Db = 0.5. There the asymmetry is even more pronounced! What accounts for this asymmetry?

4. Two Daisy Beds with Intermediate Conductance.

With L = 0.5,and the above initial conditions we see that the zone of each daisies viability has actually been reduced, but that the zone of co-existence has been increased (see central region stretches from 160-240 S, whereas previously it stretched from only 180- 220 S.
5. Two Daisy Beds with Intermediate Conductance and different preferred temperatures.
Next, Inman experimented with different preferred temperatures for white and black daisies. I consider a range of these cases here.
i. Top Left: White prefers cold temperatures and Black prefers hot temperatures. There is no co-existence of daisies. Given a 80 degree defference in temperature preference. Each daisy controls its own beds temperature to some degree, and also influences the other bed temperature due to 0.5 leakage. So, the white daisy that prefers cold temperatures actually cools its own bed down, but also cools the bed of the other species down. However, it does not have a harmful (or any) effect on the other species since it has died out by the time the black daisy could exist.
ii. Rop Right: White still prefers cold temperatures and Black still prefers hot temperatures, but this time their preferred temperatures are closer together, differing by 40. White daisies start to grow earlier, and reduce the temperature of both beds. At a certain S, black daisies can exist, and this results in a sudden increase in the concentration of both daisies. Co-existence is maintained for a range that is maximum when the preferred temperatures are equal. (Bottom Left).

Bottom Right: If we now consider the pathological case where white daisies prefer warm temperatures but cool the beds, and black daisies prefer cool temperatures but warm the beds. Even given the same difference in preferred temperatures as Top Right there is no co-existence because black daisies have already died out at the temperature at which white daisies can have positiv estable states. The explanation for this is as follows. Previously, the white daisies were managing to keep themseles alive at higher S, due to the negative feedback their growth had on their local temperature. This meant that they were able to remain in existence until black daisies could appear. However, in the bottom right, since black daisies have a detrimental effect on themselves at higher S, they died out even before white daisies could appear.
6. Two Daisy Beds with Intermediate Conductance But Only One Species (WHITE) capable of Influencing Temperature.
I wanted to know, given intermediate leakage whether co-existence of both species could be enhanced given only one rein of the rein control mechanism. It turns out it can. The top graph below shows the case where only white daisies have albido = 1.0, but black daisies have albido 0.5, i.e. have no influence on any other variable. You can see that compared to the bottom graph in which both species have albido 0.5, that rein control by white daisies actually increases the zone of viability of the black daisies, at higher S, by reducing the temperature of the black daisy bed too.

In fact the graph below is I think quite revealing. With L = 0.9, and only the white species having albido 1.0, (black albido = 0.5), the range of co-existence is increased even more than when black daisies have any albido at all! When L = 1.0, the species have overlapping populations, with black daisies being able to life just as well as white daisies do. In the L = 0.9 case, the white daisies do all the work, but the black daisies reep all the benefit.


The above graph shows different L values for one active and one passive species, compared against the behaviour of two active species with L = 0.5.
The above graph suggests that if species share the same preferred range of growth, that if just one species influences the variable to keep it within that range of growth, and this regulated variable is transmitted to the passive species, then co-existence can be over a wider range than in the dual-rein control case in which species have opposing influences on the environmental variable. At high conductance, the passive species does much better than the non-passive species at all but the highest temperatures, i.e. just near where the system is about to fail completely.
Natural Selection and Daisyworld
One of the criticisms of Daisyworld comes from Kirchner, 2002.
"When Daisyworld is cool, the black daisies are not favored by natural selection because they warm the environment. Instead, they are favored by natural selection because they warm themselves, and therefore thrive better in cool temperatures than the white daisies do. To the extent that they also warm the environment, and therefore make it better for white daisies and worse for themselves, they diminish (not enhance) their evolutionary advantage. In competition with white daisies, black daisies would have a greater evolutionary advantage if they had no effect on their environment at all, because the planet would remain cooler and thus more favorable for black daisies. The black daisies could increase their evolutionary advantage still further if, while warming themselves, the cooled the environment, thus suppressing competition from the white daisies. If the black daisies only warmed the environment, and thus warmed themselves and the white daisies equally, they would never be favored by natural selection. Thus the Daisyworld model behaves the way it does specifically because it assumes that a given trait (here, daisy color) will affect the individual and its environment in the same way (Lenton, 1998), with a bigger benefit to the individual than to the environment as a whole."
These sorts of objections have been considered extensively by Lenton and Lovelock here. Natural Selection in Daisyworld : Tim Lenton (1999) is keen to stress the necessity of stability against natural selection at the organism level to ground an account for planetary self-regulation, writing "it seems that natural selection can form an integral part of planetary self-regulation and, where destabilizing effects arise, they may be less likely than stabilizing effects to attain global significance or persist.". Lenton for example has an experiment where he starts the would off with grey daisies and finds the evolution of daisies with, different albedos, (increasing reflectivity) as temperature increases.
These are interesting dynamics, but not of direct importance to our quest for they all depend on the existence of natural selection in the first place. I am interested in processes that can 'spark' the origin of life in the absence of natural selection. Self-organizing processes that can explain the origin of entities that are at disequilibrium, and eventually capable of natural selection. The daisyworld paradigm is really about Homeostatic Gaia. A fascination with how temperature regulation is possible given cheaters will only entangle us in already pregnent ecological territory. What needs to be done is to update the Daisyworld paradigm with a model that takes seriously the fact that change requires energy, i.e. that to have an entity capable of mutation, one requires energy, and the capacity to robustly obtain energy over long periods of time relative to ones capacity for change.
The rein control explanation still requires entities with hat functions, and assumes evolution created such entities. Something more seems to be required for a self-organizing principle that can be of help for the origin of life, and chemical evolution. Also, there are many organizations of chance feedbacks that can result in homeostasis of something, but what we seek is a generative mechanism that can result in the origin of something so complex as natural selection.
Maximization of Entropy Production
The application of entropy (and energy) to explaining life has a long history (Boltzmann, 1986) (Schrodinger, 1944) which should be closely examined. Kleidon makes an excellent review (Kleidon, 2004). For example Lotka (1922), (Lotka (1922,2) saw fit to give natural selection the causal role of maximizing energy flux through the system, "natural selection tends to make the energy flux through the system a maximum, so far as compatible with the constraints to which the system is subject.", "namely so as to increase the total mass of the system, the rate of circulation of mass through the system, and the total energy flux through the system." Lotka rightly called his first sentence into question, "In particular, it remains to be established just what is the significance of the phrase "compatible with the constraints" which, in the presentation here given, modifies the maximum principle enunciated." As yet, I need the same question answered w.r.t. MEP.More recently, ecosystems have been thoroughly considered from an optimizing thermodynamic perspective, for example the Maximum Power Principle (Odum, 1969, 1988. Odum and Odum, 1981) states that " During self organization, system designs develop and prevail that maximize power intake, energy transformation, and those uses that reinforce production and efficiency." (H.T. Odum, 1995). More recently, considerable work is being done on MEP, e.g. Lorenz (2002), Lesins (1991). Schwartzman, (1999), Scheider and Kay (1994). Ulanowicz and Hannon (1987).The idea is that systems with sufficient degrees of freedom re-configure themselves so as to maximize the rate at which they produce entropy (Dewer, 2003). An example is the convective atmosphere of the earth which maximizes the rate of entropy production during conduction of heat from the equator to the poles. How can we model this in a computational system, specifically in a model of generative chemistry. How can we formalize the notion of entropy in a computer model? It is obviously necessary to define entropy in statistical terms.
What is Entropy?
A good introduction and links to sites is available at Wikipedia. The classical thermodynamic definition of entropy (S) refers to the quantity of (unusable) heat produced at a given temperature. So, if a system produces SxT (where T = the temperature of the surroundings) then this is the quantity of unusable heat produced. It is a measure of the amount of energy in a physical system at a specified temperature that cannot be used to do thermodynamic work (Wikipedia definition), or in other words the amount of unusable energy, or the extent of dispersal (delocalization) of energy in the system.
What is energy? Energy is the capacity to do work. What does this mean?
This is not a notion that is very pleasing to the computer scientist, since how is one to model unusable dispersed energy, and heat, in an abstract model of generative chemistry? The statistical thermodynamic notion of entropy at first glance seems as though it will be more useful. The entropy of a macrostate is related to the number of microstates that make up the macrostate. "The statistical mechanical entropy is then proportional to the minimum number of bits of information required to determine the microstate, given that you know the macrostate." (See maxEnt Entropy page). This relates to Boltzman's notion that " entropy is proportional to the logarithm of the number of microstates an ideal gas could occupy."
1. Jaynes forumation of Information theory.
Ecological Community Iteration Models.
Community assembly models are analogous to my chemical evolution models. They differ in some respects, e.g. a finite pool of invaders is assumed, often there is an explicit notion of stability based on local asymototic stable states, or permenance. Also, rules exist that constrain the types of interaction (LV coefficient distributions) that can occur between species, for example size constraints or time constraints are applied to determine what species can predate upon another species. What sorts of observations are made in such experiments?
CHEMICAL NETWORK PROPERTIES.
Dynamic Properties.
1. Abundance-frequency distributions (see Chemicals.data) (From Pawer et al) i. Shannon entropy measure (Equation 1, see shannonEntropy.data). ii. Skew of the distribution, e.g. positive if most species are rare (Equation 2, see skew.data).
2. Kinetic and Thermodynamic distributions.
i). Distribution of kf values and heats released in reactions, (see kfDistribution.data, heatDistribution.data).
ii). Distribution of free energies of species over evolutionary time. (see freeEnergyDistribution.data).
iii). Interaction strength distribution. An interaction strength for a species I define as the sum of propensities influencing that species at equilibrium. For each product, I calculate the chemical force pushing matter into that product at equilibrium, i.e. the propensities of all reactions producing that product. (see interactionStrength.data).
iv). Mean interaction strength, (Equation 3 from Pawer), is the mean of this value. (see meanInteractionStrength.data).
v). (Storage). Total Internal Free Energy of the system, i.e. the sum of species free energy X species abundence over all species in the system. (4th column of heat.data).
vi). Energy flux. i. The rate of light being absorbed by the network. Since each light absorbing reaction absorbs the same amount of energy in my model, this value is just the sum of the steady state concentrations of all light absorbing species. (3rd column of heat.data).
Structural Properties.
4. Degree distribution of network, i.e.
i). The number of species that a species reacts with (catagorized seperaely for light absorbing and heat producing reactions.). i.i average node degree ( <k> = 2 x M/n (M = max possible connections and n = actual number of connections)) , i.i.i) Average length of shortest path between two nodes <L>. i.ii). the skewness , i.iii). cumulative degree distribution. Does the degree distribution follow a power law as in scale -free networks?
Compare these results with random graphs, that have Average path length <L> and clustering coefficient <C> as below, (From Benko, 2002)

where <k> = 2 M/n. Are our networks generally sparce? How to L and C relate to the values of L and C for random graphs? How does L and C vary with chemical network size?
ii). Connectance (<k>) : Number of connections divided by max number of possible connections. (Equation 4 from Pawer).
iii). Geodesic paths and average path length: Gij is the shortest path between node I and J, The average of these is in equation 5 (from Pawar).
iv). Clustering coefficient: The overrepresentation of triangles in the network. (equation 6, shows the CC for the ith node, again from Pawer). This measures the overall cliquishness of the graph.
Specific Methods for Analysis of Chemical Networks.
A substrate graph can be defined in which the chemical species are vertices and edges connect two species x and y if they take part in the same reaction. Such graphs have been studied by Wagner and Fell (2001) and Gil Benko (2002). What types of graph property are associated with increased light absorption in our system over time?
Results From Analysis of Chemical Networks Generated Here
1. Analysis of the Generative Chemistry with Energy Proportional Reaction Generation and Cyclic Reaction Forcing.
The above measures are now applied to analyset the chemical network resulting from the generative process devised here. I consider first the system in which each new reaction must produce at least one product that already exists, and where the probability of a reaction is in proportion to the total amount of internal energy present in the mass that occupies the configuration which is that chemical species.
The chemistry is started with only one light absorbing and one heat producing reaction, and new reactions are generated from this one. We show the begining of the process in a 2D embedding of the substrate graph below.

1. (Above). Molecule reacts with itself in a ligation reaction that absorbs light. Then aa reacts with a in a heat producing ligation that produces aaa. The system is initially not recycling and so in the absence of further reactions arising, would run to an equilibrium with no further light absorption. However a new reaction is introduced.

2. (Above). 4 more new reactions arise, making longer molecules, and allowing re-cycling.
3. After a while it is not possible to obtain any insight from the 2D embedded network graphs based on out-degree centrality. At this stage network statistics must be used to gain insight into the network structure.
[DO NETWORK STATISTICS ON THIS NETWORK.]
Functional Analysis
1. Abundance-frequency distributions (see Chemicals.data) (From Pawer et al) i. Shannon entropy measure (Equation 1, see shannonEntropy.data). The Shannon entropy of the network over time is shown below.

It increases, from 0.2 to 5.0, with a noticable dip with a bottom at the 600th reaction event. Also, it seems to plateau out after 700 reactions, but we would need longer to believe this would be a stable condition!
ii. Skew of the distribution, e.g. positive if most species are rare (Equation 2, see skew.data). This is shown below...

The skew tends towards a small value compared to its initial condition. What does this mean?
i). Distribution of kf values and heats released in reactions, (see kfDistribution.data, heatDistribution.data). The distribution of kf values over time is shown below (left). Most of the forward rates are small. The histograms show the distribution after apx 800 reaction generation events.


The distribution of heats over time is shown above on the right. Most reactions release a small amount of heat, and a few release a lot of heat. This is the difference in free energy between the products and reactants. So technically, this is not actually heat, but includes heat and entropy terms.
ii). Distribution of free energies of species over evolutionary time. (see freeEnergyDistribution.data). This is shown below for the finel state of the system. Most species have low free energy but a few have high free energies.

It would be interesting to know how the free energy of a species related to its steady state concentration at the final state. In particular do we see any outliers?

The final resting concentration of a species is generally lower if its free energy is higher, although there are some outliers it seems (red arrow).
iii). Interaction strength distribution. An interaction strength for a species I define as the sum of propensities influencing that species at equilibrium. For each product, I calculate the chemical force pushing matter into that product at equilibrium, i.e. the propensities of all reactions producing that product. (see interactionStrength.data).

At the end of the run, most of the inflow rates are very low, with a small tail of higher flows. Again, let us compare the inflow rates against the resting concentration of species at the steady state at the end of the run.

There are a few species at the end of the run that have high inflow rates and maintain themselves at high concentrations.
iv). Mean interaction strength, (Equation 3 from Pawer), is the mean of this value. (see meanInteractionStrength.data). This is shown below for the run.

The mean in-flow per species is initially very variable, but this settles down to a low value after about 700 reactions have been added.
v). (Storage). Total Internal Free Energy of the system, i.e. the sum of species free energy X species abundence over all species in the system. (4th column of heat.data). As you can see below..;

The total internal energy contained in chemicals decreases over time. This is because the system decays into stable low energy chemicals that are not recycled into light absorbing chemicals. However, note that decay appears to cease at the time at which apx the 600 reaction is added.
vi). Energy flux. i. The rate of light being absorbed by the network. Since each light absorbing reaction absorbs the same amount of energy in my model, this value is just the sum of the steady state concentrations of all light absorbing species. (3rd column of heat.data).

Light absorption rate decreases to a low value which remains quite steady after apx 600 reactions have been added. It would be interesting to know which species were responsible for most of the light absorption, and what the ecology of these species was.
Is a similar behaviour observed when new reactions are not formed in proportion to the chemical potential of a species, and forced to be cyclic?
Next I consider the structural properties of the final network of reactions that results from the application of the generative rules. Remember that the above rules do not (I expect) just generate a random graph because the probability of a reaction is related to the amount of chemical potential in a given species, i.e. those species with highest free energy multiplied by concentration are most likely to undergo a new reaction. I will test how the structural properties of the chemical network so created alter differ depending on whether this assumption is relaxed. Also, this network is generated under the constraint that reactions must have as their product at least one species that already exists. I expect this to limit the "spreading" of the graph. This will be directly measurable by plotting the Neighbourhood distribution sequence.
Basic Statistics: The network contains multiple edges, i.e. two species joined by more than one edge, since the two species can be connected in by different reactions. Pajek can be used to remove multiple edges. The network consisted of N = 422 substrates (nodes) and E = 2419 edges.
4. Degree distribution of network,
i). The number of species that a species reacts with (catagorized seperately for light absorbing and heat producing reactions.). i.i average node degree ( <k> = 2 x M/n (M = max possible connections and n = actual number of connections)) , i.i.i) Average length of shortest path between two nodes <L>. i.ii). the skewness , i.iii). cumulative degree distribution. Does the degree distribution follow a power law as in scale -free networks?
-- Average degree : The average degree of the network is 11.465, and the standard deviation of the degree is 12.31. Compared to an ER graph with the same n and k this has Mean(Degree) = 11.611, SD(Degree) = 3.55.

The Prob/Cyclic looks linear in the Log Log plot for low degree. Also, the standard deviation of the degree is much greater than for the ER graph with the same mean degree.
Compare these results with random graphs, that have Average path length <L> and clustering coefficient <C> as below, (From Benko, 2002)

where <k> = 2 M/n. Are our networks generally sparce? How to L and C relate to the values of L and C for random graphs? How does L and C vary with chemical network size?
ii). Connectance (<k>) : Number of connections divided by max number of possible connections. (Equation 4 from Pawer). The density of the network is 0.027.
iii). Geodesic paths and average path length: Gij is the shortest path between node I and J, The average of these is in equation 5 (from Pawar). Characturistic path length = 2.968. A ER graph with the same n and k gives <L> = 2.731. See diagram below for L distribution of the Prob/Cyclic vs. the ER random graph.

<L> is roughtly equal to <L> in the ER graph. However, the prob/cyclic network has a diameter of 7, whereas the random graph has a diameter of 4.
iv). Clustering coefficient: The overrepresentation of triangles in the network. (equation 6, shows the CC for the ith node, again from Pawer). This measures the overall cliquishness of the graph. Clustering coefficients and standard deviations are shown below for the probcyclic and random ER graphs with the same n and k.

You can see that the clustering coefficient is roughly equal to the clustering coefficient of the random graph, but the distribution is very different, with many more species in the probcyclic graph being highly clustered than for the random graph.
2. Analysis of the Generative Chemistry with Random Generation of Reactions and No Cyclic Forcing.
The same measures are carried out as above, but for the control network where the probability of a new reaction does not depend on the total energy content of that species, and neither are new reactions constrained to have one of the products be already in existence.
Functional Analysis
1. Shannon entropy.

Shannon entropy was only defined until apx reaction 300 at which point it gave NaN, presumably due to division by zero after this point.
ii. Skew of the distribution.

Changes are much smoother than in the Prob/Cyclic case.
i). kf and heat distribution.

Note the many instances of kf = 100, due to the imposition of an artificial threshold to prevent having to use very short time steps.
ii). Distribution of free energies (left) and their relationship to steady state species concentration. MOST species that have high concentrations at steady state (and these are only very few), have low free energies.


iii). Interaction strength distribution.

Most inflow rates are through a very small section of the network, see how only a few species have high inflow rates and high concentrations at the end. Most species have very low concentrations.
iv). Mean interaction strength.

Inflow decreases reliably as new reactions are added. Reactions added later in the run never have a high inflow through them.
v). (Storage). Total Internal Free Energy of the system.

The total internal free energy present in molecules is much higher than in the ProbCyclic case.
vi). Energy flux.


The graph on the right is for the same run, but has a higher resolution Y-axis.
Basic Statistics: The network contains multiple edges, i.e. two species joined by more than one edge, since the two species can be connected together by multiple edges in the substrate graph if they occur together in different reactions. Pajek can be used to remove multiple edges. All analysis is conducted on the network after multiple edges have been removed. The network consisted of N = 1435 substrates (nodes) and E = 4260 edges.
4. Degree distribution of network,
-- Average degree : The average(mean) degree of the network is 5.937, and the standard deviation of the degree is 4.711 Compared to an ER graph with the same n and k this has Mean(Degree) = 5.827, SD(Degree) = 2.386.

The Non-Prob/Non-Cyclic generative system does not show scale-free properties. The degree distribution is very different from the Prob/Cyclic Case.
ii). Connectance (<k>) : Number of connections divided by max number of possible connections. (Equation 4 from Pawer). The density of the network is 0.004137.
iii). Characturistic path length = 4.339. A ER graph with the same n and k gives Mean <L> = 4.327 but with 5734 unreachable pairs, i.e. the network is disconnected.

Both chemical and random graphcs have similar properties. This suggests the chemical graph is random w.r.t path lengths.
iv). Clustering coefficient:

Again, both random and chemical graphs resemble each other quite strongly. Basically, we have a generative chemistry which is quite randomly assembled. It does not depend on the energetics of the particles.
Results continued here.
References
H.T.Odum (1975) Energy Quality and Carrying Capacity of the Earth, A response at prize awarding ceremony of Institute La Vie, Paris.
H.T.Odum (1988) 'Self-Organization, Transformity, and Information', Science, Vol. 242, pp. 1132-1139.
H.T.Odum (1994) Ecological and General Systems: An introduction to Systems Ecology, Colorado University Press, (especially page 251).
H.T.Odum (1995) 'Self-Organization and Maximum Empower', in C.A.S.Hall (ed.) Maximum Power: The Ideas and Applications of H.T.Odum, Colorado University Press, Colorado.
Lesins, G.B. (1991). Radiative entropy as a measure of complexity. In Scientists on Gaia, eds Schneider, S. & Boston, P., pp. 121±127.
American Geophysical Union.
Schneider, E.D, Kay, J.J., 1995, "Order from Disorder: The Thermodynamics of Complexity in Biology", in Michael P. Murphy, Luke A.J. O'Neill (ed), "What is Life: The Next Fifty Years. Reflections on the Future of Biology", Cambridge University Press, pp. 161-172
Ulanowicz, R. E. & Hannon, B.M. (1987). Life and the production of entropy. Proc. R. Soc. London B, 232, 181-192.


