Showing posts with label economics. Show all posts

Friday, May 28, 2010

Preferential attachment in ecosystem mutualism



Tree, vine, and bromeliad

In a previous post I made an analogy to illustrate the evolutionary pattern of "attenuated parasitism" -- a model for how antagonistic agents can evolve cooperation. I created that analogy while sitting on the edge of the Puerto Rican rain forest thinking about how it is that vines and trees can co-exist -- after all, being a vine is such a good strategy that it seems like they should have killed all the trees (and themselves) by now! That same day I had another idea that I didn't get around to writing down: such cooperative mechanisms should, given time, lead the forest to become one giant interconnected web of mutualist interactions.

The argument is simple. If a vine and a tree form an alliance while the same tree species and some other species, say, a bromeliad, also form an alliance, then it is logical that the vine and the bromeliad will have an increased probability of forming an alliance of their own owing to the simple fact that they co-occur in the same location (around the tree) more frequently than other random species pairings. A small interaction network like this would likely grow by accumulating more and more interactions by the same mechanism.

There is a mathematical theory of such network growth called "preferential attachment" (also known by about half a dozen other names). The study of such networks dates back to at least Yule (1925), who showed that such processes build what came to be called "scale free networks". Such networks end up with many more hyper-connected nodes than one might intuit. For example, the tree in the previous discussion is likely to become one such hyper-connected node with many, many mutualistic interactions both literally and metaphorically hanging off of it.
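The growth rule is easy to sketch in code. The following is a minimal illustration, not a model of any real forest: each new node (think of a newly arriving species) attaches to one existing node chosen with probability proportional to that node's current number of links, which is the degree-proportional rule used in the Barabási-Albert model. The function name `grow_network` and its parameters are assumptions made up for this example.

```python
import random

def grow_network(n_nodes, seed=0):
    """Grow a network by preferential attachment; return each node's degree."""
    rng = random.Random(seed)
    degrees = [1, 1]      # start with two nodes joined by one link
    endpoints = [0, 1]    # each node appears once per link it takes part in
    for new_node in range(2, n_nodes):
        # Picking uniformly from `endpoints` picks a node in proportion to its degree.
        partner = rng.choice(endpoints)
        degrees[partner] += 1
        degrees.append(1)
        endpoints.extend([partner, new_node])
    return degrees

if __name__ == "__main__":
    degrees = grow_network(10_000)
    print("largest hubs:", sorted(degrees, reverse=True)[:5])
    print("share of nodes with a single link:", degrees.count(1) / len(degrees))
```

Comparing the few largest hubs with the share of single-link nodes gives a feel for how lopsided the resulting network is.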

It is easy to imagine how the growth of such cooperative interdependence would tend to drive an ecosystem towards a single giant interconnected web of specialists. It is conversely hard to imagine how an outside generalist species could successfully invade such a well-connected ecosystem. Therefore one would expect to see more internal differentiation from species whose interconnections into the web are already established as opposed to generalist invasive species whose ancestors came from the outside.

I haven't studied the field evidence well enough to know if this idea is supported or not. Following are a few random Google hits on the subject that I've only glanced at long enough to think that there's plenty of room for theory development in this field!

Sunday, April 25, 2010

Biological comparative advantage simulations 1

I gave myself a few days break from Traitwise coding this week which was enough time to think about some science once again. As a result of a talk with Andy I got around to coding up a simple kinetic simulation of a hypothetical trading system between two cellular "agents". It is far from a completely justifiable molecular system but it does now capture the heart of the matter. It was a little trickier to write than I had imagined -- as usual there was some divide between the hand-wavy description and the accounting of each important detail.

Background
From economics arises a lovely little non-obvious theory called "comparative advantage" that demonstrates how trading pressure arises. I have a stock illustration of this which involves an old man and young man living on an island. (Yesterday I went and looked at the Wikipedia article and saw a very similar description and it took me a while before I remembered that I was the one who had written that part of the Wiki article!).

Imagine an old man and a young man are on an island. Suppose the following:
  • there are only two products needed by both: fish and water
  • neither man is close to satiated: i.e., they want as much of both products as they can get
  • they both consume with a 1:1 ratio of fish and water
  • neither has met, or ever will meet, the other
Each man makes a choice: he could only gather water and thus catch no fish or he could only fish thus gather no water or -- realistically -- he could find some combination of the two activities that maximizes his consumption. The area of "production possibility" can be visualized as in the following graph. The young man can allocate his efforts anywhere under the blue line and the old man anywhere under the red line. Under the assumption that neither man is satiated then the optimal production for each is at the intersection of the consumption line and the edge of their production possibility area.



One day the young man is hauling water back from a spring and stops for a break at a big rock. He accidentally forgets one of his buckets and continues on his way. The next day, the old man, who knows nothing of the presence of the young man, stumbles upon the lost bucket of water while returning from a fishing hole. The old man values water more than fish because for him to gather one unit of water costs him two units of fish. Seeing the precious water just sitting there he decides to abandon two of his fish and picks up the water and heads on his way. A little while later the young man returns along the path and sees that fish sitting there on the big rock. The young man values fish more than water because if he wants a fish it costs him two units of water so he decides to leave two units of water behind and pick up that fish.

You can imagine this pattern of leaving-behind one thing and taking-the-other could repeat itself day after day. Both men are better off engaged in this trading game -- they both increase their consumption. Note that the men do not need to know about the other's existence -- as far as they are concerned each is simply abandoning a less valuable product and picking up a more valuable one. The pressure to trade does not require understanding the mechanism nor personally engaging with the trading partner. Rather, the pressure to trade is a mathematical consequence of the comparative cost of production of each agent. Given the simplicity, I imagine that such a "leave and take" trading system could easily exist between species (say micro-organisms trading metabolites) or between agents of the same species (such as members of the same microbial colony) given differences in micro-environments.

Consider the long-term consequence of our two-man island "economy". The old man finds that he can consume more of both fish and water by specializing in fish production and dropping the fish off at what he thinks of as the "magical" fish-to-water conversion rock. Likewise the young man finds he can consume more of both fish and water by specializing more in water and dropping off some water at what he thinks of as the water-to-fish conversion rock. Each would begin to push his production towards his personal comparative advantage as illustrated below.



The amazing thing about trade is that both parties end up consuming outside of the limits of their own production. Furthermore, the trade benefit is mutual despite the fact that the young man is better at both activities! This result may be counter-intuitive but perhaps is more obvious when stated as "a group is best off when everyone in the group is working in a way that maximizes the time spent on their best abilities." Note that this is not the same thing as saying: "everyone should work on nothing but what they're best at" as that statement does not account for the fact that the consumption may not be maximal in that situation. There is *some optimal mixture* that is prejudiced towards each agent working on their best ability but is unlikely to be at the extreme for everyone.

Consider the limiting case of the two-man island economy: as specialization occurs, the old man will hit a vertex first; that is, he will end up doing nothing but fishing. In contrast, the young man will end up fishing *and* gathering water albeit more water and less fishing than would have been the case if there was no trade.
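The limiting case can be checked with a little arithmetic. The sketch below is only an illustration under assumed numbers, since the figures' actual values are not given in the text: the old man can produce at most 2 fish or 1 water per day (so one water costs him two fish) and the young man at most 3 fish or 6 water (so one fish costs him two water). It compares what each consumes alone with the best joint plan, found by a brute-force search over how each man splits his effort. The names `autarky` and `best_joint_plan` are made up for this example.

```python
def autarky(max_fish, max_water):
    """Pairs of (fish, water) one man can consume alone on a straight production limit."""
    # Solve fish = water = c on the line fish/max_fish + water/max_water = 1.
    return max_fish * max_water / (max_fish + max_water)

def best_joint_plan(old, young, steps=90):
    """Grid-search the share of effort each man spends fishing."""
    best = (0.0, 0.0, 0.0)  # (total pairs consumed, old fishing share, young fishing share)
    for i in range(steps + 1):
        for j in range(steps + 1):
            a, t = i / steps, j / steps
            fish = a * old[0] + t * young[0]
            water = (1 - a) * old[1] + (1 - t) * young[1]
            pairs = min(fish, water)  # consumption is 1:1
            if pairs > best[0]:
                best = (pairs, a, t)
    return best

OLD = (2.0, 1.0)    # hypothetical (max fish, max water) per day
YOUNG = (3.0, 6.0)  # hypothetical; better at both activities

if __name__ == "__main__":
    alone = autarky(*OLD) + autarky(*YOUNG)
    pairs, old_share, young_share = best_joint_plan(OLD, YOUNG)
    print(f"no trade:   {alone:.3f} pairs in total")
    print(f"with trade: {pairs:.3f} pairs in total")
    print(f"old man fishes {old_share:.0%} of the time, young man {young_share:.0%}")
```

Under these assumed numbers the search lands on the old man doing nothing but fishing while the young man still does both, and the island's total rises from about 2.67 to about 3.33 pairs. How that gain is split between the two men depends on the exchange rate at the rock, which this sketch does not model.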

The long term consequences of this hyper-specialization can be profound.

If the old man completely abandons gathering water then he may lose his ability to do so. For example, his buckets may go un-mended, he might lose track of where the good springs are, etc. In the short-term the trading is mutually beneficial but in the long-term the old man's situation may become brittle. That is to say, if the trading were to suddenly stop for some reason (for example, the young man might be injured) then the old man might suffer a severe short-term inability to gather water. In the worst case, he might die before he could re-establish his water gathering skills.

Thus the short term benefits of trade may also lead to longer-term danger. It is this conundrum that should be at the heart of free-trade vs. protectionist debates. Both extremes are correct: free traders are correct that trade benefits everyone but protectionists can be correct to argue that free trade may create dangerous, brittle dependencies. It is reasonable social policy to find a balance that reaps some fraction of trade benefits while also avoiding over-specialization as insurance against future catastrophe.

That said, one almost never hears the above reasonable trading arguments. Instead one hears a much more simplistic and selfish argument regarding "taking away jobs" vs "cheaper products". As the island economy story demonstrates, engaging in trade does not reduce total labor (both men are still working full-time in the island economy) yet it does increase total consumption. That said, in a macro-scale re-telling of the same story the specialization by the two sides would imply that one sector of the economy (say, the fishing industry) would either have to switch its investments and labor to water production or the labor and capital would need to be allowed (and be willing) to move from one area or country to the other. In reality, neither option is in the short-term best interests of the most affected minority -- the owners of the capital and labor employed in the existing industry. Therefore one (quite reasonably) hears the loudest anti-trade voices coming from those few who are most affected. Their best rational argument against trade would be to demonstrate that the proposed trade was not in the long-term interest of their society by an over-specialization argument. Alas, the best rational argument is too rarely the best emotional argument and so instead one usually hears anti-trade rhetoric as status-quo preserving, supposedly job maintaining, often nationalistic, calls against some particular free-trade agreement.


Back to biology
The contrived island economy story is of course an over-simplification of any real human economy; however, it isn't that far at all from a plausible biological story.

Imagine two humble cells who find themselves in some micro-environment where metabolites might be exchanged in an unconscious "market" akin to the magical trading rock on the island economy. Instead of fish and water, imagine the cells exchange two kinds of molecules, let's say two kinds of amino acids.

Suppose that each of these cells can make either amino acid on their own or they can import or export either of these. With no trade, each would homeostatically regulate their production so as to match their consumption at the limit of their production as demonstrated in the following figure (a copy from above).

Using some hypothetical trading mechanism one imagines that they would be better off to export their less valuable products into the inter-cellular space when they are able to import their more valuable product. That said, they would have to defend themselves against parasites that took from the environment without returning any benefit (we'll return to this). I suggest that a molecular implementation of such a trading mechanism is fairly easy to imagine and it is this that I've been simulating.

Suppose there's a molecular importer that, as a result of importation, has a side reaction. Suppose this side reaction regulates an exporter of a different product. The ratio of the import quantity to the export quantity is the "price function". Let's imagine that the price is established by some other mechanism, for example it might be hard-coded as a result of evolutionary pressure or perhaps it might be determined by dynamic measurement by cellular hardware. Further suppose that the exporter leaks a little bit to "advertise" the existence of the exported product. If two cellular agents implemented this same mechanism yet had complementary price functions, the system should spontaneously engage in unconscious and mutually beneficial trade as described above.





Simulation
I wrote a first approximation of this machinery in Matlab using the following differential equations.

kn = arbitrary rate constants

Mo = slope of the production possibility line of the "old" cell (the one with the worse production abilities)
Bo = y-intercept of "old" cell
My = slope of the production possibility line of the "young" cell (the one with the better production abilities)
By = y-intercept of "young" cell

So1 = storage of reagent 1 by the "old" cell
So2 = storage of reagent 2 by the "old" cell
Sy1 = storage of reagent 1 by the "young" cell
Sy2 = storage of reagent 2 by the "young" cell

Po1 = production of reagent 1 by the "old" cell
Po2 = production of reagent 2 by the "old" cell
Py1 = production of reagent 1 by the "young" cell
Py2 = production of reagent 2 by the "young" cell

E1 = environmental concentration of reagent 1
E2 = environmental concentration of reagent 2

Ro = "old" production ratio = Po2 / Po1
Ry = "young" production ratio = Py2 / Py1

With rules as follows...

"import in proportion to the concentration gradient between inside and outside of cell, never importing a negative amount"...
importO2 = max( 0, k1 * ( E2 - So2 ) )
importY1 = max( 0, k1 * ( E1 - Sy1 ) )

"leak a little bit to 'advertise' the availability of less valuable product"....
leakO1 = k2 * So1
leakY2 = k2 * Sy2

"export in price proportion to what's imported but never more than you have"...
exportO1 = max( 0, min( So1 - leakO1, 2 * importO2 ) )
exportY2 = max( 0, min( Sy2 - leakY2, 2 * importY1 ) )

"Slow exporting if the external concentration grows too high"....
exportO1 = exportO1 * exp( -k5 * E1 )
exportY2 = exportY2 * exp( -k5 * E2 )

"Consume in a 1:1 ratio. Consumption is limited by the least available product"...
consumeO = min( So1, So2 )
consumeY = min( Sy1, Sy2 )
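The per-cell rules above can be collected into one small function. This is a sketch in Python rather than the original Matlab, and it reflects one reading of the rules: the export attenuation is applied once, the price is the fixed factor of 2, and the rate constants are arbitrary placeholder values. The function name `cell_flows` and its argument names are made up for this example.

```python
import math

def cell_flows(s_export, s_import, e_export, e_import,
               k1=0.1, k2=0.01, k5=1.0, price=2.0):
    """Flow rates for one cell.

    s_export / s_import: internal storage of the product the cell exports / imports
    e_export / e_import: environmental concentration of those same two products
    k1, k2, k5: arbitrary rate constants (placeholder values)
    """
    # Import only down the concentration gradient, never a negative amount.
    imported = max(0.0, k1 * (e_import - s_import))
    # Leak a small fraction of the export product to "advertise" it.
    leak = k2 * s_export
    # Export in price proportion to imports, capped by what is left after the leak.
    exported = max(0.0, min(s_export - leak, price * imported))
    # Slow exporting as the external concentration of the export product rises.
    exported *= math.exp(-k5 * e_export)
    # Consume in a 1:1 ratio, limited by the scarcer product.
    consumed = min(s_export, s_import)
    return {"import": imported, "leak": leak, "export": exported, "consume": consumed}

if __name__ == "__main__":
    # The "old" cell: plenty of reagent 1 in storage, reagent 2 available outside.
    print(cell_flows(s_export=10.0, s_import=1.0, e_export=0.0, e_import=3.0))
```

Calling it once for the "old" cell (exporting reagent 1) and once for the "young" cell (exporting reagent 2) gives every flow term that appears in the differential equations.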

"Regulate the production, moving along the production limit line so as to move the storage ratio towards a 1:1 ratio. (Bound production so that it doesn't go negative)"...
targetPo1 = max( 0, min( -Bo/Mo, Ro * Bo / ( 1 - Ro * Mo ) ) )
targetPo2 = Mo * targetPo1 + Bo
targetPy1 = max( 0, min( -By/My, Ry * By / ( 1 - Ry * My ) ) )
targetPy2 = My * targetPy1 + By


Integrate with the following 10 differential equations...

d/dt So1 = Po1 - exportO1 - leakO1 - consumeO
d/dt So2 = Po2 + importO2 - consumeO
d/dt Sy1 = Py1 + importY1 - consumeY
d/dt Sy2 = Py2 - exportY2 - leakY2 - consumeY

d/dt Po1 = k3 * ( targetPo1 - Po1 )
d/dt Po2 = k3 * ( targetPo2 - Po2 )
d/dt Py1 = k3 * ( targetPy1 - Py1 )
d/dt Py2 = k3 * ( targetPy2 - Py2 )

d/dt E1 = exportO1 - importY1 + leakO1
d/dt E2 = exportY2 - importO2 + leakY2


Results
The result is more or less as expected. If you remove exporting, importing, or leaking, the cells will move to their respective maximum production/consumption points. If you permit the trading, they will adapt to each other's exports and regulate their production. Each ends up consuming outside of their personal production hulls as predicted. The "leak" rate can be very small -- it changes only the time it takes before they find each other. Once the symmetry is broken by the leak, the trading regime can emerge rapidly.



One part of this system that I initially omitted turned out to be very important (and obvious in retrospect!) -- the cells must monitor the external concentration of the exported product. If they export based solely as a function of their imports then as the system reaches saturation the "young" one will continue to export when there is no ability for the "old" one to consume any more. As a result, the external concentration of the "young" cell's export product will rise indefinitely thus preventing that cell from increasing its consumption further due to wastefully dumping product into the environment. At saturation, the exporter must recognize that the exported product is not being used (for example, by measuring high external concentration) and respond by attenuating its export rates.

Returning to parasites. Imagine that one of these agents or some third party tries to "cheat" the system by consuming the export products found in the environment but not exporting anything of its own. In that case, the exporters would reduce their own exports because they would not be receiving sufficient inputs. In other words, the system is robust against cheaters -- it simply returns to the state as if the trading paradigm didn't exist. Thanks to the small leak, if the cheaters disappear the trading will resume. In this regard, the simple proposed system is an analog computer version of "tit-for-tat".

Next I'm going to try to build a more "molecularly" realistic simulation and then run many of these systems in parallel. In particular I want to simulate the group behavior in plausible experimental setups such as in a petri dish or chemostat. Especially on a surface, I suspect that spatial constructs will spontaneously emerge as waves of cooperation and defection propagate around the environment as has been shown in similar digital simulations by Axelrod.

Sunday, May 10, 2009

Understanding Principal Component Analysis via cool Gapminder graphs



Gapminder.org is a wonderful site full of "statistical porn". This chart in particular is a fascinating graph that demonstrates the correlation between income and child mortality rates. It is also a great example to teach about a cool statistical tool: "Principal Component Analysis".

In this graph of regions there is an obvious negative correlation between infant mortality and income illustrated by the fact that the data points scatter along a line from upper left to lower right. In other words, if you knew only the infant mortality rate or the income of a region you could make a reasonable guess at the other.

Principal Component Analysis (PCA) is a statistical tool that’s very useful in situations like this. PCA delivers a new set of axes that are well aligned to correlated data like this -- I've illustrated them here with black and red lines. For each axis, it also returns a “variance strength” which I’ve represented as the length of the black and red axes. (Actually I just hand approximated these axes by eye for the purposes of illustration).

The strongest new axis returned by PCA (the black one) aligns well with the primary axis of the data. In other words, if one were forced to summarize a region with a single number it would be best to do so with the position along this black axis. The zero point on the axis is arbitrary but is usually positioned in the center of the data (the mean). Positive valued points along this black axis would be those regions further toward the lower right and negative valued regions would be those further toward the upper left. Let’s call this new axis “wealth” to separate it in our minds from “income” which is the horizontal axis of the original data set. Increases in “wealth” represent an increase in income and drop in infant mortality simultaneously.

The second axis returned by PCA is shown as the red axis. Countries that lie far off the main diagonal trend-line (black axis) have particularly unique infant mortality rates given their wealth which we’ll assume is because of something unique about their health care systems. Points well below the black axis are regions that have very good health care given their wealth and those above it have particularly poor health care given their wealth.

Because PCA gives us convenient axes that are well aligned to the data, it makes sense to just rotate the graph to align to these new axes as illustrated here. Nothing has changed here, we've simply made the graph easier to read.



Before you even look at specific regions on these new axes, you could guess that socialist countries would score more negatively along this red axis and those whose economy is heavily biased towards mineral extraction -- where income tends to be very unevenly distributed -- would score more positively. Indeed, this is confirmed. The most obvious outliers below the black axis are Cuba and Vietnam where communist governments have directed the economy to spend disproportionately on health care and the outliers on the other side are: Saudi Arabia, South Africa, and Botswana -- all regions heavily dependent on resource extraction where the mean income statistics hide the reality that few are doing very well while the vast majority are in extreme relative poverty.

One particularly interesting outlier is Washington DC which is located as far along the red axis as is Botswana! In other words, based on this realigned graph, you might guess that the wealth in DC is as unevenly distributed as it is in Botswana. Fascinating! (The observation is probably at least partially explained by the fact that it is the only all-urban "state" and urban areas will tend to have wider income distributions than rural/suburban areas.) Also note that all of the points in the United States (orange) are well into positive territory on the red axis -- our health care system is as messed up relative to our wealth as are those of Chad, Bhutan, and Kazakhstan -- countries with completely screwed-up governmental agendas. Think of it this way: the degree to which our infant mortality rates are "good" owes everything to our wealth and is despite the variables independent of wealth! In other words, countries that provide average health-care relative to their wealth like El Salvador, Ukraine, Australia and the UK fall right on the black axis but we fall significantly above that line -- roughly the same place as countries that are, independent of their wealth, really messed up like Chad and Kazakhstan. (A caveat: the chart is on a log scale so the comparative analysis is more subtle than I'm making it out here.)

PCA returns not only the direction of the new axes but also the variance of the data along those axes. To understand this, imagine for a moment that all the regions of the world had exactly the same health care given their income; in this case all the points would align perfectly along the main trend line (the black axis) and the variance of the red axis would be zero. In this imaginary case, the data would be “one dimensional”, that is, income and infant mortality would be one and the same statement; if you knew one, you'd know the other exactly. Now imagine the opposite scenario. Imagine that there was no relationship at all between income and infant mortality; in that case we would see a scattering of points all over the place and there wouldn’t be any obvious trend lines. Neither of these imaginary scenarios is what we see in the actual data. It isn’t quite a line along the black axis but neither is it a buckshot scattering of points, so we can say the data is somewhere between 1 dimensional and 2 dimensional. If both variances are large and equal to each other, then the system is 2 dimensional while if one of the variances is large while the other is near zero, then we know the system is nearly 1 dimensional. In other words, PCA permits you to summarize complicated data by finding axes of low variance and simply eliminating them. This technique is called “dimensional reduction” and is a very powerful tool for summarizing complicated data sets such as would arise if we looked at more than two variables. For example, we might add car ownership, water accessibility, education, average adult height, etc. to the analysis at which point performing a dimensional reduction would help to get our heads around any simplifications we might wish to make.
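For two variables the whole procedure fits in a few lines. The sketch below is a minimal illustration using only the Python standard library: it centers the data, builds the 2x2 covariance matrix, and finds the two axes and their variances in closed form. The sample points are invented stand-ins for (log income, log infant mortality) and are not Gapminder data; the name `pca_2d` is likewise made up for this example.

```python
import math

def pca_2d(points):
    """Return (mean, axis1, axis2, variances, scores) for a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the covariance matrix [[a, b], [b, c]] (dividing by n).
    a = sum((x - mean_x) ** 2 for x, _ in points) / n
    c = sum((y - mean_y) ** 2 for _, y in points) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Direction of greatest variance for a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * b, a - c)
    axis1 = (math.cos(theta), math.sin(theta))    # the "black" axis
    axis2 = (-math.sin(theta), math.cos(theta))   # the "red" axis, perpendicular to it
    half_trace = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    variances = (half_trace + radius, half_trace - radius)
    # Rotating the graph: each point's position along the two new axes.
    scores = [((x - mean_x) * axis1[0] + (y - mean_y) * axis1[1],
               (x - mean_x) * axis2[0] + (y - mean_y) * axis2[1])
              for x, y in points]
    return (mean_x, mean_y), axis1, axis2, variances, scores

if __name__ == "__main__":
    # Hypothetical (log income, log infant mortality) values, for illustration only.
    sample = [(1.0, 5.0), (2.0, 4.2), (3.0, 2.9), (4.0, 2.1), (5.0, 0.8)]
    _, axis1, axis2, variances, _ = pca_2d(sample)
    print("first axis:", axis1, "variance:", variances[0])
    print("second axis:", axis2, "variance:", variances[1])
```

Dropping the second score of every point is the two-variable version of dimensional reduction: it keeps the position along the main trend and discards the low-variance direction.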

Monday, March 30, 2009

An Idea: Federal Reserve Random Moves


Reference historical DJIA (log scale)

Self Organized Criticality (SOC) is a model to describe the dynamics of certain kinds of systems built out of many interacting non-linear actors. The "sand pile" model relates the frequency of avalanches to their magnitude by 1/f (i.e. avalanches occur with a frequency inversely proportional to their size).
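For readers who have not met the sand pile model, here is a minimal sketch of the standard Bak-Tang-Wiesenfeld version: grains are dropped one at a time on a square grid, any site holding four or more grains topples by passing one grain to each neighbor, and grains pushed past the edge are lost. The size of an avalanche is the number of topplings one dropped grain sets off. The grid size, number of drops, and function names are arbitrary choices made for this example.

```python
import random

def relax(grid, start):
    """Topple until every site holds fewer than 4 grains; return the number of topplings."""
    n = len(grid)
    topplings = 0
    stack = [start]
    while stack:
        r, c = stack.pop()
        if grid[r][c] < 4:
            continue
        grid[r][c] -= 4
        topplings += 1
        if grid[r][c] >= 4:
            stack.append((r, c))
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < n and 0 <= cc < n:   # grains pushed off the edge are lost
                grid[rr][cc] += 1
                if grid[rr][cc] >= 4:
                    stack.append((rr, cc))
    return topplings

def run_sandpile(size=20, drops=5000, seed=0):
    """Drop grains at random sites; return the final grid and each drop's avalanche size."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    sizes = []
    for _ in range(drops):
        r, c = rng.randrange(size), rng.randrange(size)
        grid[r][c] += 1
        sizes.append(relax(grid, (r, c)))
    return grid, sizes

if __name__ == "__main__":
    _, sizes = run_sandpile()
    print("drops causing no avalanche:", sizes.count(0))
    print("largest avalanche (topplings):", max(sizes))
```

Tallying `sizes` into a histogram is the usual way to compare how often small and large avalanches occur.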

It seems intuitive that economic systems should also show this "sand pile" behavior and this paper claims that stock markets do indeed show "near-self organized critical" behavior. (The exact function is not relevant to my argument.) The intuition for this comes from the fact that each economic actor relies on others in a complicated web of interactions. The values of assets in the system are subjective and are strongly biased by the perception of other actors' subjectively valued assets. Moreover, the perceived future value of those assets is a strong function of the cultural perception of the unknown future. In other words, the macroeconomic system is in a strong, multi-scale, positive feedback.

In the sandpile model, a few grains of sand will end up holding an enormous load of upstream stress and therefore their perturbation will create large avalanches. Analogously, a few economic actors (insurance companies, banks, hedge funds, etc) will end up with an enormous load of upstream dependencies that will similarly cause avalanches if they are disrupted.

In the sand pile model one can imagine a large conical basin of uphill dependencies resting on a few critical grains -- those critical bits are the ones that are "too big to be allowed to fail". Playing very loosely with the analogy, the stress on a grain from its uphill neighbors is analogous to the balance sheets of an economic actor. But not exactly. In the sand pile, all potential energy is explicitly accounted for -- there's no hiding the cumulative stresses due to the weight of each particle. This is not true in the economic analog. Real balance sheets do not account for total stresses because complicated financial transactions (like mortgages and insurance contracts) contain off-balance-sheet information that is usually one-way. For example, when a bank realizes that there is risk in a mortgage they will pass on this cost to the uphill actor but when a debtor realizes that there's more risk (for example, they might know that their financial situation is not as stable as it appears on paper) they will not pass along this information. In other words, there will tend to be even more uphill stress than is accounted for by the balance sheets of each downhill actor.

Now the point.

If you wanted to reduce the number of large scale catastrophic avalanches in the sand pile model, the method for doing so is easy: add noise. The vibration of the sand pile would ensure that potential energy in excess of the noise energy would not be allowed to build up. It's the same idea of forest management -- lots of small fires prevent larger ones. Therefore, by analogy, a good strategy for the Federal Reserve might be to similarly add noise. Conveniently, this "add noise" strategy is inherently simpler to execute than is their current strategy -- they would simply roll a die every few months and change the discount rate by some number between zero and ten percent.

Crazy? Well, as it stands now, the Federal Reserve operates under the belief that it can act as a negative-feedback regulator of the macro economic system. The idea is sound, but based on my experience attempting to control even very simple systems, I'm skeptical of the reality. To begin with the obvious, the economy is anything but simple. Furthermore, the Fed does not have, never has had, and never will have, an accurate measurement of the economy. To wit: it neglected the huge volume of CDOs built up in the last 10 years, and the S&L stress of the 80s, and the tech bubble of the 90s, etc, etc. History shows that there have always been, and will always be, bubbles and newfangled leveraging instruments so anything short of draconian regulation that stopped all financial innovation (which would be worse) will not be anything but reactive. But it gets worse. There are also large and unpredictable latencies in both the measurements and the results of the Fed's actions. Even in simple linear systems, such latencies can have destabilizing effects and since the macro economic system is highly non-linear and constantly evolving the effects are essentially unknowable a priori.
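The point about latency can be seen in a toy linear example. The sketch below is in no way a model of the Fed or the economy: a hypothetical controller tries to push a disturbance x back to zero using the rule x[t+1] = x[t] - gain * x[t - delay], so it always acts on a measurement that is `delay` steps old. The gain of 0.5 and the delays are arbitrary values picked for illustration.

```python
def simulate_feedback(gain, delay, steps=200):
    """Iterate x[t+1] = x[t] - gain * x[t - delay], starting from a unit disturbance."""
    xs = [1.0] * (delay + 1)   # the disturbance has sat at 1.0 for `delay` steps already
    for _ in range(steps):
        xs.append(xs[-1] - gain * xs[-1 - delay])
    return xs

if __name__ == "__main__":
    for delay in (0, 5):
        tail = simulate_feedback(gain=0.5, delay=delay)[-50:]
        biggest = max(abs(x) for x in tail)
        print(f"delay={delay}: largest |x| over the last 50 steps = {biggest:.3g}")
```

With no delay the same gain damps the disturbance away; with a five-step delay the controller keeps correcting for a state that has already passed, and the oscillations grow instead of shrinking.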

In summary, I suspect that the macroeconomic system is not directly controllable in the way that is envisioned by the creators of the Federal Reserve due to non-linearity, poor measurability, and latency. Therefore, given that the economy probably has some SOC-like organization, I suspect that random Fed moves would probably be no worse than the current strategy and would probably be better.