©2006 All content on site protected by copyright
Book Review: Victor Niederhoffer Reviews Cause and Correlation in Biology
Cause and Correlation in Biology by Bill Shipley is a fascinating and enlightening attempt to find the true causal relations among the many variables in a hypothesis, using methods based on directed graphs and partial correlations. The problem of unraveling multivariate causes from correlations and probabilistic dependence, some of it spurious and some real, arises in all fields, including markets. Most of Shipley's examples are taken from biology, where one asks such things as whether plant cover helps or hurts survival, whether body size is related to mate selection, or whether general intelligence is related to scores on verbal and mathematics tests.
In the case of markets, one would frequently wish to know the cause of a move in a market, trying to unravel the influence of multifarious other markets, qualitative factors, and unique current attributes.
The central method used in the book is partial correlation coefficients generated by path diagrams. Directed graph theory is used to find out which variables have separate relations with each other. A good introduction to directed graph theory would be helpful for this book and I found Chapter 6 of Discrete Mathematics with Combinatorics by James Anderson quite helpful in providing a foundation for the graphical approach contained in Shipley.
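As a rough illustration of the partial correlation idea (not code from the book), the sketch below uses the standard first-order formula for the correlation of x and y with z held fixed. The function name and the numeric correlations are hypothetical, and the example assumes standardized variables in a linear path model, where a pure chain x -> z -> y implies r(xy) = r(xz) × r(yz).

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, holding z fixed."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical correlations for a pure chain x -> z -> y.
r_xz, r_yz = 0.8, 0.5
r_xy_chain = r_xz * r_yz  # correlation implied by the chain alone

# Once z is held fixed, nothing is left linking x and y.
chain = partial_correlation(r_xy_chain, r_xz, r_yz)

# Hypothetical case with an extra link between x and y that bypasses z:
# the observed correlation exceeds what the chain implies, so the
# partial correlation stays away from zero.
direct = partial_correlation(0.6, r_xz, r_yz)

print(f"chain only: {chain:.4f}")
print(f"extra link: {direct:.4f}")
```

A partial correlation near zero is what a path diagram without a direct x-y link predicts; in real samples one would test the departure from zero statistically rather than compare exact values.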
Many of the problems of unraveling true causes from the measured attributes of variables have been treated before in the context of path analysis and structural equations. Recent work by Shipley and others at Carnegie and UCLA for the first time allows the questions to be answered much more precisely. A key quantity is the tetrad difference among four variables a, b, c, and d: R(ab) × R(cd) − R(ad) × R(bc), where R(ab) is the correlation between a and b. By computing all such tetrad differences and measuring their departure from 0, one can zero in on the true underlying causes.
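To make the tetrad expression concrete, here is a minimal sketch under stated assumptions: four measured variables driven by a single hidden common cause, with hypothetical loadings, so that each correlation is the product of two loadings. The helper names are invented for illustration and are not from the book. Under that assumption the tetrad difference vanishes, and adding an extra association between c and d (again hypothetical) pushes it away from zero.

```python
def tetrad_difference(r, a, b, c, d):
    """Return R(ab) * R(cd) - R(ad) * R(bc) for a nested correlation lookup."""
    return r[a][b] * r[c][d] - r[a][d] * r[b][c]


def one_factor_correlations(loadings):
    """Correlations implied by one latent common cause: r(ij) = l_i * l_j."""
    names = list(loadings)
    return {
        i: {j: 1.0 if i == j else loadings[i] * loadings[j] for j in names}
        for i in names
    }


# Hypothetical loadings of four measured variables on one hidden cause.
loadings = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6}
r = one_factor_correlations(loadings)
t_single_cause = tetrad_difference(r, "a", "b", "c", "d")

# Same correlations, plus an extra association between c and d that the
# single hidden cause does not explain (illustrative value only).
r_extra = {i: dict(row) for i, row in r.items()}
r_extra["c"]["d"] += 0.2
r_extra["d"]["c"] += 0.2
t_extra_link = tetrad_difference(r_extra, "a", "b", "c", "d")

print(f"single hidden cause: {t_single_cause:.4f}")
print(f"with extra c-d link: {t_extra_link:.4f}")
```

With sample correlations the differences are never exactly zero, so in practice the question is whether the departure from 0 is larger than sampling variation would produce.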
Shipley compares the relationship between cause and correlation to that between an object and its shadow. We use the observed correlational shadows to infer what the relations might be in the populations from which we draw our samples. To unravel them, he combines directed graphs, d-separation (a graphical test for which paths cannot account for the observed relations between vertices on a graph), and probability distributions. He approaches the problem empirically by looking at partial correlations and conditional distributions between variables. His hope is that the methods "will be useful as you watch the correlational shadows dance across the screen of Nature's Shadow plays."
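For readers who want to see d-separation in action, the following is a small sketch, not Shipley's own procedure. It uses one well-known way to check the criterion: keep only the variables of interest and their ancestors, link parents that share a child, drop edge directions, remove the conditioning variables, and see whether the two vertices are still connected. The graph encoding (a dict mapping each vertex to its parents) and the function names are assumptions made for this example.

```python
def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for p in parents.get(node, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(parents, x, y, given=()):
    """True if x and y are d-separated by `given` in the DAG `parents`."""
    given = set(given)
    if x in given or y in given:
        raise ValueError("x and y must not be in the conditioning set")
    keep = ancestors(parents, {x, y} | given)
    # Build the undirected "moral" graph on the ancestral set.
    nbrs = {n: set() for n in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for i, p in enumerate(ps):  # link parents sharing this child
            for q in ps[i + 1:]:
                nbrs[p].add(q)
                nbrs[q].add(p)
    # Search from x, never stepping onto a conditioning vertex.
    seen, stack = {x}, [x]
    while stack:
        node = stack.pop()
        if node == y:
            return False
        for m in nbrs[node]:
            if m not in seen and m not in given:
                seen.add(m)
                stack.append(m)
    return True


chain = {"B": ["A"], "C": ["B"]}   # A -> B -> C
collider = {"C": ["A", "B"]}       # A -> C <- B

print(d_separated(chain, "A", "C"))              # False
print(d_separated(chain, "A", "C", ["B"]))       # True
print(d_separated(collider, "A", "B"))           # True
print(d_separated(collider, "A", "B", ["C"]))    # False
```

The two toy graphs show the contrast that matters: holding the middle of a chain fixed blocks the path, while holding a common effect fixed opens one.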
While there will be many mathematical improvements and augmentations of Shipley's work, the basic methods for unraveling will not change much. Considering the ideas, hypotheses, and attempts at solution in Shipley's work will broaden your understanding of what's going on in the world at large and in your own field. It will expose you to ideas first developed in the late 19th century that can now, for the first time, be used to find the deep causes behind the complex admixture of sample variation and spurious and changing causes that we observe in the things that interest us in our work and positions.
The one shortcoming of the work is that very little is said about the problem of prediction. Many of the methods used are consistent with different models’ ultimate probabilities, but they do not lead to sharp differentiations in how much better you will be able to predict the random world. Finding the underlying cause in biology can lead to basic research to improve or change the situation. Finding the underlying cause in the social sciences often does not change how you would react to observed relations in the real world, although the power of your predictions might be improved by throwing out the merely haphazard relations that are produced by more important underlying causes.
In all, even thinking about such topics as correlation analysis, maximum likelihood, ruling out hypotheses by testing various partial correlations, computing tetrad differences, and decomposing effects with path diagrams will be very useful and enlightening. This book, together with a related study of directed graph theory, is a highly recommended extension of how we think about the things that lie at the root of effects.