A New Statistical Framework

Daily Speculations The Web Site of Victor Niederhoffer and Laurel Kenner

05-Sep-2006
A New Statistical Framework for the Market, by Victor Niederhoffer

We are often confronted with a series of observations that form the basis for a decision. The classic example is the series of defective and acceptable products that come out of some manufacturing process. Another classic would be the lifetimes that occur when groups of patients are confronted with two treatments. In our own field, a series of prices from markets occurs, and it is helpful to inquire where it's going, what the process underlying it is, and what treatments might have described or predicted it.

A class of statistics called sequential statistics has been developed to deal with decisions, estimates and procedures for a series of observations where the decision is made before the end of the process. Sequential Statistics (University of Kentucky, 2004) is a provocative and challenging book by Zakkula Govindarajulu and covers the subject well. The book starts with a helpful introduction in which examples are given, the problems are put in perspective and a framework is provided. The author describes two-stage procedures where you stop at a certain point, get your bearings and estimates, and then continue sampling until a specified stopping rule is reached.

The second chapter is on the classic ways of treating such problems. The Sequential Probability Ratio Test is a procedure where the ratio of the probabilities of the observations under two different hypotheses is continuously calculated. When the ratio gets beyond certain control bands, sampling stops and a decision is made in favor of one hypothesis or the other. For example, after 10 observations of a heads-and-tails process, with 10 heads, it's about 357 times more likely (1.8 to the 10th power) that the process came from a 0.90 heads process than a 0.50 process. So if one observed 10 out of 10 heads, and had started with even odds between the two, one would be about 99.7% confident that the process came from the 0.90 process.
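
To make the coin example concrete, here is a minimal Python sketch of Wald's test for a heads-and-tails sequence. It is an illustration under stated assumptions, not the book's code: the function names, the error rates alpha and beta of 0.025 each, and the use of Wald's approximate thresholds are all choices made here.

```python
import math

def sprt_bernoulli(observations, p0=0.5, p1=0.9, alpha=0.025, beta=0.025):
    """Wald's SPRT for H0: p = p0 versus H1: p = p1 on a sequence of 0/1 values.

    Returns (decision, observations_used, log_likelihood_ratio).
    alpha and beta are assumed target error rates, not values from the book.
    """
    # Wald's approximate control bands on the log likelihood ratio.
    upper = math.log((1 - beta) / alpha)   # at or above: decide for H1
    lower = math.log(beta / (1 - alpha))   # at or below: decide for H0
    llr = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        if x:  # heads
            llr += math.log(p1 / p0)
        else:  # tails
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept H1", n, llr
        if llr <= lower:
            return "accept H0", n, llr
    return "continue sampling", n, llr

def likelihood_ratio_all_heads(n, p0=0.5, p1=0.9):
    """Likelihood ratio of H1 to H0 after n heads in a row."""
    return (p1 / p0) ** n

if __name__ == "__main__":
    print(likelihood_ratio_all_heads(10))
    print(sprt_bernoulli([1] * 10))
```

With these assumed error rates the upper band is a ratio of 39 to 1, so a run of straight heads stops the test at the seventh toss, before all ten are seen. That early stop is the point of the sequential approach.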

The third chapter covers decision-making where there are not two simple alternatives but composite hypotheses concerning a range of values for the parameters. This chapter gives Bayesian procedures for non-parametric decision-making including ranks and sign tests, as well as methods for deciding between more than two hypotheses.

The fourth chapter covers methods of estimating parameters and confidence regions as you're going along in a sequence, including methods for determining regression estimates on a going-forward basis. Generalizations to cover decision-making with different loss functions and multiple significance tests are covered.
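
One simple version of estimating as you go is to update a mean and a confidence interval after each observation and stop once the interval is narrow enough. The sketch below assumes a normal approximation (z = 1.96) and a minimum sample size of 10; the class and function names are hypothetical and are not taken from the book.

```python
import math

class RunningEstimate:
    """Running mean and variance via Welford's one-pass update."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Sample variance; undefined until there are two observations.
        return self._m2 / (self.n - 1) if self.n > 1 else float("nan")

    def half_width(self, z=1.96):
        # Half-width of an approximate confidence interval for the mean.
        if self.n < 2:
            return float("inf")
        return z * math.sqrt(self.variance / self.n)

def sample_until_precise(stream, target_half_width, min_n=10, z=1.96):
    """Consume observations until the interval is as narrow as requested."""
    est = RunningEstimate()
    for x in stream:
        est.update(x)
        if est.n >= min_n and est.half_width(z) <= target_half_width:
            break
    return est
```

The stopping rule here depends on the data seen so far, which is why the nominal 95% coverage is only approximate; handling that properly is the kind of problem the chapter addresses.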

The fifth chapter covers applications to biostatistics, including problems dealing with optimal dosages, and lifetimes that arise from different treatments. Particular reference is made to a procedure called the up-and-down rule, where you keep changing the dose level up or down by a unit based on whether the outcome improved or worsened between the last two measurements. The final chapter gives code and descriptions of MATLAB programs that can be used to implement the major tests and procedures covered in the book.
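
The up-and-down rule is easy to sketch. The version below is the classic form in which the dose drops one step after a response and rises one step after no response; the logistic dose-response curve, its parameters, and all names are assumptions made for illustration only.

```python
import math
import random

def up_and_down(respond, start_dose, step, n_trials):
    """Run an up-and-down design and return a list of (dose, responded) pairs.

    respond(dose) returns True if the subject responds at that dose.
    """
    dose = start_dose
    history = []
    for _ in range(n_trials):
        responded = respond(dose)
        history.append((dose, responded))
        # Step down after a response, up after no response.
        dose = dose - step if responded else dose + step
    return history

def logistic_responder(ed50, slope, rng):
    """Hypothetical subject pool: response probability is logistic in dose."""
    def respond(dose):
        p = 1.0 / (1.0 + math.exp(-slope * (dose - ed50)))
        return rng.random() < p
    return respond

if __name__ == "__main__":
    rng = random.Random(0)
    trials = up_and_down(logistic_responder(5.0, 1.5, rng), 2.0, 1.0, 200)
    # Doses visited after a short burn-in cluster around the median dose.
    doses = [d for d, _ in trials[20:]]
    print(sum(doses) / len(doses))
```

The design concentrates its observations near the dose at which half the subjects respond, so the average of the doses visited serves as a rough estimate of that level.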

Most of the methods used in the book can best be derived by simulation. That is, you would take a random series of numbers, and look to see what kind of statistics would be forthcoming based on the underlying process that determined the sequence at a given point. Signs, differences between means, variables and ranks of competing hypotheses would be calculated, and based on repetition, one would generate probabilities of occurrence for the statistics, and confidence regions for decision-making.
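
As a small example of the simulation approach, the sketch below estimates how often a fair up-or-down process produces at least 8 up moves in 10 periods, then checks the answer against the exact binomial value. The statistic, the sample size, the trial count, and the names are all illustrative assumptions.

```python
import math
import random

def count_ups(sample):
    """Sign statistic: number of positive moves in the sample."""
    return sum(1 for x in sample if x > 0)

def simulated_tail_probability(statistic, observed, n, trials=20000, seed=0):
    """Estimate P(statistic >= observed) for n fair +1/-1 moves by repetition."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.choice((-1, 1)) for _ in range(n)]
        if statistic(sample) >= observed:
            hits += 1
    return hits / trials

def exact_binomial_tail(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p), used here as a check."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

if __name__ == "__main__":
    print(simulated_tail_probability(count_ups, 8, 10))
    print(exact_binomial_tail(8, 10))
```

The exact answer is available here only because the sign statistic is simple. The same loop works unchanged for statistics such as ranks or differences between means, where no convenient formula exists; only the function passed in as statistic needs to change.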

There are few if any references to uses of sequential statistics in the literature. Aside from their use in biostatistics and quality control, most of them appear to be in the field of linguistics, where it is interesting to consider how people understand words based on the constituent phonemes involved. I believe there is a wide range of uses for sequential statistics in market decision-making. One important example would be in systems analysis where you are trying to decide whether a system is helpful or whether a phenomenon occurred. How far back do you go before you make a decision? Certainly not to the beginning of a period, because by then your data would be out of date. But you don’t want to stop too early either, because then your data will be too subject to chance. Given you've stopped at a certain point, how far do you wish to go forward before you conclude that the process has changed?

On another front, what confidence do you have that certain levels will be reached given the sequence that occurred in the past? Procedures for estimating the maximum of 100 numbers when you're confronted with just the first n resemble those that a student of sequential statistics would derive, but the procedures here are much more robust and precise.
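
The simplest form of the maximum problem can be checked by simulation. If the 100 numbers are independent draws from the same continuous distribution (a strong assumption that market prices will not satisfy as they stand), the chance that a later value exceeds the highest of the first n is 1 - n/100. The sketch below, with hypothetical names, confirms this by repetition.

```python
import random

def prob_later_exceeds_running_max(n_seen, total=100, trials=20000, seed=0):
    """Estimate the chance that some later value beats the max of the first n_seen.

    Assumes independent, identically distributed continuous draws and
    0 < n_seen < total.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        xs = [rng.random() for _ in range(total)]
        if max(xs[n_seen:]) > max(xs[:n_seen]):
            exceed += 1
    return exceed / trials

if __name__ == "__main__":
    for n in (10, 37, 90):
        print(n, prob_later_exceeds_running_max(n), 1 - n / 100)
```

Replacing the uniform draws with a resampled or modeled price series would show how far the answer moves once the independence assumption is dropped.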

Our whole field cries out for sequential statistics, and simple methods for applying them in practice would be useful to all.

“Sequential Statistics” often draws upon theoretical work in advanced calculus, probability theory, measure theory and statistics. It's not light reading, nor is it accessible to those who aren’t already versed in the field. However, it will get you thinking about a class of problems, and it provides guidelines, tables, and types of procedures and decision rules that should be immensely useful in quantitative finance and counting.