Nov 6
Monte Carlo

I wonder how to accurately count, measure, and market fear as a basis for building theories…. Which brings us all the way back around to volatility and correlation and a little bootstrapping/Monte Carlo.

In order to show how correlation increases with variance, Boyer, Gibson and Loretan (Pitfalls in Tests for Changes in Correlation, FRB, 1997) use randomly-generated data first, and then extend their results to real data, that data being the Yen/USD and DM/USD rates from 1991-1998. (Not sure why they chose this data set, but choosing two series that are ratios with the same denominator should give the series a correlation of about +0.5, as Pearson first conjectured.)
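
To see Pearson's point, here is a quick simulation of my own (not from the paper): generate three independent series, form two ratios that share the third series as denominator, and the correlation comes out near +0.5.

# Spurious correlation of ratios: x, y, and z are independent,
# but x/z and y/z share a denominator; with equal coefficients
# of variation the correlation is roughly +0.5.
set.seed(1)
x <- rnorm(10000, mean = 100, sd = 10)
y <- rnorm(10000, mean = 100, sd = 10)
z <- rnorm(10000, mean = 100, sd = 10)
cor(x / z, y / z)    # ~ +0.5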

BGL look at "months" (20-trading-day samples) to show that higher-volatility months in the real data also have higher correlation. Key differences between real data and random data (differences BGL do not explore) are that (1) higher-volatility months tend strongly to be down months (there's the fear for you), and (2) real data shows extremes of subsample standard deviation greater than those of randomly-generated data or re-sorted real data (volatility clustering - all the elephants spook at once).
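
The mechanical part of the BGL result is easy to reproduce on random data; below is a sketch under my own assumptions (500 simulated "months," constant true correlation), not their exact procedure. Even with the true correlation fixed at 0.5, the 20-day blocks that happen to have higher variance show higher measured correlation.

# Conditioning effect: constant true correlation of 0.5, yet
# high-variance 20-day blocks show higher sample correlation
# than low-variance blocks.
set.seed(2)
rho <- 0.5
n <- 20 * 500                       # 500 "months" of 20 days
x <- rnorm(n)
y <- rho * x + sqrt(1 - rho^2) * rnorm(n)
block <- rep(1:500, each = 20)
vx <- sapply(1:500, function(b) var(x[block == b]))
r  <- sapply(1:500, function(b) cor(x[block == b], y[block == b]))
hi <- vx > median(vx)
c(high_var = mean(r[hi]), low_var = mean(r[!hi]))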

(1) Example: S&P, 1950-2007. BGL calculate what they call "k", which is simply the ratio of the sample variance (in this case, of each 20-trading-day period) to the population variance. Breaking the S&P into 20-trading-day periods and sorting by k produces the following stats for the quartiles (mean k of the quartile / mean % change of the quartile / z of the mean % change), with a code sketch after the table:

mean k / mean % / z
2.05 / -0.61% / -4.35
0.99 / +0.70% / +0.05
0.66 / +0.88% / +0.66
0.38 / +1.78% / +3.65
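
And here is the promised sketch of the k calculation, assuming a vector pct of daily log % changes; the rnorm stand-in below is only there to make it runnable, and the z-score construction is omitted since its exact form isn't spelled out above.

# k per non-overlapping 20-day block = block variance over
# full-sample variance; quartile 1 holds the highest-k blocks.
pct  <- rnorm(2000)                 # stand-in for the real series
nblk <- floor(length(pct) / 20)
use  <- pct[1:(nblk * 20)]
blk  <- rep(1:nblk, each = 20)
k    <- sapply(1:nblk, function(b) var(use[blk == b])) / var(use)
chg  <- sapply(1:nblk, function(b) sum(use[blk == b]))   # 20-day % change
qt   <- cut(k, quantile(k, 0:4 / 4), include.lowest = TRUE, labels = 4:1)
tapply(k, qt, mean)                 # mean k by quartile
tapply(chg, qt, mean)               # mean % change by quartile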

So, high-variance months are substantially negative compared to low-variance months. Here's the same analysis done with the Yen/USD series for 1980-2007:

mean k / mean % / z
2.06 / -1.15% / -3.00
0.94 / -0.01% / +0.38
0.61 / +0.07% / +0.61
0.35 / +0.54% / +2.01

(2) Below is R code for reading in a series of log % changes, calculating the sd's of all subsamples of a set length (e.g., 20 days), and then getting the max and min sd's for that set of subsamples. This shows the high and low values of volatility for this subsample length for this series. Then the series is randomly re-sorted 999 times, and each time the subsample sd's are measured again. In the end the code displays the percentile at which the actual max and min sd's fall, relative to the max and min sd's of the randomly re-sorted series.

In all cases I've run, with various series, the max sd of the actual series is greater than all max sd's of all random runs, and the min sd of the actual series is usually at about the 99.6 percentile or higher. For example, for the 1991-1998 Yen/USD series, the actual max sd for 20-day periods was at the 100th percentile, and the min sd was at the 99.9 percentile.
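
(For concreteness on the percentile arithmetic, matching the code's last lines below: the actual value joins the 999 random values to make 1000, so each random run more extreme than the actual costs a tenth of a percentile.)

(1000 - 0) / 10    # no random run more extreme: 100th percentile
(1000 - 1) / 10    # one random run more extreme: 99.9th percentile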

________________

# determines the sd of all possible subsamples
# of a given length and also does random
# re-sorting to estimate distribution of possible
# max and min sd's

# read in the data, without column header
# for generic use:
s1 <- read.table("https://dailyspeculations.com/Data.txt", header=FALSE)

# variables needed:
# the sub-sample length (20 days)
ss1 <- 20

# number of sim runs:
# 999 so we can add the actual for 1000
numruns <- 999

# other variables:
L <- length(s1$V1)              # series length
L1 <- L - ss1 + 1               # number of ss1-day subsamples
cL1 <- numeric(L1)              # sd's of actual subsamples
cL2 <- numeric(L1)              # sd's of re-sorted subsamples
maxsd <- numeric(numruns + 1)
minsd <- numeric(numruns + 1)

# the outer loop gets the sd's of all
# subsamples of length ss1
for (i in 1:L1){
  cL1[i] <- sd(s1$V1[i:(i + ss1 - 1)])
}

# the inner loop does the random re-sorts
# and gets the sd's of all subsamples of
# length ss1
for (j in 1:numruns)
{
  # re-sort by attaching random normals and ordering by them
  randvec <- rnorm(length(s1$V1))
  s2 <- data.frame(rand = randvec, pct = s1$V1)
  s3 <- s2[order(s2$rand, s2$pct), ]

  for (k in 1:L1){
    cL2[k] <- sd(s3$pct[k:(k + ss1 - 1)])
  }

  # collects the max and min sd's from each
  # random run
  maxsd[j] <- max(cL2)
  minsd[j] <- min(cL2)
}

# add the actual max and min sd
maxsd[numruns + 1] <- max(cL1)
minsd[numruns + 1] <- min(cL1)

# calculate and display the percentile
# of the actual max and min sd relative to
# the random runs (999 random + 1 actual = 1000 values)
maxpos <- maxsd[maxsd > max(cL1)]
minpos <- minsd[minsd < min(cL1)]
(1000 - length(maxpos)) / 10    # percentile of actual max sd
(1000 - length(minpos)) / 10    # percentile of actual min sd

