Friday, August 21, 2009

A handful of days revisited

I was hoping to get this up prior to my mention on the Millionaire Or Bust podcast. In a previous episode, a listener submitted a comment involving pulling a few choice days out of several decades of stock market performance. Naturally, I pointed to my own article on the non-sense behind it, however, I then realized that I really have yet to put any real nails in the coffin. I also located the source of the quote that kicked off the non-sense.
From my book: “Recently, a group called Dimensional Funds studied the performance of the S&P 500 from January 1970 to December 2006, during which time the annualized return of the market was 11.1%. They also noted something amazing: Of those 36 years from 1970 to 1986, if you missed the 25 days when the stock market performed the best, your return would have dropped from 11.1% to 7.6%, a crippling difference.

Now, if only we could know the best investing days ahead of time.”

As of the time of writing, the bold faced part still highlights the 36 years between 1970 and 1986. Before I go further, I suppose I should point out that I question whether or not any reputable financial agency would issue a report of this nature due to the meaninglessness of the result. Secondly, this sort of study should take at most an afternoon including the learning of some basic spreadsheet skills.
Rather than post my own spreadsheet, I'll tell you how to create your own assuming that you have a basic understanding of math, finance and spreadsheets. Scroll to the end if you'd rather just skip to the conclusions.

Step 1) download stock market data
If you manage your own investments, this shouldn't be difficult for you. Personally, I went to Yahoo! for this. Since Millionaire or Bust is American, and Yahoo! has much more S&P500 data than TSX and it's also the index referenced in the quote.

Step 2) convert point values into ratios
Assuming you have a basic understanding of mathematics, this should be a given. Going up 5 points doubles the value of an index that was 5 the previous day, increases the value by 5% for an index of 100 and for todays value we just say that things remain the same. Anyway, in the next free column, tell the spreadsheet that you want the closing value divided by the closing value of the previous day. If you prefer percentages, subtract 1 but it'll hinder things later on. Then copy and paste that down the entire column. If you've done what I did, this is column H, if your data come from a different place or you for some reason have more or less information than I do, just remember one column has been added so far.

Step 3) sort the values
How you do this will vary depending on the spreadsheet program that you use. For me, it starts with copying the column of ratios telling excel to paste special, otherwise the equations screw up. Then I select the ratios and tell it to sort.

Step 4) marking the values
Simple enough, tell the spreadsheet how many values you want to consider in your analysis. The reference says they took 25 days out of "36 years between 1970 and 1986" I have roughly 60 years of data. 25 sounds like a good number anyway. If you've got the highest values at the top, put a one in the next column by the first however many values you want, 25 in my case. Just for fun, go to the bottom and starting in a new and separate column mark the bottom 25 with 1. These two are columns I and J on my spread sheet, we've added a total of three columns so far.

Step 5) sort the values again
Put things back in chronological order, highlight the date and make sure it sorts the entire block of data. If you've done this right, your 1s should follow and you shouldn't have blocks of them at the top and bottom.

Step 6) identifying blocks of data
The assertion that I made was that if you consider the days near peak stock market gains you may actually wind up with increased returns if you miss the peaks. So make some judgement as to how many days you want to consider. Personally, I feel like going with five days before and after. So, new column now, four or K if you're keeping track. Perhaps its best I show you what's in my spread sheet and explain it after rather than explain each part as you type it in. "=IF(SUM(I103:I113)>=1,$H108,1)" occupies cell K108 in my spread sheet. If means what you'd expect it to, then there's three things or in slightly more technical terms, the if function has three arguments. The first of these is the one that decides which of the two following arguments will occupy this cell. "SUM(I103:I113)" means that you want to add all the values in that range, which covers five values above and five values bellow. The bulk of that column should be blank and adding all those values will result in 0, which is not greater than or equal to 1. Thus most of column K should take on the value of 1 or the third argument. For values around the highest gains though they should take on the value of the change in the stock market for that day.
The only peculiarity in the next argument is the $ in front of the H. In a whole bunch of spreadsheet programs, this indicates a static reference. You can copy these around and it will still point to the H column, whereas non-static references stay relative allowing for calculations to be repeated with ease.
Finally, the third argument, 1, will become pretty obvious once you're already aware of why it's there. I'll explain it later if you don't want to think about it right now.

Step 7) Repeat for losses
After that long one, a short step. Copy the column you just finished and paste it in the next one. This is L in my spread sheet and if yours is different, this is the 5th column that's been added.

Step 8) Compile your result
At the top of column K, tell the spreadsheet to multiply all the values bellow together using the product function. Then copy and paste that equation into the next column, L, to cover the big losses. This is where 1 comes in, anything multiplied by 1 is unchanged and retains it's previous value.

Step 9) Make sense of your data
If you've done exactly what I have and used data going from 1/3/1950 to 8/21/2009, you'll have 0.892222411 and 0.33212527 at the top of columns K and L. What does it all mean though? Recall that we arrived at these values by multiplying (change1) x (change2) x (change3) x ... x (changeN). Also, recall that going from the starting date to the ending date is simply the same thing, but including ever single day. For the benefit of the non-mathematically inclined, I'll switch to multiplying with * and say a * b * c * d * e * f * g. If we wanted to remove e from this we can divide the whole thing by e.
(a * b * c * d * e * f * g)/e = a * b * c * d * (e/e) * f * g
a * b * c * d * (e/e) * f * g = a * b * c * d * 1 * f * g
a * b * c * d * 1 * f * g = a * b * c * d * f * g
At any point of convenience, enter =1/whatever cell your result is located in. I get 1.120796774 and 3.010912118 respectively. In other words, if you were trying to miss the best trading days of the past roughly 60 years, but could only pull that off by missing the five days before and after, you would have 1.12 times the amount of money you have assuming that you missed few dividends during the few days you were out of the market, or 12% more. And the other for fun result, if you miss all the worst days and the five before and after that you'd have three times as much money. If you've done this with another index, different number of peak gains, or different number of days before and after removed, please post a comment and share your result.

Conclusion
Finally, we hit what does this all mean in the bigger picture. There's a lot you can walk away from with this. It's much easier to make a case for trying to time the market than against it. That point however is completely unsupported by the way the data was analyzed as stock market values are not posted in advance and you have no way of knowing whether there is a peak gain/loss day coming up. The point that I'm trying to impart the idea that you really need to think about whether or not the methodology used to get the numbers actually support any sort of meaningful conclusion. I think it's best summed up as, "It's important to think about things you see, after all you can use statistics to prove anything, including a lack of critical thinking." Other lessons include:
-statistics are messy when you don't know what you're doing
-spreadsheets can be fun
-make a fantastic claim and someone out there will believe you if you sound credible
-someone out there will waste an evening challenging your claim
-I suck at considering my audience when explaining things and jump around with different assumptions regarding knowledge levels
If you're still with me at this point, please leave a comment about other lessons learnt here.
For another cool curiosity figure, start over and put current market value/starting market value somewhere in your spreadsheet. Proceed to step 3 and tell it to multiply the first hundred or so values. Increase and decrease accordingly to figure out how many of the best days you'd need to miss to wipe out all your gains completely. Pulling out 25 days like whoever did that first study did was cool, but if you continue ignoring dividends like many of these so called studies do, pull the best 101 days out of the nearly 60 years that I used in my "study" you don't significantly reduce your gains, they're gone completely. Pull the best 125 days out of the 15006 days and you'll have half the money you started with. Makes you wonder about the claims people make. As for the I Will Teach You To Be Rich guy, that claim he'll teach you to be rich, there's no lie there, just remember that some people teach by example. Now I'm kind of curious if "36 years from 1970 to 1986" made it into the book.

No comments: