Friday, January 17, 2014

Business writers fooled by randomness

Earlier this week, the News-Record, Triad Business Journal, and Ed Cone reported on the "good news" of Winston-Salem moving up 38 places in the Milken Institute's 2013 ranking of "best performing large cities." In particular, Winston-Salem's ranking improved from 174th out of 200 in 2012 to 136th in 2013.

These types of rankings warrant much more skepticism than the business press gives them. This is especially true of the Milken report, which bases its rankings on an index compiled primarily from estimated numbers on job growth, wage and salary growth, and "high-tech GDP" growth for metropolitan statistical areas.

Consider the estimates of job growth. These are constructed by taking the difference between estimates of the number of jobs in an end period and in a base period. Because each of the estimates is based on a sample of employers and not all employers, it is subject to sampling variability (in much the same way that public opinion poll results are reported as being correct to within a few percentage points). Sometimes the estimates are a little higher than the true value; sometimes they are a little lower. If the estimates are unbiased and the errors are uncorrelated over time, very high errors on the positive side tend to be followed by lower errors or errors on the negative side, and very low errors on the negative side tend to be followed by higher errors. This causes a phenomenon known as "regression to the mean," in which big positive or negative growth rates in one period tend to be followed by smaller growth rates in the next period.
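
To make the mechanism concrete, here is a minimal simulation sketch. It is not based on the actual employment data; the number of cities, the job level, and the size of the sampling error are all made-up assumptions. True employment is held constant, so every bit of measured "growth" is sampling noise. Under these assumptions, measured growth in consecutive periods has a theoretical correlation of -0.5, because the middle estimate enters the first difference with a plus sign and the second with a minus sign.

```python
import random

def simulate_growth_pairs(n_cities=20000, true_jobs=100_000.0, noise_sd=1_000.0, seed=0):
    """Return (growth in period 1, growth in period 2) for each simulated city.

    All parameter values are hypothetical. True employment never changes,
    so measured growth is the difference of two noisy estimates.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_cities):
        # Three unbiased estimates with independent sampling errors.
        est = [true_jobs + rng.gauss(0.0, noise_sd) for _ in range(3)]
        pairs.append((est[1] - est[0], est[2] - est[1]))
    return pairs

def correlation(pairs):
    """Pearson correlation between the two elements of each pair."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / (var_x * var_y) ** 0.5

pairs = simulate_growth_pairs()
r = correlation(pairs)

# Look at the tenth of cities with the highest measured growth in period 1.
top = sorted(pairs, key=lambda p: p[0], reverse=True)[: len(pairs) // 10]
mean_first = sum(p[0] for p in top) / len(top)
mean_second = sum(p[1] for p in top) / len(top)

print(f"correlation of consecutive growth figures: {r:.3f}")
print(f"top decile, period 1 average growth: {mean_first:.0f}")
print(f"top decile, period 2 average growth: {mean_second:.0f}")
```

In this sketch the cities that look like "winners" in the first period show negative measured growth on average in the second, even though nothing real changed in any of them.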

Regression-to-the-mean problems become even more pronounced when the underlying statistics are rankings, which have upper and lower bounds. For cities at the bottom of the rankings, there's no place to go but up (logically, Winston-Salem, at 174th out of 200, couldn't have moved down 38 places). Similarly, for cities at the top, there's no place to go but down.

The Milken Institute lists the 20 cities that gained the most positions from 2012 to 2013; of these, 19 came from the bottom half of the distribution. Of the 20 biggest losers, 14 came from the top half of the distribution.
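
A pattern like this is what noisy rankings would be expected to produce on their own. The sketch below is a hypothetical simulation, not a reanalysis of the Milken data: it assumes 200 cities whose true performance never changes, adds independent measurement noise to each year's score (with noise as large as the true differences, which is an arbitrary assumption), ranks the cities in two years, and counts how many of the 20 biggest gainers started in the bottom half.

```python
import random

def ranks(scores):
    """Return ranks for a list of scores, where rank 1 is the highest score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for position, city in enumerate(order, start=1):
        result[city] = position
    return result

def gainers_from_bottom_half(rng, n_cities=200, n_top=20, true_sd=1.0, noise_sd=1.0):
    """Count how many of the n_top biggest gainers began in the bottom half.

    All parameter values are hypothetical. Each city's true score is fixed;
    only the measurement noise differs between the two years.
    """
    true_score = [rng.gauss(0.0, true_sd) for _ in range(n_cities)]
    year1 = ranks([t + rng.gauss(0.0, noise_sd) for t in true_score])
    year2 = ranks([t + rng.gauss(0.0, noise_sd) for t in true_score])
    # A positive difference means the city moved up (toward rank 1).
    by_gain = sorted(range(n_cities), key=lambda i: year1[i] - year2[i], reverse=True)
    return sum(1 for i in by_gain[:n_top] if year1[i] > n_cities // 2)

rng = random.Random(0)
counts = [gainers_from_bottom_half(rng) for _ in range(200)]
average = sum(counts) / len(counts)
print(f"average number of top-20 gainers from the bottom half: {average:.1f}")
```

If rank changes were unrelated to starting position, about 10 of the 20 biggest gainers would come from the bottom half. In this setup the average is well above that, with no real change in any city's performance.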

These problems are magnified when there is more variability in the estimates. Sampling variability decreases as sample size increases. Unless special provisions are taken to over-sample people and businesses in small cities, the estimates for cities like Winston-Salem will jump around more than those for their larger cousins. Indeed, Winston-Salem's ranking has jumped all over the place across the years: 136 in 2013, 174 in 2012, 164 in 2011, 119 in 2010, and 92 in 2009.
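
The link between sample size and variability can be illustrated with another small sketch. The sample sizes, the true growth rate, and the spread across employers are invented for illustration and are not taken from any survey. The code repeatedly draws a sample of employers, averages their growth rates, and measures how much that average bounces around from one draw to the next.

```python
import random
import statistics

def sd_of_estimates(sample_size, n_repeats=2000, true_growth=0.02, employer_sd=0.10, seed=0):
    """Standard deviation of a sample-mean growth estimate across repeated samples.

    All parameter values are hypothetical. Each repeat draws sample_size
    employers whose growth rates vary around the same true rate.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_repeats):
        sample = [rng.gauss(true_growth, employer_sd) for _ in range(sample_size)]
        estimates.append(sum(sample) / sample_size)
    return statistics.pstdev(estimates)

small_city_sd = sd_of_estimates(sample_size=25)
large_city_sd = sd_of_estimates(sample_size=400, seed=1)
ratio = small_city_sd / large_city_sd

print(f"spread of estimates with 25 sampled employers:  {small_city_sd:.4f}")
print(f"spread of estimates with 400 sampled employers: {large_city_sd:.4f}")
print(f"ratio: {ratio:.2f}")
```

Because the standard error of a mean shrinks with the square root of the sample size, a sample 16 times larger should give estimates roughly 4 times steadier, which is about what the ratio shows.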

Not only are the data that form the Milken Index subject to variability, but some series are incomplete and preliminary. In particular, several parts of the index are based on "high-tech GDP" growth. Milken does not describe the specific series that are used but does indicate that they come from the Bureau of Economic Analysis (BEA) metropolitan GDP estimates. The BEA figures are not especially timely. The latest data are advanced estimates for 2012 that are not final and are subject to substantial revision. Also, estimates for the sets of industries that would be needed for a "high tech" figure aren't released for all cities in every year.

Finally, whatever information can be gleaned from the index is almost all dated. There are nine components of the 2013 index. Only one actually comes from 2013. Six components describe growth rates that ended or outcomes that occurred in 2012, and two components describe growth rates that ended in 2011.

Did Winston-Salem really change from being one of the best performing cities in 2009 to one of the worst in 2012? I doubt it.

Business writers should doubt it too.