Friday, September 30, 2011

IRCPPS in the Links: Andrew Gelman and Nate Silver on Predicting Presidential Elections

via Gelman's blog and Silver's 538:

Both are responding to articles about Allan J. Lichtman, a presidential historian at American University who offered a model for predicting winners in his 1992 book 13 Keys to the Presidency and who claimed in an interview that Obama is a lock to win in 2012.

Silver writes:
There are several problems with this model, and its results should be taken with a grain of salt.
The first has to do with the nature of the keys themselves, several of which are quite subjective...

This is most obvious in the case of the final two variables, which have to do with the charisma of the incumbent candidate and an opponent. I’m not of the view that there’s no such thing as “candidate quality,” or that charisma doesn’t matter. But it’s awfully easy to describe someone as charismatic when he or she is ahead in the polls — or when you have the advantage of hindsight and know who won an election.

Mr. Lichtman, for instance, scored Mr. Obama as charismatic in 2008, but not John McCain (even though Mr. McCain, with his service in the Navy, might have met Mr. Lichtman’s description of a “national hero”). However, Mr. Lichtman does not score Mr. Obama that way now...

The heart of the problem is something a little different, however: the formula is not actually all that accurate. Although it may have gotten the winners right, it does not do particularly well at accounting for their margin of victory...

Although none of Mr. Lichtman’s keys are intrinsically ridiculous (for example, “which candidate had more ‘n’s in their name”), one can conceivably think of any number of other areas that might have been included in the formula but which are not — looking at how messy the primaries were for the opposition party, for example, or the inflation rate, or the ideological positioning of the candidates. (I mention these particular ones because there is some empirical evidence that they do matter.)

If there are, say, 25 keys that could defensibly be included in the model, and you can pick any set of 13 of them, that is a total of 5,200,300 possible combinations. It’s not hard to get a perfect score when you have that large a menu to pick from! Some of those combinations are going to do better than others just by chance alone.

In addition, as I mentioned, at least a couple of variables can credibly be scored in either direction for each election. That gives Mr. Lichtman even more flexibility. It’s less that he has discovered the right set of keys than that he’s a locksmith and can keep minting new keys until he happens to open all 38 doors.
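
Silver's figure of 5,200,300 is easy to check. As a minimal sketch in Python, taking his hypothetical pool of 25 candidate keys and a model of 13 at face value (both numbers are his illustration, not a count of real variables, and the constant names are mine), the standard-library function `math.comb` counts the distinct 13-key sets:

```python
import math

# Silver's hypothetical: 25 defensible keys, of which any 13 may be chosen.
POOL_SIZE = 25
MODEL_SIZE = 13

# Number of distinct 13-key models that can be built from the pool.
n_models = math.comb(POOL_SIZE, MODEL_SIZE)
print(f"{n_models:,}")  # 5,200,300
```

The count covers only the choice of keys. It does not include the extra freedom Silver describes in scoring a subjective key one way or the other for a given election.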

Gelman adds:
The Lichtman stuff is ok in the sense of generally getting things right without having to be quantitative–but it has one thing that really bugs me, which is the attempt to predict the winner of every election. In the past 50 years, there have been 4 elections that have been essentially tied in the final vote: 1960, 1968, 1976, and 2000. (You could throw 2004 in there too.) It’s meaningless to say that a forecasting method predicts the winner correctly (or incorrectly) in these cases. And from a statistical point of view, you don’t want to adapt your model to fit these tossups–it’s just an invitation to overfitting.

To put it another way: suppose his method mispredicted 1960, 1968, and 1976. Would I think any less of this method? No. A method that predicts vote share (such as used by political scientists) could get credit from these close elections by predicting the vote share with high accuracy. Again, I see virtue in the simplicity of Lichtman’s method, but let’s be careful in how to evaluate it.
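
Gelman's distinction between calling winners and predicting vote share can be made concrete with a small sketch. The vote shares below are invented for illustration only (they are not real election results), and the function names are hypothetical; the point is just that a forecast can miss the winner in a near tie while still being very close on the share.

```python
# Hypothetical two-party vote shares (percent) for the incumbent party.
# These numbers are invented for illustration; they are not real results.
elections = [
    # (forecast share, actual share)
    (49.8, 50.1),  # near tie: forecast and result fall on opposite sides of 50
    (50.3, 49.9),  # near tie: forecast and result fall on opposite sides of 50
    (55.0, 54.2),  # clear win: forecast and result agree on the winner
]

def called_winner(forecast, actual):
    """True if forecast and actual fall on the same side of 50%."""
    return (forecast > 50.0) == (actual > 50.0)

def mean_abs_error(pairs):
    """Average absolute gap between forecast and actual share, in points."""
    return sum(abs(f - a) for f, a in pairs) / len(pairs)

winners_right = sum(called_winner(f, a) for f, a in elections)
share_error = mean_abs_error(elections)

print(f"winners called: {winners_right} of {len(elections)}")
print(f"mean absolute error: {share_error:.2f} points")
```

Scored by winners, this made-up forecaster looks poor; scored by vote share, it is off by about half a point on average, which is the kind of credit Gelman says a share-based method can earn from close elections.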