On 538's +7 Perry victory prediction

One of the more annoying things to have happened in the past few years is the fealty shown by journalists to Nate Silver's models. Ooh! Look! Numbers! More than beginner stats! Since most political journalists don't know what a t-statistic is, they simply repeat whatever Silver writes without question. Because it's just so...quantitative. Numbers don't lie, right?

Silver is a pretty good modelmaker who works hard and deserves the success he has had. But a model is only as good as the assumptions that form it and the data that go into it. In some races this year, there's reason to think that his model is pretty good. In the 2010 Perry v White race, there's little reason to put much confidence in 538's model.

Data that goes into his model
538 is dependent on public polls. Who has polled the Perry v White race publicly? Silver doesn't -- and won't, based on his criteria -- include the recent Wilson Research Strategies poll in his rankings, so he's going entirely off the last couple of Rasmussen polls, a PPP poll from mid-June, and a mid-May YouGov internet poll of registered voters. See a problem here?

Of course, Silver doesn't like Rasmussen polls. He uses two separate adjustments to discount Rasmussen polls in his model: one is his pollster ratings (where his assumptions make Rasmussen look slightly less accurate than it really is, though Rasmussen still gets a pretty good score) and the other is a special 2010 "house effects" adjustment. What does the latter mean? It means, essentially, that Silver thinks Rasmussen's turnout model is wrong. Silver is substituting his own judgment of Texas turnout for Rasmussen's. Silver could be correct, but would you rather take an established, prolific pollster with a very good record or a guy who's just started doing political models in the last couple of years?
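To make the two adjustments concrete, here's a rough Python sketch of how a pollster-rating weight and a house-effects correction could combine in a simple weighted average. This is not Silver's actual formula, which is more involved. The pollster names, margins, weights, and house-effect sizes below are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    pollster: str        # hypothetical pollster label
    margin: float        # Perry lead in points (made-up value)
    weight: float        # pollster-rating weight (made-up value)
    house_effect: float  # points assumed to lean toward Perry (made-up value)


def weighted_margin(polls, adjust=True):
    """Weighted average of poll margins, optionally removing house effects."""
    total_weight = sum(p.weight for p in polls)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted = (
        (p.margin - p.house_effect if adjust else p.margin) * p.weight
        for p in polls
    )
    return sum(weighted) / total_weight


# Illustrative inputs only; none of these figures come from real polls.
polls = [
    Poll("Pollster A", margin=10.0, weight=1.0, house_effect=2.0),
    Poll("Pollster B", margin=6.0, weight=0.5, house_effect=0.0),
]

raw = weighted_margin(polls, adjust=False)  # plain weighted average
adj = weighted_margin(polls)                # after the house-effects correction
print(f"raw: {raw:.2f}, adjusted: {adj:.2f}")
```

The point of the sketch: when one pollster supplies most of the weight, whatever house-effects number the modeler picks for that pollster moves the final margin almost one for one.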

Assumptions that form the model
Silver used polling data since 1998 to make his model. That's a pretty limited sample size in terms of number of elections. While it does include some wave elections, it doesn't include a GOP wave. So he is essentially using data from a different type of election to form his assumptions. Will pollsters measure a GOP wave as accurately as a Dem wave?

Another potential out-of-sample problem for predicting Perry v White is that Perry has already served 2.5 terms and is running for another, which would bring him to 3.5. It's not an open-seat election, and it's not your standard re-election.

We still have a long way to go until election day, so lots of things could still change. Probably $20M more in attack ads will hit the Texas airwaves over the next few months. Rick Perry is a very well-known quantity to Texas voters; Bill White is not. That means there is much more variability in whether people like Bill White on election day, and there's probably some negative skew and fat tails because of blowup risk (a scandal, or a political ad that defines one of the candidates), which is much more likely to hit a first-time candidate than an entrenched incumbent. Plus White is a fairly mainstream Democrat running in Republican Texas during one of the most Republican years. Of course, White does have the advantage that anyone undecided right now is undecided despite having a relatively firm opinion of Rick Perry.

In short, Silver's prediction of Perry v White is basically just Silver averaging Rasmussen's polls and then sticking a finger on the scales because he doesn't like Rasmussen.

So, Perry +7? I'll take the over. The Texas gubernatorial race is one of the places where Silver's statistical model is on its shakiest theoretical ground.

Posted by Evan @ 09/06/10 06:41 PM

