Friday, February 27, 2015

Polls and the Factually Challenged

The Republican presidential race is the interesting one. I got to wondering how well a candidate was doing in the polls compared to how factually challenged that candidate happens to be. So I took data from PolitiFact's "personality" section and counted up the total number of evaluations and calculated two percentage scores for the key GOP candidates. These scores were:
  • Percentage False -- essentially, the percent of all evaluations that were judged "Mostly False," "False," or the ever-popular "Pants on Fire."
  • Pants on Fire -- the percent of all evaluations that were judged as the worst, the "Pants on Fire."
I compared these to recent polls conducted in Iowa, New Hampshire, and South Carolina (links to the polls here). After all, it's useful to know which states prefer the more factually challenged and, as a consequence, which should be voted off democracy island.
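
For anyone who wants to redo the arithmetic, here is a minimal Python sketch of the two scores. The function name and the sample tally are my own illustration: the tally is hypothetical and is not PolitiFact's actual breakdown for any candidate.

```python
# Ratings that count toward "Percentage False", as defined in the list above.
FALSE_RATINGS = {"Mostly False", "False", "Pants on Fire"}


def falsity_scores(tally):
    """Return (percent_false, percent_pants_on_fire) for one candidate.

    `tally` maps a PolitiFact rating name to the number of statements
    that received it. Both scores are percentages of ALL evaluations.
    """
    total = sum(tally.values())
    if total == 0:
        # A candidate with no evaluated statements cannot be scored.
        raise ValueError("no evaluated statements, so no score")
    false_count = sum(n for rating, n in tally.items() if rating in FALSE_RATINGS)
    pants_count = tally.get("Pants on Fire", 0)
    return (round(100 * false_count / total, 1),
            round(100 * pants_count / total, 1))


# Hypothetical tally of nine statements, made up for illustration only.
example_tally = {
    "True": 2,
    "Mostly True": 3,
    "Half True": 3,
    "Mostly False": 1,
    "False": 0,
    "Pants on Fire": 0,
}

print(falsity_scores(example_tally))  # (11.1, 0.0)
```

Note that "Pants on Fire" statements count in both scores, so the second number can never exceed the first.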

Before we get to the results, a coupla caveats. Two possible candidates, Ben Carson and Donald Trump, were excluded. Carson I excluded because he's never had any statements evaluated by PolitiFact, making it hard to score him. Trump I excluded because he's a clown, but also because his name wasn't used in any polls. But mostly because he's a clown. And as a final caveat, these are statements judged in large part because they were so out there, so it's not really a measure of a candidate's honesty but more a measure of his or her likelihood to say stupid things.

Basic Results

Rick Perry had the most statements evaluated (158, as of 8 a.m. February 27, 2015), followed by Scott Walker (126). Lindsey Graham had the fewest (9). The table below shows the candidates, the percent of false statements, and the percent of "pants on fire" statements. As you can see, in terms of total falsity Ted Cruz holds a reasonably comfortable lead over Rick Santorum, and Scott Walker and Mike Huckabee are tied in their ability to say false things. Also Huckabee leads the GOP pack in terms of having his pants on fire, followed by Rick Perry. Everyone after that is in single digits.

Name              %False   %PantsFire
Ted Cruz           64.3        9.5
Rick Santorum      53.8        9.6
Scott Walker       48.4        7.9
Mike Huckabee      48.4       12.9
Rick Perry         46.8       11.4
Marco Rubio        38.6        2.4
Rand Paul          33.3        6.1
Chris Christie     31.5        7.6
Jeb Bush           27.3        4.5
Lindsey Graham     11.1        0.0

(As an aside, 69.2 percent of all Trump statements were some form of false, and he led by far in terms of percent of statements earning a "Pants on Fire" judgment: 30.8 percent ... wow.)

Polls and Falsity

So how does this compare to the individual state poll numbers? Not well. First, for the statistically inclined, some correlations -- basically a measure of how strong a relationship is, ranging from -1.0 (perfectly negative) to 1.0 (perfectly positive). For example, in Iowa, the correlation between the poll rankings and percent of false statements is a paltry .04, which is damned close to being zero. Luckily the percent of "pants on fire" judgments comes to the rescue, with a correlation of .30. What's that mean? The more likely a candidate was to make really really factually incorrect statements, the better he did in the Iowa poll. Huckabee drives this relationship as he leads in "pants on fire" and polls in first place in Iowa.
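
For the curious, a correlation like the ones above can be computed with a few lines of Python. This is only a sketch under assumptions: it uses the Pearson coefficient, which may or may not be the exact flavor used here, and since the poll numbers aren't reproduced in this post, the demo correlates the two PolitiFact columns from the table with each other instead.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# The two columns from the table above, in the same candidate order.
pct_false = [64.3, 53.8, 48.4, 48.4, 46.8, 38.6, 33.3, 31.5, 27.3, 11.1]
pct_pants = [9.5, 9.6, 7.9, 12.9, 11.4, 2.4, 6.1, 7.6, 4.5, 0.0]

r = pearson(pct_false, pct_pants)
print(f"%False vs. %PantsFire: r = {r:.2f}")
```

To reproduce the state-level figures, swap one of the lists for each candidate's poll number in that state, keeping the candidates in the same order.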

We don't see much in New Hampshire in terms of relationships, so let's skip to South Carolina. In South Carolina, there is a -.73 relationship between percent of false statements and how well a candidate is doing in the polls, and a -.49 correlation on the "pants on fire" measure. In other words, the better you did in the polls, the lower your percentage of factually wrong statements.
So, are South Carolinians less forgiving of the factually challenged? Perhaps. More likely it's the favorite son status of Lindsey Graham skewing the data. To test this, I excluded him from the analysis. The relationships remained negative, but not as strong (-.55 on all false, -.18 on "pants on fire").
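
Dropping one candidate and rerunning the numbers is easy to sketch in Python. The helper name `correlation_without` is my own invention, the Pearson coefficient is an assumption, and because the South Carolina poll figures aren't reproduced in this post, the demo again uses the two PolitiFact columns from the table as stand-in data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread = sqrt(sum((x - mean_x) ** 2 for x in xs) *
                  sum((y - mean_y) ** 2 for y in ys))
    return cov / spread


def correlation_without(rows, excluded):
    """Recompute the correlation after dropping one named candidate.

    `rows` is a list of (name, x, y) tuples; `excluded` is the name to drop.
    """
    kept = [(x, y) for name, x, y in rows if name != excluded]
    xs, ys = zip(*kept)
    return pearson(xs, ys)


# (name, %False, %PantsFire) from the table above. Stand-in data only:
# replace the last column with poll numbers to test a state-level result.
rows = [
    ("Ted Cruz", 64.3, 9.5), ("Rick Santorum", 53.8, 9.6),
    ("Scott Walker", 48.4, 7.9), ("Mike Huckabee", 48.4, 12.9),
    ("Rick Perry", 46.8, 11.4), ("Marco Rubio", 38.6, 2.4),
    ("Rand Paul", 33.3, 6.1), ("Chris Christie", 31.5, 7.6),
    ("Jeb Bush", 27.3, 4.5), ("Lindsey Graham", 11.1, 0.0),
]

full = pearson([x for _, x, _ in rows], [y for _, _, y in rows])
no_graham = correlation_without(rows, "Lindsey Graham")
print(f"all candidates: {full:.2f}, without Graham: {no_graham:.2f}")
```

With only ten candidates, a single extreme point can move the coefficient a lot, which is why this kind of check is worth running.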

The graphic below gives you a visual display.

Percentage False (x-axis) by Support in South Carolina (y-axis)

So what can we take from all this, other than I need a better hobby? The PolitiFact data seems to be a lousy (so far) predictor of popularity among early GOP voters. I frankly expected more. I figured the candidates more likely to toss red meat out to the early voters would be more popular, and that would be reflected in the "pants on fire" or false evaluations. Of course the PolitiFact data relies on which statements the PolitiFact staff choose to examine, and it's still very early in the primary season, so this is probably an analysis better done later in the year.
