Friday, August 14, 2009

A Specificity Index for Open-Ended Coding

I blogged yesterday (see below) about problems in coding open-ended responses to political knowledge questions on surveys. I used the Nancy Pelosi question as an example. The "correct" response, from a scholarly standpoint, would have respondents identify her as Speaker of the House, but I also argued that it was equally correct to identify her as a congresswoman, a member of the House, and a lot of other answers that in the past would have been coded as "incorrect" in the American National Election Studies dataset.

Go back to yesterday's post for links to ANES data, the newly released raw open-ended responses, and other important points of interest, especially problems with earlier coding.

We don't know exactly how ANES staff will code these answers, but when they release the next version of the 2008 pre- and post-election data, I'll do a comparison. Today I offer an alternative: a specificity approach to coding. It's simple. Anything resembling "Speaker of the House," given its specificity, gets coded as the highest, most correct response. Let's call it a "3" for the sake of argument. Identifying Pelosi as a member of the House, while correct, loses that specificity, so it gets a "2." Calling her a politician or something similar is correct in a vague sort of way, so it scores a "1." And getting it wrong, that's a "0."

Missing responses and refusals get their own special codes. Scholars typically recode a refusal to be the same as an "incorrect" response. That's a different problem for a different day.
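To make the scheme concrete, here is a rough sketch in Python of how the 0-to-3 coding might be automated as a first pass. Everything specific in it is an assumption for illustration: the keyword patterns, the function name `specificity_code`, the strings treated as refusals, and the special code values (-9 for missing, -8 for refused) are all hypothetical and are not ANES conventions. Keyword matching is crude, so anything it produces would still need checking by a human coder.

```python
import re

# Hypothetical special codes; the values are assumptions, not ANES codes.
MISSING = -9
REFUSED = -8

# Hypothetical keyword patterns, checked from most to least specific.
LEVELS = [
    (3, re.compile(r"\bspeaker\b")),
    (2, re.compile(r"\b(?:house|congress\w*|representative\w*)\b")),
    (1, re.compile(r"\bpolitic\w*\b")),
]

def specificity_code(response):
    """Return 3, 2, 1, or 0 for a substantive answer, or a special code."""
    if response is None or not response.strip():
        return MISSING
    text = response.strip().lower()
    if text in {"refused", "rf"}:  # assumed refusal markers
        return REFUSED
    for code, pattern in LEVELS:
        if pattern.search(text):
            return code
    return 0  # a substantive answer that matches nothing counts as wrong

for answer in ["Speaker of the House", "she's a congresswoman",
               "some politician", "Secretary of State", "", "Refused"]:
    print(repr(answer), specificity_code(answer))
```

Checking the most specific pattern first matters: "Speaker of the House" also contains "house," and it should get the 3, not the 2.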

My specificity method provides a greater range in the data. If someone doesn't like it, they can collapse the resulting codes into any scheme that strikes them as useful, especially if they're comparing answers in 2008 with some previous year.
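Collapsing could look something like the sketch below, which turns the 0-to-3 specificity codes into a traditional correct/incorrect variable. The function name `collapse`, the `correct_at` threshold parameter, and the convention that missing and refused answers carry negative codes are all assumptions made for illustration.

```python
def collapse(code, correct_at=3):
    """Collapse a 0-3 specificity code into 1 (correct) or 0 (incorrect).

    Negative values are assumed to be special codes (missing, refused)
    and are passed through unchanged.
    """
    if code < 0:
        return code
    return 1 if code >= correct_at else 0

codes = [3, 2, 1, 0, -9]

# Strict: only "Speaker of the House" counts as correct.
print([collapse(c) for c in codes])

# Lenient: any answer placing her in the House counts as correct.
print([collapse(c, correct_at=2) for c in codes])
```

Moving the threshold is the whole point: the detailed codes can reproduce a strict scoring for comparison with earlier years, or a more forgiving one, without recoding the raw responses.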
