Wednesday, April 26, 2017

Income Inequality Revisited

I wrote a couple of weeks ago about income inequality, but now I'm digging deeper into 2017 national data to see just how my town, Athens-Clarke, looks. The picture ain't pretty, but it can be explained. Here are the top counties nationally in terms of income inequality.

Greatest Income Inequality
  1. Radford City, Virginia
  2. New York, New York
  3. Clarke County, Alabama
  4. Twiggs County, Georgia
  5. Terrell County, Texas
  6. Watauga County, North Carolina
  7. Oktibbeha County, Mississippi
  8. Rolette County, North Dakota
  9. Suffolk County, Massachusetts
  10. Carroll County, Mississippi
  11. Natchitoches Parish, Louisiana
  12. Orleans Parish, Louisiana
  13. Clarke County, Georgia
I extended the list to #13 for the obvious reason: to make a point and show how lousy Athens-Clarke does in the national rankings. Think about it. We're #13 out of more than 3,000 counties.

So what is income inequality? By the data here, it's the ratio of household income at the 80th percentile to household income at the 20th percentile. (The 80th percentile means you make more than 80 percent of all other residents.) So if you have really high and really low incomes in a county, you get a higher inequality statistic. New York comes in at 8.8, meaning income at the 80th percentile is nearly nine times that at the 20th. Athens-Clarke comes in at 7.4. 
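
The arithmetic behind that ratio is simple. Here's a minimal Python sketch using made-up household incomes for a tiny example county; a real calculation would use the Census income distribution, and there are fancier percentile interpolation methods than the nearest-rank one below:

```python
# Hypothetical household incomes for a tiny example county.
incomes = [9_000, 14_000, 18_000, 25_000, 31_000,
           42_000, 55_000, 68_000, 90_000, 140_000]

def percentile(values, pct):
    """Nearest-rank percentile: the value sitting pct percent of the
    way up the sorted list."""
    values = sorted(values)
    k = round(pct / 100 * (len(values) - 1))
    return values[k]

p80 = percentile(incomes, 80)  # 68,000 in this made-up county
p20 = percentile(incomes, 20)  # 18,000 in this made-up county
print(f"80/20 ratio: {p80 / p20:.1f}")  # 3.8
```

By this measure, Athens-Clarke's 7.4 means the 80th percentile household pulls in nearly seven and a half times what the 20th percentile household does.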

Fun fact -- the least inequality in the U.S. can also be found in Georgia (Chattahoochee, at 2.7). Go figure. 

Blame College Kids?

To some degree the Athens-Clarke data may be skewed by college students, some of whom fill out census or other forms as living locally and often have little if any significant income. After all, 29.9 percent of Clarke County's population is age 20-29 -- double the Georgia and U.S. rates. I looked at Census tract 4.02, which covers the area around campus with 7,761 people, to see if students are indeed skewing the data: 86 percent of its residents are listed as below the poverty line, and the median age is 19.2. Still think students aren't messing with the data?

More evidence. Look at some other college towns high in the rankings. I plucked a few out for a closer look. Keep in mind we're talking over 3,000 counties nationwide.

  • Lee County, Alabama, home of Auburn University, is ranked 69th nationally
  • Alachua County, Florida, home of the University of Florida, is ranked 74th
  • Orange County, North Carolina, home of UNC, is 131st
  • Boulder County, Colorado, home of University of Colorado, is 412th
  • Richland County, South Carolina, home of University of South Carolina, is 1076th
So in just looking at a few examples, we see several college towns rank higher in income inequality than you might expect (Bama, UF, UNC), but a few are not so high (Colorado, South Carolina). 

Poverty ratings can be misleading for college towns, and often it's better to examine the percent of children under age 18 living in poverty, which largely excludes college students. For example, the Census has 36.6 percent of Athens-Clarke living in poverty and a different data set has it at 38.1 percent. If we look at under-18s, it's 39.4 percent, somewhat higher. Even more dramatic, 60.6 percent of single-female households with kids are below the poverty level. 

If I had time I'd download the raw census data and compute an age breakdown, but duty calls elsewhere. I'm willing to bet the income inequality numbers are boosted by the presence of college students, though to what degree is unclear. On top of that, college students willing to work low-paying service jobs can suppress incomes for less educated folks in the county, so in some ways it's a double hit on income equality, or the lack thereof.
 
Tuesday, April 25, 2017

Food and Obesity

Access to healthy food and obesity are often linked. Basically, the most obese counties tend to be the poorest and where people rely more heavily on fast food. It's cheaper to eat poorly than it is to eat healthy.

So let's look at Georgia and how its 159 counties rank in terms of available food quality and obesity and see if it holds. Look at the table below. In the left column I've ranked the Top 10 worst in terms of "food environment," a measure of access to healthy food. The right column reflects each county's rank in obesity out of all 159 Georgia counties. Discussion below the table.

Ranked Worst in          Rank out of 159
Food Environment         in Obesity Measure

Taliaferro               82nd
Baker                    108th
Dougherty                26th
Marion                   52nd
Hancock                  88th
Quitman                  32nd
Crisp                    90th
Clarke                   157th
Macon                    4th
Clayton                  1st

A couple of things jump right out. Clayton County is 10th in lousy access to healthy food and 1st in obesity, and most of the Top Ten food counties sit high in the obesity ranks, the big exception being Clarke County, where I live, home of UGA. I suspect students are affecting the results here, both in lots of fast food and lots of fit people (sigh ... I'm always a year older, they're always 20).

For the statisticians out there, the correlation between lousy food and obesity is .38, which is a moderately strong but not overwhelming relationship. Poverty, of course, is a huge predictor of food environment and the best food environment counties are, no surprise, the least poor. There's a .40 correlation between obesity and income in the data.
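
The .38 comes from the full 159-county data, but just to show the computation, here's a quick Spearman rank correlation in plain Python on only the ten counties in the table above. (Note that on this small slice the correlation actually comes out negative, a good reminder that a worst-ten slice can look very different from the full statewide relationship.)

```python
# Spearman rank correlation by hand (no ties in this data).
# The ten worst food-environment counties (rank 1 = worst food access)
# and their obesity ranks out of 159, from the table above.
food_rank = list(range(1, 11))  # Taliaferro ... Clayton
obesity = [82, 108, 26, 52, 88, 32, 90, 157, 4, 1]

# Re-rank the obesity numbers within just these ten counties.
order = sorted(obesity)
obesity_rank = [order.index(v) + 1 for v in obesity]

n = len(food_rank)
d2 = sum((x - y) ** 2 for x, y in zip(food_rank, obesity_rank))
rho = 1 - 6 * d2 / (n * (n * n - 1))
print(f"Spearman rho = {rho:.2f}")  # -0.32 on this ten-county slice
```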

With more time and motivation it'd be fun to run a regression and control for lots of factors and see which ones truly pop out as significant.
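
For anyone curious what that regression might look like, here's a bare-bones ordinary least squares sketch with numpy. The county-level numbers here are invented placeholders, not the real food or obesity measures, so only the mechanics carry over:

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented data for 159 "counties": a food environment score, a poverty
# rate, and an obesity rate driven by both plus noise.
n = 159
food = rng.uniform(0, 10, n)      # higher = better food access
poverty = rng.uniform(5, 40, n)   # percent in poverty
obesity = 20 - 0.5 * food + 0.3 * poverty + rng.normal(0, 2, n)

# OLS: obesity ~ intercept + food + poverty
X = np.column_stack([np.ones(n), food, poverty])
coef, *_ = np.linalg.lstsq(X, obesity, rcond=None)
print("intercept, food, poverty coefficients:", coef.round(2))
```

With real data you'd also want standard errors and p-values (statsmodels or R would hand those over), which is how you'd see which predictors truly pop out as significant.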

Monday, April 24, 2017

You Cheaters You

I'm scouting out the data available in an early release of the ANES 2016 surveys and in particular I've looked to see what questions may have a mode effect -- that is, when the survey is done one way versus another, in this case a random national sample of folks who were interviewed face-to-face (F2F) and a random sample that completed the questionnaire online.

Interesting differences do occur. I wrote about some the other day.

Today let's look at how well people answered political knowledge questions, F2F versus online. The four questions below ask respondents to identify the holders of four offices. I think we have some cheaters here, folks who went and looked up the answers when completing the online questionnaire. My evidence? You be the judge. In the table below I give the percent correct using what's called scheme 1. (Some folks were randomly assigned to scheme 1, others to scheme 2; the results were similar and I didn't want to take the time to combine the groups.)


Percent Correct

                      F2F     Online
Speaker of House      43.3    57.7
German Chancellor     26.1    48.3
Russian President     79.9    84.7
U.S. Chief Justice     7.4    33.6

Clearly the online folks were either much smarter, which is unlikely given people were randomly assigned to either the F2F or online group, or some of them took the time to go look up the answers. Tsk tsk tsk. Cheaters cheaters cheaters. How else can we explain 7.4 percent of the F2F group getting a question right versus 33.6 percent of the online group?
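
If you want to put a number on that hunch, a two-proportion z-test does the trick. The group sizes below are placeholder guesses since I'm not copying the exact Ns from the ANES file (both the F2F and web samples run well north of 1,000), but even with conservative guesses the chief justice gap is enormous:

```python
from math import sqrt

def two_prop_z(p1, n1, p2, n2):
    """z statistic for the difference between two independent proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se

# Chief justice item: 7.4 percent correct F2F, 33.6 percent online.
# Group sizes here are placeholders, not the actual ANES Ns.
z = two_prop_z(0.074, 1000, 0.336, 2000)
print(f"z = {z:.1f}")  # anything past about 1.96 is significant at p < .05
```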

Which makes for an interesting methodology question: how do you handle this if you're studying political knowledge? There might also be an interesting study here in who cheats. We don't know for sure who did, but with some massaging of the data we might get a clue. Will look harder at this over the summer.


Thursday, April 20, 2017

Survey Says ...

There's a survey-based story in today's The Red & Black about how students pay for college. Good idea for a story and it includes lots of useful information and interviews, so I'm not nitpicking the story so much as I am the survey itself, which is what I do given I also teach Grady's graduate public opinion class. Here's a key graf:
The Red & Black completed a survey in which 100 UGA students shared their own encounters with payments throughout college and found over half of respondents have their rent and utilities paid for by their parents.
First let's go with the information provided. A sample size of 100 means the margin of error is about 10 percent. That means the 49.5 percent who do not consider themselves financially independent could actually be anywhere between 39.5 and 59.5 percent, and a 10 percent MOE means some of the results are actually statistical ties. And that assumes it's a random sample, which is the only kind to which you can truly apply a margin of error. We don't get a lot of methodological detail here. How was the survey conducted? When? How were respondents selected? Is this a convenience sample? A SLOP?
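
For the record, the roughly 10 percent figure comes from the standard worst-case margin of error formula at 95 percent confidence:

```python
from math import sqrt

def margin_of_error(n, p=0.5, z=1.96):
    """95 percent margin of error for a simple random sample proportion.
    p = 0.5 is the worst (largest) case."""
    return z * sqrt(p * (1 - p) / n)

moe = margin_of_error(100)
print(f"n = 100 -> +/- {moe:.1%}")  # +/- 9.8%
```

And again, that formula only means something for a random sample.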

In fairness to the audience you should make clear in a sentence or two how the survey was conducted, and if it's non-scientific, say so high in the story so the reader can approach it with healthy skepticism.  If that's in the story and I missed it, please let me know.

Finally, a journalism point. A survey of 100 is really more of a man-on-the-street approach, but on steroids, given we often interview at most six people in that sort of thing.


Monday, April 17, 2017

Face-To-Face vs Online Surveys

The general idea is that controversial or sensitive survey questions get very different results when asked face-to-face or on the phone versus through a more impersonal approach, such as online. Let's look to see if that's the case using fresh ANES 2016 election data and whether the survey was done online versus face-to-face.

  • Is Obama a Muslim? Percent who say Yes:
    • F2F: 31.1%
    • Online: 30.3%
  • Voted for Trump
    • F2F: 41.3%
    • Online: 39.2%

OK, let's look at the first two above. Clearly the Obama Muslim question has no mode effect; in other words, asking it face-to-face versus online makes no difference given it's 31.1 versus 30.3 percent thinking he's Muslim. On voting for Trump there's no real difference either. We're talking a couple of percentage points, nothing significant, or at least nothing substantive. OK, let's try another.

  • Should transgender people have to use their birth sex bathroom?
    • F2F: 50.2%
    • Online: 52.6%
Slight difference above, but again it's slim, too slim to have any real meaning. How about something more mundane?
  • Do you attend religious services?
    • F2F: 64.5%
    • Online: 58.2%
There's something going on in the question above, one that fits theory. Usually a socially desirable response (attending church, etc.) gets more "yes" responses in a phone or face-to-face survey, and that's the case here by several percentage points, enough that I'd argue survey mode matters. Here's another that's kinda interesting below:
  • Favor a wall on Mexican border
    • F2F: 29.5%
    • Online: 33.4%
That's a decent spread above, enough that I'd argue we have a small mode effect. Respondents were a little more willing to favor building a wall in the online group versus the face-to-face group.

I can do this all day, but I'm running out of time. As we can see above, there are mode effects and they do matter, but sometimes they don't matter at all.

Wednesday, April 12, 2017

In My Mail

I got some lovely mail the other day, a big white envelope that said:

DO NOT DESTROY
OFFICIAL DOCUMENT

It's tax time and there's nothing scarier than official documents that you should not destroy. I opened it to find a 2017 Congressional District Census and a questionnaire that is, of course, designed to measure objective opinions on the issues of our time. Here are some of the questions:

  • The first asks how I identify myself politically. The choices are conservative Republican, Independent voter who leans Republican, Democrat, Moderate Republican, Liberal Republican, or Tea Party Member. Notice there's one choice for Democrats but all kinds of flavors for Republicans.
  • Media use is great. They offer all kinds of possible responses to how I regularly receive my political news. So, for example, there's NBC/CBS/ABC as one choice. That's okay, the three broadcast networks lumped together. As someone who seriously researches this stuff I can live with that. But then there's CNN/MSNBC together, which is silly. FOX News gets its own category, as do newspapers, radio, blogs, etc. And it wouldn't be Republicans if they didn't put a category called, and I'm not making this up, "Social Network." Not Facebook or Twitter, not social networks, but in all caps and singular, as if there's only one.
  • In a question asking which five issues I think should be acted on immediately, every single option is written toward a conservative response. My favorite? "Cancel unconstitutional executive orders issued by Barack Obama." Clearly irony escapes these folks, given present circumstances.
  • Here's a question that's not leading at all -- "Do you agree that President Donald Trump and our Republican leaders in Congress should be aggressive in working to pass legislation to create jobs, cut taxes and regulations, end economic uncertainty and make America more competitive?" Really? That's a survey question?
  • Here's another winner: "The Democrats' fixation on "climate change" has led to costly regulations that are negatively impacting our nation's economy. Do you think climate change is a major threat to our nation?" Ya know, let's put that "climate change" in quotes. Because why not?
  • There is some good stuff here, like whether I support sending ground troops to Iraq and Syria, but most of them are complete bullshit.
Lemme be clear, this is not a real survey; it's really a way to raise money. They ask all these questions, and when you finally get to the bottom there's a pitch to "better deliver our message as we fight to Make America Great Again."


Tuesday, April 4, 2017

Mode and Vote Expectations

In addition to asking presidential candidate preference, we also often ask survey respondents to predict the election outcome. Indeed, there's some evidence that the second question is more accurate than the first, at least in terms of gauging an electoral result. Obviously that didn't happen this past presidential election year -- both preference and expectation called it wrong.

A lot of what we're looking at, when it comes to Donald Trump, is whether survey mode (face-to-face, phone, online, etc.) affects his results. The hypothesis is that on the phone or face-to-face, respondents are a little less willing to voice their Trump support. That's the hypothesis, but Pew just published a big thing on this and found no mode effect. Here I'm looking at predictions of who will win and survey mode, based on freshly released 2016 ANES data. Caveat -- this is an early, advanced release, and a cleaner version of the data will be released soon.

Here's what I've got so far, hacking away between classes.

Using weighted data, we find that more people anticipated a Hillary Clinton win than a Trump win: 61.3 percent predicted Clinton would win and 34.8 percent predicted Trump (the rest are scattered across "other," refusals and so on).
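
For anyone unfamiliar with weighted data: each respondent counts not as one person but as his or her survey weight, which corrects for over- and under-sampled groups. A minimal sketch with made-up respondents and weights:

```python
# Made-up mini-sample: (predicted winner, survey weight).
respondents = [("Clinton", 1.2), ("Trump", 0.8), ("Clinton", 0.9),
               ("Trump", 1.5), ("Clinton", 1.1), ("other", 0.5)]

total_w = sum(w for _, w in respondents)
clinton_w = sum(w for name, w in respondents if name == "Clinton")
print(f"Weighted Clinton share: {clinton_w / total_w:.1%}")  # 53.3%
```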

OK, how about mode?

In this case we're comparing face-to-face surveys with web-based (online) surveys. Glancing at the results, I don't see much of a mode effect on predicting the winner. Turns out, in both the face-to-face and web-based groups, the same proportion predicted Trump would win (34.8 percent). Clinton's share was slightly higher face-to-face (64.3 percent) than online (60.3 percent), but that's not all that big a gap.

Simply put, survey mode made no substantive difference in who people predicted would win the 2016 presidential election. What's fascinating, at least to me, is best I can tell this is the largest "miss" in this ANES question going all the way back to 1952. I'll write more on that another day when I can dig up my data from 1952 to 2012.