CINAC: Correlation is not a Cause

Sue Blackmore on the one thing everyone should have in their cognitive toolkit that they don’t currently have: CINAC (Correlation is not a Cause).

One reason for this lack is that CINAC can be surprisingly difficult to grasp. I learned just how difficult when teaching experimental design to nurses, physiotherapists and other assorted groups. They usually understood my favourite example: imagine you are watching at a railway station. More and more people arrive until the platform is crowded, and then — hey presto — along comes a train. Did the people cause the train to arrive (A causes B)? Did the train cause the people to arrive (B causes A)? No, they both depended on a railway timetable (C caused both A and B).

And…

The point is that once you greet any new correlation with “CINAC” your imagination is let loose. Once you listen to every new science story Cinacally (which conveniently sounds like “cynically”) you find yourself thinking: OK, if A doesn’t cause B, could B cause A? Could something else cause them both or could they both be the same thing even though they don’t appear to be? What’s going on? Can I imagine other possibilities? Could I test them? Could I find out which is true? Then you can be critical of the science stories you hear. Then you are thinking like a scientist.
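The railway example can be played out in a few lines of Python. This is only a toy sketch under my own assumptions: `c` stands in for the timetable, `a` and `b` for the crowd and the train, and the noise levels and sample size are made up for illustration. Neither `a` nor `b` appears in the other's formula, yet they correlate strongly until the common cause is subtracted out.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# C: the hidden common cause (think "the timetable"). Hypothetical values.
c = [rng.gauss(0, 1) for _ in range(10_000)]

# A and B each depend on C plus their own independent noise.
# A is never used to compute B, and B is never used to compute A.
a = [ci + rng.gauss(0, 0.5) for ci in c]
b = [ci + rng.gauss(0, 0.5) for ci in c]
r_ab = pearson(a, b)

# Remove the common cause from both; only the independent noise is left.
a_resid = [ai - ci for ai, ci in zip(a, c)]
b_resid = [bi - ci for bi, ci in zip(b, c)]
r_resid = pearson(a_resid, b_resid)

print(f"correlation of A and B:          {r_ab:.2f}")
print(f"correlation with C subtracted:   {r_resid:.2f}")
```

With these made-up noise levels the first number should land somewhere near 0.8 and the second close to zero, which is the "C caused both A and B" case in miniature.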

Penile Length Leads to Little Economic Growth

Probably not going to use this one in my Stat Lit class, but it is a shame. It’s obviously a good example of why identifying a probable mechanism is important. Less obviously, it’s a great example of cherry picking: if you click through to the paper, it is the GDP growth from 1960 to 1985 that is tracked (why that historical segment?), and other indicators show different things (raw GDP against penile size, for example, demonstrates an “inverted U” pattern, with average penile lengths predicting large GDP and extremes on either side predicting low GDP).

Zombie Pseudoscience: 8 Glasses a Day

For those wondering, this blog covers two main things — stuff important to instructional design, and stuff important to my statistical literacy class. Here’s something on the latter…

Somewhere, somebody, probably employed by a bottled water company, said we should all drink eight 8-ounce glasses of water a day. Coffee doesn’t count, the water in your Hamburger Helper doesn’t count. Beer doesn’t count. It must be water, or something close to it.

That’s not remarkable; that sort of thing happens all the time.

What’s extraordinary to me is how resilient the myth has been in the face of overwhelming scientific evidence it is nonsense. The BMJ just published an article on this, but back in 2002 an article in the American Journal of Physiology completely debunked it. Snopes looked into the history of the claim in 2005 and found some of the water-as-cure-all garbage came from the self-published 1995 junk science book Your Body’s Many Cries for Water, subtitled “You are not sick, you are thirsty!”

Since 2002 I have seen this claim repeatedly debunked or at least set aside as a not-so-helpful formulation. There is disagreement, but where you look at actual experiments (see here for a summary up to 2002) it’s pretty clear that:

  • People without special circumstances (such as working hot days, taking certain medications, etc.) can just drink when they are thirsty and they’ll be fine.
  • The amount people need is highly variable anyway.
  • Caffeinated and alcoholic beverages consumed in moderation count towards your fluid intake, not against it.
  • Did I mention you can just drink when thirsty and you’ll be fine?

Basically, the normal person needs to worry about getting enough fluids about as much as they need to worry about getting enough calories.

So why does the myth stick around? I mean, if we can’t get people to understand that sticking 2 liters of water down your throat, no matter what your activity level or food intake, is not now and never has been a requirement for health, what chance do we have with the Global Warming debate, where there are real stakes?

I don’t really have an answer for that, except to hope that we can get people in the habit of interrogating research without becoming cynical (It’s not all opinion, right? Here’s a case where science just tells us the 8 glasses people are wrong). The fact that the “8 glasses a day” rule has been so sticky in our public consciousness shows that we crave quantitative formulations. We just have to be more discriminating about which ones we embrace…

“The literacy rate among college graduates is lower today than it was 15 or 20 year [sic] ago.”

John Stossel talks to former Tobacco Junk Science Guy Richard Vedder about education:

“Do kids learn anything at Harvard? People at Harvard tell us they do. … They were bright when they entered Harvard, but do … seniors know more than freshman? The literacy rate among college graduates is lower today than it was 15 or 20 year ago. It is kind of hard for people to respond in market fashion when you don’t have full information.”

First, feel free to place the following paragraph from SourceWatch wherever you see the mainstream press quoting Vedder:

Vedder was a member of the Tobacco Institute’s clandestine Economists’ network — a group of academics that the tobacco industry recruited who worked behind the scenes to fight proposed tax increases on cigarettes and the declining acceptability of public and workplace smoking by generating favorable research for publication, presenting favorable papers at academic conferences and symposia, and being ready to challenge the “social costs” economic arguments employed by anti-smoking activist at public and legislative forums. Members of the Institute’s Economists Network also assisted by writing letters-to-the-editor and lecturing to journalists on behalf of the industry.

I suggest the comments box of the Chronicle next time he shows up there. 

I got curious about the statement on the literacy rate, though, because unlike much of what he says it strikes me as possibly true.

So I looked it up.

The short answer is that it is true, with caveats. Here’s the report he is likely referencing, and here’s the finding on two basic adult literacies (basic stuff, like reading labels or short informational pieces):

 

Changes between 1992 and 2003

  • Less than or some high school: down 9 points in prose
  • High school graduate: down 6 points in prose
  • College graduate: down 11 points in prose and 14 points in document
  • Graduate studies/degree: down 13 points in prose and 17 points in document

Here are the caveats:

  • 2003 is the most recent year. So saying we know that college students read worse assumes that the trend has continued. It may have, but if this is the data he’s relying on, we certainly don’t *know* that.
  • The report makes no distinction between those that have been out of college for a while, and those that just graduated. One assumes that the shift came out of recent college graduates, but for all we know it could be the result of an aging population.
  • There’s the whole 11 points issue: this is 11 points out of a total of 500, so you are looking at a decrease of about 2% of the scale. It’s hard to see that as a news item unless it is a steady trend, and again we don’t know, as these figures are 10 years old.

I would not at all be surprised if college turned out to be failing in this regard. The research behind Academically Adrift is a pretty good indicator that we are failing dramatically in helping students attain very basic competencies. But this is not the stat that proves that. 

On the other hand the rest of that report (if you read it carefully) is not an argument that there is too much education, but that there is not nearly enough.

Great base rate fallacy explanation

From (what else?) a debunking of one of Gladwell’s heroes:

In statistics, you can’t judge the predictive oomph of anything without knowing the population prevalence of the event or condition you’re studying. Here’s a simple way to see how easy it is to fall into what they call, in the field, “base-rate neglect”: Suppose you’re told that a man named John is extremely well-educated, smokes a pipe, and wears tweed jackets with patches on the sleeve—is he more likely to be a particle physicist or a janitor? A physicist, you immediately think. But you’d likely be wrong, because janitors are common and particle physicists rare. The chances that you’d happen upon a very well-educated, tweed wearing, pipe-smoking janitor are higher than those that you’d meet a physicist who meets the same profile.

This ends up being a crucial skill in understanding public policy, educational research, personal medical decisions, whatever. And most people get thrown for a loop by it every time. It’s going to be one of the things we cover in our statistical literacy course. 

The classic example, of course, is a medical test. Say the accuracy of a test for Cancer X is 90%. Now say that the prevalence of that form of cancer is 0.5% of the population over 40. And let’s say we test EVERYONE in the population. 

You get back a positive result from the test. Assuming no other information, what’s the chance that you have Cancer X?

Most people think that a positive result means they have a 90% chance of having it. In reality, a positive result in this case means you have a 4.3% chance. 

To understand how this works, consider a population of 2000 above-40 adults. Out of those 2000 people, 10 actually have Cancer X. Nine of those people get positive results, per the test accuracy.

Out of the 1990 other people, 10% of those that don’t have it get mistaken results. That’s 199 people. 

So 208 people get positive results back, but only 9 of them actually have it. If you get back a positive result, the chance that you actually have it is 9/208, or about 4%.

Now say you test only people that have a family history of Cancer X and demonstrate symptom Y. And let’s say the prevalence in that population is 5%. 

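The equation goes from 9/208 to 90/280: out of 2000 people, 100 now have it and 90 of them test positive, while 10% of the 1900 who don’t have it (190 people) get false positives. A positive result now means about a 32% chance that you actually have it.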
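For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes, as the example does, that "90% accurate" means the test catches 90% of real cases and wrongly flags 10% of healthy people; the function name and parameters are my own, not from any library.

```python
def chance_positive_is_real(prevalence, sensitivity, false_positive_rate,
                            population=2000):
    """Return (true positives, false positives, share of positives that are real).

    Assumes everyone in `population` is tested. All rates are fractions.
    """
    sick = population * prevalence
    healthy = population - sick
    true_positives = sick * sensitivity
    false_positives = healthy * false_positive_rate
    share_real = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, share_real


# The two scenarios from the text: everyone over 40, then a high-risk group.
for prevalence in (0.005, 0.05):
    tp, fp, share = chance_positive_is_real(prevalence, 0.90, 0.10)
    print(f"prevalence {prevalence:.1%}: {tp:.0f} true positives, "
          f"{fp:.0f} false positives, {share:.1%} of positives are real")
```

Note that the false positives have to be recounted for each scenario, because a higher prevalence also means fewer healthy people left to be flagged by mistake.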

Life or death stuff for those making medical decisions, and crucial for understanding much research. But almost no one knows it. It’s things like this that have got me to delve into teaching statistical literacy.

The Great Threat to Higher Education is Medical Costs, Not Bubbles

To be clear about my last post, there are some catastrophic economics of higher education down the pike; they just aren’t bubbles.

The biggest one? Rising health costs for seniors and the disabled. As health care takes bigger and bigger chunks out of the GDP it is going to crowd out spending on a lot of things, and education is going to be one of the places hardest hit. As Kane and Orszag’s excellent paper predicted in 2003:

Curiously, the biggest challenge casting a shadow on public higher education’s future—the Medicaid program—is not yet on the agenda for most university administrators. The evidence suggests that rapid growth in state Medicaid obligations over the past few decades has crowded out public higher education expenditures, and state Medicaid obligations are expected to continue to grow rapidly over the coming decades. As a result, state support for public higher education is likely to come under increasing pressure, even as state revenues recover. Because roughly three-quarters of all college students in the United States attend public institutions, the implications for the nation’s higher education system are potentially profound.

They go on to calculate that each dollar of Medicaid spending reduces state expenditures on education by between six and seven cents. And there is a lot more Medicaid spending on the way.

Incidentally, I’ve seen these problems discussed in literature on European systems as well; they are not problems unique to the U.S. Everybody aware of the coming wave fears it. 

The U.S. has a couple of drivers, though, that make the situation particularly grim. The biggest is probably that U.S. health costs are increasing at a much faster rate, but another is that state Medicaid dollars are matched federally whereas state university system dollars are not, which creates an insurmountable pressure to cut education first at the state level.

In any case, I laugh at bubbles. Crowding out, on the other hand, scares me to death.

The “Tuition Bubble” and Degree Oversupply

There’s a lot of neat stuff in Carnevale and Rose’s The Undereducated American (and if you can’t read the whole thing, the first ten or so pages are essentially a PowerPoint of the findings; they will take you all of two minutes to flip through, so you have no excuse).

One of my favorite pieces is on page 29, specifically this graph:

One of the arguments of those who claim we have a tuition bubble is that one in three graduates of college do not hold jobs that require a bachelor’s degree. If there really is such a demand for college graduates, the argument goes, then where are the jobs?

But requiring a degree and putting a premium on a degree are two separate things. If we believe the market is rational in this case then this figure

…indicates just the opposite. It shows the median earnings of full-time, full-year workers in the occupational tiers for those with only a high school diploma and those with a Bachelor’s degree [in those professions where a Bachelor’s degree is not required]. Within each occupational tier, those with Bachelor’s degrees earn between 37 to 45 percent more than those with only high school diplomas.

In other words, employers place a premium on college degrees, even in the lower and middle-skilled jobs that don’t technically require such credentials.

There are certainly questions about what happens to this gap when you control for a variety of factors: how much of this holds when you control for race, or the education level of parents? How does it look comparing within jobs at a more granular level?

But I think at this point the onus is on the Tuition Bubblers to ante up.

U.S. says colleges with big tuition hikes must explain

This is almost sadly funny. So there’s all these tuition hikes, particularly at state colleges. It’s out-of-control spending, right? So the DoEd is asking colleges that have the sharpest hikes to explain why they are being so profligate with money.

Except, as everyone knows who actually works at a state college in America, the reason why costs are going up has almost nothing to do with spending. The reason costs are going up is that the state legislatures are cutting the funding to colleges. By a lot. Add in the fact that financial need has gone up as well, and well, that’s pretty much your increase right there.

I guess I’m not opposed to this policy — I’m sure even in the current climate there are colleges that are clearly feathering their nests at the expense of students. And going forward, I am sure it will be useful.

Still, it feels oddly out of touch at this particular moment.

U.S. College Tuition Rises 4.6%, Beating Inflation

It pays to read these things carefully. Tuition at private non-profit colleges increased by 4.6%, but adjusted for inflation this was a 1% increase, one of the smallest in the past 40 years. And again, these are published prices: student aid is up 7%, which means this is less a story about spiraling college costs, and more a story about private colleges using increased price discrimination. Which could be a fascinating story, if we cared to dig into it.
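If the jump from 4.6% to 1% looks odd, here is a rough sketch of the adjustment in Python. The inflation rate is not given in the story as I've quoted it, so the code backs it out from the two numbers above; treat it as an implied figure and an assumption, not a reported one. The function names are mine.

```python
def real_increase(nominal_increase, inflation):
    """Price increase after stripping out inflation (all values as fractions)."""
    return (1 + nominal_increase) / (1 + inflation) - 1


def implied_inflation(nominal_increase, real_increase_value):
    """Inflation rate that would reconcile a nominal and a real increase."""
    return (1 + nominal_increase) / (1 + real_increase_value) - 1


# 4.6% nominal and 1% real are the figures quoted above.
inflation = implied_inflation(0.046, 0.01)
print(f"implied inflation: {inflation:.1%}")
print(f"real increase at that rate: {real_increase(0.046, inflation):.1%}")
```

The point of dividing rather than just subtracting is that both the price and the dollar it is measured in changed over the year; for small numbers like these, simple subtraction gets you nearly the same answer.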

Incidentally, for those unused to Tumblr, you click the title to get to the story I’m talking about. Yeah, took me a minute too…