Monday, July 28, 2008

Stats 101 for Journalists--Correlation vs. Causation

While perusing your favorite newspaper, you may have run across an all-caps, bold-print headline something like this: EATING SPINACH EVERY DAY WILL PREVENT CANCER, DOCS SAY. Generally, these articles will include speculation from researchers on how exactly this miracle food will keep you cancer-free; perhaps it’s those antioxidants, perhaps it’s the high fiber content. Whatever the reasoning, the implication is that you should run out immediately to the grocery store and commence a daily diet reminiscent of a rabbit’s.

Not that there’s anything wrong with spinach. Your mom and Popeye were right: spinach is, in fact, very good for you. And hey, maybe it is a key part of a diet that will aid in preventing cancer.

The problem is, in fact, a statistical one: a common error that perennially causes stats profs to tear out their hair in frustration, or perhaps, if they’re old and jaded, merely to roll their eyes and shrug.

The problem is the inference of causality from correlation.

Most likely, the study had a design something like this: hundreds of people were followed over a number of years, and periodic surveys were sent to them asking about their diets. They filled out the form stating how many times a week they ate certain foods, and sent it in to the study center. Or maybe they got phone calls from research assistants, asking the same questions. Regardless, it was not an experimental study—that is, nobody put these hundreds of people in cages and gave them different kinds of diets, each with different amounts of certain foods. It was observational, meaning that the researchers worked with what they could get—the pre-existing diets of their study volunteers, over which they had no control. Instead of creating and administering different conditions, the scientists just watched the subjects and saw what happened. Their data allowed them to correlate a factor with an outcome, but not to prove causation.

This may seem like an academic difference, but it has far-reaching implications. In experimental conditions—say, working with mice in a laboratory—all of the conditions are carefully controlled, so that any effects can be attributed exactly to a cause. Say that ethics regulations allowed scientists to put people in cages and experiment on them to see the effects of spinach on cancer. Every person would receive the exact same cage conditions: the same lighting, medical treatment, air temperature, amount and type of exercise, and so on. And they would receive the exact same diet—except for one key difference. Half of the caged experimental humans would receive a diet with more spinach than the other group’s. Then, after many years of monitoring under these identical conditions, if there were any difference in cancer rates between the two groups, it could be attributed exactly to the one difference that existed between the groups—spinach consumption. The only way to infer causality is through experimentation—manipulating conditions in a controlled manner and seeing what effects those manipulations have on the groups.
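To make that logic concrete, here is a minimal sketch in Python of such a randomized experiment. All the numbers (baseline risk, the size of the spinach and wealth effects) are invented purely for illustration; the point is that because the spinach assignment is a coin flip, a hidden factor like wealth balances out across the two groups, and the remaining gap in cancer rates reflects the spinach effect we built in.

```python
import random

random.seed(0)
N = 100_000  # simulated participants

def simulate_randomized():
    """Randomized experiment: spinach is assigned by coin flip, so a hidden
    factor (here, wealth) ends up roughly balanced across the two groups."""
    spinach_group, control_group = [], []
    for _ in range(N):
        wealthy = random.random() < 0.5            # hidden factor, never recorded
        spinach = random.random() < 0.5            # random assignment
        # invented risks: wealth lowers risk by 0.05, spinach by 0.04
        risk = 0.20 - (0.05 if wealthy else 0.0) - (0.04 if spinach else 0.0)
        got_cancer = random.random() < risk
        (spinach_group if spinach else control_group).append(got_cancer)
    return (sum(spinach_group) / len(spinach_group),
            sum(control_group) / len(control_group))

spinach_rate, control_rate = simulate_randomized()
print(f"cancer rate, spinach group: {spinach_rate:.3f}")
print(f"cancer rate, control group: {control_rate:.3f}")
# The roughly 0.04 gap matches the causal effect we built in, because
# randomization balanced the hidden wealth factor between the groups.
```

Running this prints two rates about four percentage points apart, which is exactly the effect that was programmed in; that is what a controlled experiment buys you.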

Obviously, because of ethical and monetary restrictions, this kind of study design with humans is impossible. So why can’t you infer causality from observational studies—the type of survey study that was carried out to create the flashy newspaper headline? The problem is that nothing is controlled in the research subjects—you don’t know if they have the same conditions at home, the same income level, the same amount of exercise, the same anything. What if it is not the spinach that is causing some people to have lower rates of cancer, but something else that happens to be associated somehow, coincidentally, with spinach consumption? Fresh vegetables are expensive. They also generally require more time to prepare—washing, cutting, and so on. What if it’s not the fact that the cancer-free people are eating spinach, but that they can afford to have more fresh vegetables in their diet because they are wealthier, and maybe their extra wealth allows them to see the doctor more frequently? Or what if the extra little bit of time in their day that allows them to prepare fresh vegetables like spinach also happens to be enough extra time to go jogging as well? Any number of other, hidden things could be the actual cause, or one of many causes, of the lowered cancer rates in these people. The spinach may have nothing to do with it; it may just have been associated somehow with the actual, unrecorded cause.
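Here is the same kind of toy simulation, now set up the way the confounding story above describes. In this made-up model spinach has no effect at all on cancer risk, but wealthier people are both more likely to eat spinach and less likely to get cancer; a naive observational comparison of spinach eaters and non-eaters still shows a gap.

```python
import random

random.seed(1)
N = 100_000  # simulated survey respondents

def simulate_observational():
    """Observational data: nobody assigns the spinach. Wealth drives both
    spinach eating and cancer risk; spinach itself does nothing here."""
    eaters, non_eaters = [], []
    for _ in range(N):
        wealthy = random.random() < 0.5
        spinach = random.random() < (0.70 if wealthy else 0.30)  # confounding
        risk = 0.15 if wealthy else 0.25                         # wealth only
        got_cancer = random.random() < risk
        (eaters if spinach else non_eaters).append(got_cancer)
    return (sum(eaters) / len(eaters),
            sum(non_eaters) / len(non_eaters))

eater_rate, non_eater_rate = simulate_observational()
print(f"cancer rate, spinach eaters:     {eater_rate:.3f}")
print(f"cancer rate, non-spinach eaters: {non_eater_rate:.3f}")
# Spinach eaters look healthier even though spinach does nothing in this
# model: the correlation is real, but the causation is not.
```

The spinach eaters come out a few percentage points healthier, and a headline writer who stopped there would get the story exactly wrong.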

Experimental studies, which allow true inference of causality, are impossible in many cases when the study animals are human beings. The best correlational studies looking at human habits and disease outcomes over many years enroll huge numbers of people and try to gather as much information about their participants as possible—background health information, income, marital status, exercise habits, and so on—in order to take all these factors into account. And they often find very interesting and useful results, linking certain types of diets, lifestyles, or exercise habits to long-term rates of disease. But no matter how well these studies are designed and carried out, no newspaper can ever report on their findings using the word “cause.” Even if the researchers record as many different variables as they can think of—exercise, religious beliefs, genealogy, the length of their participants’ little toes—it’s impossible to know whether they recorded any information about the factor that is truly causing the differences seen in the study. The conditions and the participants themselves are just too variable. To talk about causation in this context is simply inaccurate, and perhaps even misleading.
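One way such studies “take factors into account” is to compare spinach eaters and non-eaters within each level of a recorded variable, such as income. The sketch below (stratification is just one illustrative method, and the numbers are again invented, reusing the toy model from the previous sketch) shows that the spurious gap vanishes once wealth is accounted for—but that rescue only works for factors the researchers thought to measure.

```python
import random
from collections import defaultdict

random.seed(2)
N = 200_000  # simulated survey respondents

# Same toy observational model as before: wealth drives both spinach
# eating and cancer risk, and spinach itself has no effect.
counts = defaultdict(lambda: [0, 0])   # (wealthy, spinach) -> [cancers, people]
for _ in range(N):
    wealthy = random.random() < 0.5
    spinach = random.random() < (0.70 if wealthy else 0.30)
    got_cancer = random.random() < (0.15 if wealthy else 0.25)
    cell = counts[(wealthy, spinach)]
    cell[0] += got_cancer
    cell[1] += 1

# Stratify: compare eaters vs. non-eaters WITHIN each wealth level.
for wealthy in (False, True):
    eater = counts[(wealthy, True)]
    non_eater = counts[(wealthy, False)]
    print(f"wealthy={wealthy}: eaters {eater[0]/eater[1]:.3f} "
          f"vs non-eaters {non_eater[0]/non_eater[1]:.3f}")
# Within each stratum the gap disappears, because wealth was measured.
# A confounder nobody measured would leave a gap no stratification can fix.
```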

What, then, is the use of these large-scale observational survey studies? They are useful for finding links to diseases, which can then be studied directly in a controlled experiment using mice—which are 80-some-percent genetically related to us. Once the same connection is found in a controlled, experimental environment, one can finally come to some conclusion about causation.

I had a stats prof who had written his master’s thesis on biologists’ understanding of statistics. He found that over 70% of the research published over several years in a peer-reviewed biological journal had statistical errors. It’s no surprise, then, that newspaper writers are prone to the same kinds of statistical mistakes. It’s up to the discerning reader to look beyond the headline, dig a little deeper, and figure out whether the research was carried out in such a way as to merit the flashy headline. Dramatic words like “causes” and “leads to” and even just “will” sell newspapers. But they may not be statistically and scientifically accurate—be smart and judge for yourself.

1 comment:

Anonymous said...

Great post with relevant example. An even more confusing issue, at least for me, is correlation vs. causation vs. conditional (if..then). What is the difference between these 3?

My thoughts: I always thought of a conditional as 100% correlation. Example: after saying "eating more spinach is correlated with lower risk of cancer," it could still be true that one eats more spinach and does not have lower risk of cancer (the data point could be an outlier). However, if one says "If you eat spinach, then you'll have a lower risk of cancer," it would be inconsistent with that statement for one to eat more spinach and not have lower risk of cancer. Finally, "Eating more spinach causes lower risk of cancer" is a conditional that establishes a causal relationship. This is where the issue gets confusing; does it always cause lower risk of cancer? If yes, then for this to be true the correlation has to be perfect (and in real life I've never seen a perfect correlation), since correlation is necessary for causation. If no, then what percent of the time does it cause lower risk of cancer (is this percentage equal to the R-squared of the correlation)?
Is my assessment correct?

Also, when an article headline reads "More spinach, lower risk of cancer," what kind of relationship are they implying? Is it:
1) correlation, where more spinach sometimes does and sometimes does not mean lower risk of cancer?
2) conditional, where more spinach always means lower risk of cancer?
3) causation (not 100%), where it sometimes causes and sometimes doesn't cause lower risk of cancer?
4) causation (100%), where more spinach always causes lower risk of cancer?
How is a reader supposed to know?