So here’s a statistical question for you. Now, mind you, this is a question that ground my graduate-level experimental design class to a halt for at least half an hour, and that also brought my boyfriend and me into something resembling not a discussion, but an actual argument. Pretty heavy stuff, for statistics.
Here’s the situation: You are a botanist and you want to study the effect of two different light regimes on petunia growth: let’s say 12h light and 12h dark, and 18h light and 6h dark. You have 40 little petunia seeds planted in pots waiting for you, and your university has 2 environmental chambers for you to use. You put 20 pots in each chamber, set the light timers, and start the experiment. After the prescribed number of days on this program, you measure all your petunias, and begin to analyze the data.
Now here’s the question, and I’ll phrase it a couple different ways: How many experimental units do you have? In other words, how many independent data points? How many degrees of freedom will you get in an analysis of these data?*
Answer: 2 experimental units, 2 independent data points. 0 degrees of freedom.
“What???!!” you may splutter. “But there were 40 petunias!!” You thought there were going to be 40 experimental units and 38 degrees of freedom (19 within each chamber), didn’t you?
Well, what did happen to those pots of petunias? The problem stems from when they were all put into only 2 environmental chambers. Once in an environmental chamber, the light turned on and shone into the entire room. The light was applied to all the pots together, as a group. If the lightbulb, say, started flickering and going out in one room, it would be flickering over all those plants together. In other words, the treatment (the light) was applied to one unit, the room. Therefore the environmental chamber as a whole becomes the unit of experimentation, not the individual plants. If the experimenter were to ignore this, he would be committing the mortal, yet frighteningly common sin of pseudoreplication.
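If the counting feels slippery, here is a minimal Python sketch of the arithmetic. It assumes a simple two-group comparison, where the error degrees of freedom are the number of experimental units minus the number of treatments. The function name is made up for illustration.

```python
def error_degrees_of_freedom(units_per_treatment):
    """Error df for a simple comparison of treatments:
    total experimental units minus number of treatments."""
    n_units = sum(units_per_treatment)
    n_treatments = len(units_per_treatment)
    return n_units - n_treatments


# Plants (wrongly) counted as the units: 20 per light regime.
df_if_plants = error_degrees_of_freedom([20, 20])

# Chambers counted as the units: 1 per light regime.
df_if_chambers = error_degrees_of_freedom([1, 1])

print(df_if_plants, df_if_chambers)  # prints: 38 0
```

Counting plants gives 38 (that is, 19 within each chamber); counting chambers, which is what the design actually supports, gives 0.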
Pseudoreplication occurs when there is a lack of independence between supposed experimental units, because the treatment is applied collectively, not individually. What this means, practically, is that the little units you are assuming to be independent are actually irrevocably linked to each other, in a way that can mask, or mimic, the effect that you’re actually trying to see.
Now let’s suppose that there’s a problem with one of the lights, the one in the chamber on the 12/12 regime. That light tends to flicker when the MRI machine next door gets turned on. It’s new, and people don’t generally tend to hang out in environmental chambers and read the Times and have a coffee, so it hasn’t been noticed yet. But every single petunia in that chamber collectively feels all of those light flickers. In fact it happens frequently enough that it negatively affects their growth: all of them, together. So when the data are collected, the plants in the 12/12 room are just a little bit shorter than they might have been otherwise. Their growth was stunted just enough that those plants’ heights are less than those of the plants in the 18/6 room. When the experimenter analyzes those data (not realizing yet that the experiment is pseudoreplicated), he finds a significant difference between the two and concludes that an 18/6 light regime for petunias helps them grow taller. What he doesn’t realize is that he hasn’t detected a difference due to light regime, he’s detected a difference due to faulty wiring. Not at all helpful.
Incidentally, even if he did finally recognize the pseudoreplication, he wouldn’t be able to analyze the results. With only one (true) experimental unit in each light regime, he wouldn’t be able to take an average and compute the variation around that average. There’s no variation because with only one data point, there’s nothing to vary. Without that, he can’t figure out if his two treatments truly are different from each other outside of the range of normal background variation. No conclusions can be made, and the entire study is wasted.
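To see how badly this can go, here is a rough simulation sketch in Python. Everything in it is an assumption made up for illustration: the plant heights, the size of the chamber-level quirk (`chamber_sd`), and the plant-to-plant scatter (`plant_sd`). The light regime has no true effect at all in this simulation, yet each run analyzes the 40 plants as if they were independent, using an ordinary pooled two-sample t-test at the 5% level.

```python
import random
import statistics

# Two-sided 5% critical value of Student's t with 38 df
# (only valid for 20 plants per chamber, as in the petunia example).
T_CRIT_DF38 = 2.024


def pooled_t(a, b):
    """Classic equal-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled_var = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = (pooled_var * (1 / na + 1 / nb)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


def false_positive_rate(chamber_sd, plant_sd=3.0, n_plants=20,
                        n_sims=2000, seed=1):
    """Fraction of simulated experiments declared 'significant' even
    though the two light regimes are truly identical."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        chambers = []
        for _chamber in range(2):
            # One shared quirk per chamber (flickering bulb, open door...).
            shift = rng.gauss(0.0, chamber_sd)
            chambers.append([30.0 + shift + rng.gauss(0.0, plant_sd)
                             for _ in range(n_plants)])
        if abs(pooled_t(chambers[0], chambers[1])) > T_CRIT_DF38:
            hits += 1
    return hits / n_sims


print("no chamber quirks:  ", false_positive_rate(chamber_sd=0.0))
print("with chamber quirks:", false_positive_rate(chamber_sd=2.0))
```

With no chamber-level quirks, the rate of false alarms should sit near the nominal 5%. Once each chamber carries its own shared quirk, it should come out far higher, because the test mistakes a chamber difference for a treatment difference.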
It’s easy to imagine other situations in which the pseudoreplicated nature of this study could screw up the results: a careless undergraduate props the door to one of the chambers open for a minute, forgets about it when his girlfriend calls, and then goes to lunch. In the meantime that room loses all its humidity through the open door. Or one of the lightbulbs burns out and nobody notices it for 8 hours. Et cetera.
Experimental units are independent when treatments are applied to each one individually. If this study used a little lamp for each plant, the plants would truly be the independent experimental units, because each one would be receiving an independent treatment. If one of the bulbs flickered and screwed up the growth of that one plant, the results overall may not be affected much, because there are still 19 other independent data points in each treatment that will all be averaged with the screwed-up one. Not ideal, but not the end of the world. Doing it this way sounds like a lot more work, but sometimes correct experimental design calls for a little more creativity and effort in order to get it right, and get valid results.
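A tiny numerical sketch of that last point, in Python. The heights are invented for illustration: assume a healthy plant reaches 30 cm and a plant under a flickering bulb reaches only 24 cm.

```python
import statistics

HEALTHY_CM = 30.0   # hypothetical height of an unaffected plant
STUNTED_CM = 24.0   # hypothetical height under a flickering bulb

# Each plant has its own lamp: one bad bulb stunts one plant out of 20.
own_lamps = [STUNTED_CM] + [HEALTHY_CM] * 19

# One shared light per chamber: a bad bulb stunts all 20 plants at once.
shared_light = [STUNTED_CM] * 20

print(statistics.mean(own_lamps))     # 29.7
print(statistics.mean(shared_light))  # 24.0
```

With individual lamps the bad bulb nudges the treatment mean by 0.3 cm; with a shared light it drags the whole treatment mean down by the full 6 cm.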
You roll your eyes and tell me that I’m being entirely impractical and unrealistic. “OK let’s assume that this is a well-funded university that can afford a decent electrician. Everything in the rooms has been tested and checked out. They’re fine. They’re completely monitored in every way so that if something goes wrong it’ll be noticed immediately and fixed. Stop being such a curmudgeon.” Yes, probably everything will be fine. But what if there is some variation that you don’t know about yet? You can’t monitor something you don’t know of. You have to design your study well enough, and with all precautions in place, to take care even of the most unforeseen circumstances. Only then can you get results that prove what you say they prove, with as much confidence as you think.
Pseudoreplication is everywhere. For example, a major study in my thesis area is pseudoreplicated, and sometimes I wonder if I’m the only one who’s noticed. (A developmental hormone was applied to some insects. The experimenters squirted hormone onto filter paper in the bottom of a Petri dish, and let groups of insects walk around on it and absorb it. Thus, the experimental unit here was not the insect; it was the Petri dish. But you can bet that each insect was treated as independent in the statistical analysis.) I know that sometimes I tend to lazily skim over the methods sections in papers to get to the conclusions. It’s a temptation, and a strong one, too, when there’s so much to read and so much else to do. But so much can go wrong in those dry methods sections. If we biologists can’t be trusted to always remember the lessons of our statistics classes way back in grad school, then all of us have to be on guard to catch our colleagues’ mistakes, before those unnoticed mistakes become accepted and cited in future research, even though they may well be completely erroneous.
*“data” is plural. “These data.” Not “this data.” Really. Don’t be That Guy.**
**In normal situations “That Guy” might refer to the dude at the bar with his shirt tucked into his underwear who can’t figure out why all the girls are shooting him down. In nerd circles, it refers to the person who uses “data” as a singular noun. Hopefully it’s not the same person who also has his shirt tucked into his underwear, or he’ll never get a date.
Monday, August 4, 2008
3 comments:
Please let me know what position Brice took so I can be sure to take the opposite one.
You're standing on a pretty slippery slope. Say you have lab rats and you stick them all with the same needle? OK, can't do that. So you get different needles but you draw your control solution from the same bottle of saline? Wait, OK, so we use different bottles of saline, better yet let's make our own saline. Hmm, can't use tap water all out of the same faucet, so let's use bottled water. Etc.
Anyway, it's about balance I guess. Independent replication is a good double check for that kind of problem too.
Yeah, I agree, it can get a little slippery. But I think in your example one rat does fit the definition of "independent experimental unit": it's the unit to which the treatment is applied. One rat gets one injection. (Unlike the petunia example where one room gets one lightbulb.) I see your point though: if they're all receiving the same solution from the same bottle, are they truly independent? I think at some point some reasonable assumptions have to be made. Otherwise things just get a little ridiculous. For example, in my petunia example again, you could argue, "Well, even if they each have their own light source, all those light sources are hooked up to the same circuit, so they're not independent after all!" A little ridiculous, right? I would argue that once you get rid of egregious pseudoreplication, other existing problems of independence can either be dealt with by blocking, or with a decision that that particular problem is just too minor (like the circuit thing or perhaps the saline bottle thing).
But I think too few experimenters consider problems of independence seriously enough, in general. In fact, these problems often go completely unrecognized. All I want is for people to think about these things before they throw an experiment together.
PS. Brice was all the stuff in quotes, more or less. Minus perhaps the cartoonish spluttering.