A Cluster Fuss in Obesity Studies
In obesity research, we have a bit of a cluster fuss on our hands. It’s all about a type of randomized study where researchers randomize clusters of people rather than individuals. This method matters because it’s very useful for obesity prevention studies. For example, you might have children in different schools or different classrooms participating in different prevention programs. Instead of randomizing individuals, the trial randomizes groups of individuals.
If you think this sounds a little geeky, pay attention anyway, because it can have a profound effect on results. It can even lead to a study being retracted because of an inappropriate analysis.
LA Sprouts – Cluster Randomized and Retracted
For a case in point, consider the LA Sprouts study, published in May of 2015 and retracted in December of that same year. The study involved 319 children in four different schools. Two of the schools got the LA Sprouts program and two schools did not. The authors originally claimed their program was effective.
But because of the clustering, that claim didn’t hold up. They had to retract the study. A casual reader might think the study would have good statistical power with 319 subjects. However, that statistical power drops dramatically because those subjects fall into only four clusters. Instead of randomizing 319 students, researchers randomized just four schools.
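To put rough numbers on that loss of power, here is a minimal sketch using the standard design effect, 1 + (m - 1) × ICC, where m is the average cluster size and ICC is the intraclass correlation. The ICC values are hypothetical, chosen only for illustration. They are not estimates from LA Sprouts. The formula also assumes clusters of roughly equal size.

```python
def design_effect(mean_cluster_size, icc):
    """Variance inflation from clustering: 1 + (m - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n_individuals, n_clusters, icc):
    """Approximate number of independent observations the clustered sample is worth."""
    mean_cluster_size = n_individuals / n_clusters
    return n_individuals / design_effect(mean_cluster_size, icc)


if __name__ == "__main__":
    # 319 children in 4 schools, as in the study described above.
    # The ICC values are hypothetical, for illustration only.
    for icc in (0.0, 0.01, 0.05, 0.10):
        n_eff = effective_sample_size(319, 4, icc)
        print(f"ICC = {icc:.2f}: effective sample size is about {n_eff:.0f}")
```

Even a seemingly small ICC shrinks the effective sample size sharply when each cluster holds about 80 children. And this still flatters the design, because it says nothing about how few degrees of freedom four clusters leave.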
Many Opportunities for Errors
The errors arising from cluster randomization can range from blatant to subtle. In the most serious form, a study has only two clusters – a test group and a control group. In that case, it’s not even a legitimate cluster-randomized experiment, because any effect of the program can’t be separated from pre-existing differences between the two clusters. So at best, it might be a quasi-experimental design.
A slightly less serious error comes from having multiple clusters, but ignoring the clustering in the analysis. Jerome Cornfield famously described this error in 1978:
“Randomization by cluster accompanied by an analysis appropriate to randomization by individual is an exercise in self-deception.”
Beyond those two egregious errors come more subtle errors. Researchers might acknowledge the clustering in their data but offer a rationale for leaving it out of the analysis. These rationales, when based on something called an ICC, can be quite tricky. Thus, they can lead to incorrect findings.
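For readers who haven’t met it, the ICC (intraclass correlation coefficient) is the share of total variation that lies between clusters rather than within them. A minimal sketch of one common estimator, the one-way ANOVA version, follows. It assumes equal cluster sizes, and the outcome values are made up purely for illustration.

```python
from statistics import mean


def anova_icc(clusters):
    """One-way ANOVA estimate of the ICC for equally sized clusters."""
    k = len(clusters)          # number of clusters
    m = len(clusters[0])       # individuals per cluster
    if any(len(c) != m for c in clusters):
        raise ValueError("this sketch assumes equal cluster sizes")

    grand_mean = mean(x for c in clusters for x in c)
    cluster_means = [mean(c) for c in clusters]

    # Mean square between clusters and mean square within clusters.
    msb = m * sum((cm - grand_mean) ** 2 for cm in cluster_means) / (k - 1)
    msw = sum(
        (x - cm) ** 2 for c, cm in zip(clusters, cluster_means) for x in c
    ) / (k * (m - 1))

    return (msb - msw) / (msb + (m - 1) * msw)


if __name__ == "__main__":
    # Hypothetical outcome values for four small clusters (not real data).
    example = [
        [21.0, 22.5, 20.5, 23.0],
        [24.0, 25.5, 23.5, 26.0],
        [19.0, 20.5, 21.0, 18.5],
        [22.0, 23.5, 24.0, 21.5],
    ]
    print(f"Estimated ICC: {anova_icc(example):.2f}")
```

With only a handful of clusters, an estimate like this is very imprecise. That is one reason an ICC that looks small, or isn’t statistically significant, makes a shaky rationale for ignoring the clustering.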
And then finally, researchers might get everything right in the big picture, conducting a real cluster randomization analysis, but nevertheless make mistakes with statistical details. With cluster randomization, for example, it’s easy to get the degrees of freedom wrong. When you do, it might invalidate the conclusions.
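The degrees of freedom problem is easy to see with simple arithmetic. The sketch below assumes a two-arm trial and uses one common convention for an analysis of cluster-level means, where the degrees of freedom are the number of clusters minus the number of arms. Mixed models use other approximations, so treat this as an illustration rather than a rule.

```python
def df_individual_analysis(n_individuals, n_arms=2):
    """Degrees of freedom if every person is (wrongly) treated as independent."""
    return n_individuals - n_arms


def df_cluster_analysis(n_clusters, n_arms=2):
    """Degrees of freedom when the analysis compares cluster-level means."""
    return n_clusters - n_arms


if __name__ == "__main__":
    # 319 children in 4 schools, two study arms.
    print("Ignoring clustering:", df_individual_analysis(319))
    print("Respecting clustering:", df_cluster_analysis(4))
```

With 2 degrees of freedom, the two-sided 5% critical value of the t distribution is about 4.30, compared with about 1.97 for 317 degrees of freedom. A result that looks clearly significant under the wrong degrees of freedom can fall well short under the right ones.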
Bottom Line: Be Cautious
As we said at the outset, cluster randomization can be quite valuable. And yet, it can be quite tricky. At the very least, when you look at a study where subjects are clustered into groups, pay attention to the clustering. Don’t let a large number of individuals fool you. How many clusters did they study?
And when in doubt, get the opinion of a real expert in cluster randomized studies. It’s easy to trip over an error.
For more perspective, we recommend this paper by Andrew Brown et al. on best practices for cluster randomized studies and this lecture by Michael Oakes. Finally, we gratefully acknowledge generous help in understanding these issues from David Allison, dean of the Indiana University School of Public Health at Bloomington.
Caterpillar Cluster, photograph © Vicki DeLoach / flickr
January 29, 2019
February 01, 2019 at 11:19 am, Katherine Flegal said:
I learned about this in grad school in the context of “pigs in a pen.” If you administer some treatment like a certain type of diet to a pen, all the pigs in that pen are eating the same food and thus if you want to examine the impact of the diet then you have to look at it by pen, not by individual pig.