Torturing Observational Data to Get a Confession – A Case Study
Sometime in the 1960s, economist Ronald Coase, a Nobel laureate, advised colleagues that torturing a set of data can always yield a confession to serve the purpose at hand. As if to prove this adage, a new publication in the Journal of Clinical Epidemiology shows us 1,208 ways to analyze NHANES data on all-cause mortality and red meat. Yumin Wang and colleagues obtain quite a range of results.
At one extreme, they found an association of red meat consumption with 49% lower mortality risk. The other extreme was a 75% higher risk. About a third of the analyses yielded estimates of higher risk with red meat. Two thirds suggested that eating red meat predicts a lower risk of death.
A Caution
If you think these analyses tell us anything definitive about red meat and the risk of death, think again. That was not the point of this work, as the authors explain:
“We acknowledge that NHANES data are likely suboptimal compared to other nutrition datasets for investigating the effect of red meat and other nutritional exposures on health outcomes, due to it including few deaths and only collecting data on diet at a single point in time. Our objective, however, is not to provide answers about the health effects of red meat but to demonstrate a proof-of-concept application of specification curve analysis to nutritional epidemiology.”
So, the real point is to show there are many different ways to analyze an observational data set. This analytical flexibility deserves attention, because it can have a big effect on findings. Torturing data to extract a confession does nothing to advance science. Cardiologist John Mandrola sums it up quite well:
“Humility and the embrace of uncertainty is the best approach to observational science. This study strongly supports that contention.”
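For readers who want to see how analytical flexibility plays out, here is a minimal sketch of the idea behind a specification curve. It is not the method or code of Wang and colleagues. Everything in it is an assumption made for illustration: the data are synthetic, the variable names (meat, risk, age, income, smoking) are hypothetical, and ordinary least squares on a continuous risk score stands in for the survival models a real mortality analysis would need. The synthetic data are built so that meat has no true effect on risk, yet the estimate changes sign depending on which covariates the analyst chooses to adjust for.

```python
import itertools
import random


def ols_coefficients(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    n_cols = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(n_cols)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(n_cols)
    ]
    for col in range(n_cols):
        # Partial pivoting for numerical stability
        pivot = max(range(col, n_cols), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n_cols):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n_cols] / A[i][i] for i in range(n_cols)]


def specification_curve(data, outcome, exposure, covariates):
    """Fit one model per subset of covariates and return the exposure
    estimates sorted from lowest to highest."""
    y = [row[outcome] for row in data]
    results = []
    for k in range(len(covariates) + 1):
        for subset in itertools.combinations(covariates, k):
            # Columns: intercept, exposure, then the chosen covariates
            X = [[1.0, row[exposure]] + [row[c] for c in subset] for row in data]
            beta = ols_coefficients(X, y)
            results.append((beta[1], subset))
    return sorted(results, key=lambda item: item[0])


# Synthetic data (hypothetical): age and income both influence meat intake
# and risk, but meat itself has NO direct effect on risk.
rng = random.Random(0)
data = []
for _ in range(1000):
    age = rng.gauss(0, 1)
    income = rng.gauss(0, 1)
    smoking = rng.gauss(0, 1)
    meat = 0.5 * age + 0.5 * income + rng.gauss(0, 1)
    risk = 0.6 * age - 0.8 * income + 0.3 * smoking + rng.gauss(0, 1)
    data.append(
        {"age": age, "income": income, "smoking": smoking, "meat": meat, "risk": risk}
    )

curve = specification_curve(data, "risk", "meat", ["age", "income", "smoking"])
for estimate, adjusted_for in curve:
    label = ", ".join(adjusted_for) if adjusted_for else "nothing"
    print(f"{estimate:+.3f}  adjusted for: {label}")
```

With only three optional covariates there are already eight defensible specifications, and in this toy setup they disagree about the direction of the association. Reporting the whole sorted list, rather than one chosen model, is the spirit of a specification curve.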
Click here for the study by Wang et al. and here for Mandrola’s insights.
The Red Cow, painting by Paul Gauguin / WikiArt
May 6, 2024
May 06, 2024 at 6:16 am, Joe Gitchell said:
Thank you, Ted–this is certainly illustrative.
And to the point of embracing uncertainty, I have only good things to say about this limited series podcast from Scientific American and hosted by Christie Aschwanden:
https://www.scientificamerican.com/podcast/episode/how-do-we-know-anything-for-certain/
Joe
May 06, 2024 at 8:02 am, David Brown said:
The Grilling the Data article says, “Randomized trials represent the optimal design for investigating the health effects of medical interventions. They pose important challenges, however, when it comes to studying the health effects of food and nutrition.”
Yet the American Heart Association says, “It is a well-documented observation that dietary saturated fats raise plasma LDL-cholesterol levels. However, to reduce cardiovascular risk, it is not enough to simply reduce intake of saturated fats, as this Advisory convincingly puts forward with their exhaustive literature review. The greatest cardiovascular benefits are elicited when dietary saturated fats are replaced with polyunsaturated fats.” https://professional.heart.org/en/science-news/dietary-fats-and-cardiovascular-disease/Commentary
Exhaustive literature reviews do not translate into certainty.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5492028/
May 06, 2024 at 11:51 am, Richard Atkinson said:
Great column Ted. And we wonder why many papers can’t be replicated. It appears that we have too much statistics!