
Sifting Data to Find Desired Results

“Those among us who are unwilling to expose their ideas to the hazard of refutation do not take part in the scientific game.” Thus wrote Karl Popper in 1934. But these lofty words don’t protect us from the hazard of confirmation bias. It really hurts when a big, expensive trial does not confirm an important idea. So it’s very tempting to sift through the data to find desired results. This is confirmation bias at work.

A new study in the American Journal of Public Health provides a perfect case in point.

The NET-Works RCT

This was a big study – a sample of 534 parent-child pairs followed for three years. The children were toddlers between two and four years of age from ethnically diverse, low-income families. These toddlers were at or above the 50th percentile of BMI for age and gender. It took five years to complete this randomized, controlled study. But many years of work laid the groundwork for it.

The NET-Works intervention applied a social ecological model for obesity prevention. That means it linked homes with primary care and community resources. Families received home visits, parenting classes, and telephone check-ins. Make no mistake, this was an ambitious program and an ambitious study.

The full name says it all: Now Everybody Together for Amazing and Healthful Kids.

The Primary Outcome

The goal was simple and straightforward. Apply this model of education and support for parents of toddlers and test the effect on the toddlers’ BMI. The pre-specified primary outcome was BMI at 24 and 36 months. Would this intervention result in lower BMIs? That was the question.

Unexpected Results

Well, the answer was not what the researchers wanted. After all that work they found that:

Compared with usual care, the NET-Works intervention showed no significant difference in BMI change at 24 or 36 months.

But the authors’ conclusions did not line up with that finding:

In secondary analyses, NET-Works significantly reduced BMI over 3 years among Hispanic children and children with baseline overweight or obesity.

In other words, it didn’t work as planned, but with additional analyses, they did find an effect in a subset of children. These were Hispanic children and children already in the range of overweight or obesity – roughly half of the sample in the study.

This study started as a prevention study in a diverse population of toddlers at risk for weight gain. But the program didn’t work for primary prevention in the whole study population. So researchers sifted through the data and found subsets of children who did well in the treatment group: Hispanic children and children with overweight or obesity.

Learning from Surprises – Or Not

The reason for doing an experiment is to learn. Did the intervention work as intended? This one did not.

But that’s not a failure. It’s an opportunity for learning. Maybe the intervention worked in some subgroup. The secondary analysis found such an effect in a couple of subgroups. Should the authors claim success, based on secondary analysis, as they’ve done in their conclusion?

Writing in the New England Journal of Medicine, Jeffrey Drazen et al offer two bits of wisdom on this question:

A well-designed trial derives its credibility from the inclusion of a prespecified, a priori hypothesis that helps its authors avoid making potentially false positive claims on the basis of an exploratory analysis of the data.

If the primary outcome is negative, positive findings for secondary outcomes are usually considered to be hypothesis-generating.

So what the researchers have here is the basis for new hypotheses. Not for a claim of effectiveness.
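A rough calculation shows why exploratory findings call for this caution. The sketch below assumes a set of independent subgroup tests, each run at the conventional 0.05 significance threshold, with no true effect in any of them. These numbers are illustrative assumptions, not details from the NET-Works paper, and real subgroups usually overlap, so the independence assumption is only an approximation.

```python
def chance_of_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_tests` independent tests
    looks 'significant' at level `alpha` when no real effect exists."""
    return 1.0 - (1.0 - alpha) ** num_tests


# Illustrative counts of subgroup tests (assumed, not taken from the study).
for k in (1, 2, 5, 10):
    print(f"{k:>2} subgroup tests: {chance_of_false_positive(k):.0%} chance of a spurious finding")
```

Under these assumptions, ten looks at the data give roughly a 40% chance of at least one “significant” subgroup by luck alone. That is why such a finding is better treated as a hypothesis to test than as a result.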

Broader Thinking Needed

However, they need to consider a wider range of possibilities. One possibility is that this intervention is not terribly potent. In a companion editorial, Hoelscher et al propose increasing the “dose” and starting earlier – before children are born. Their goal is blindingly clear: “to instill healthy behaviors.”

We recommend thinking far more broadly. Maybe the behaviors they’re seeking to change are as much a result of obesity as they are a cause. Perhaps “instilling” more desirable behaviors can do only so much. In the face of an environment that promotes obesity and the powerful physiology of this disease, maybe we need to do more than just teach coping skills and better behavior.

Maybe we need more complete solutions.

Click here for the study and here for the Hoelscher editorial. For more perspective from Drazen et al on how to think about a negative primary outcome, click here.

Sifted Matcha Powder, photograph © Jigme Datse Rasku / flickr



November 24, 2018

3 Responses to “Sifting Data to Find Desired Results”

  1. November 24, 2018 at 8:09 am, Al Lewis said:

    While I can’t comment on this study, I would note it is very common to look for a posteriori “findings” when the main hypothesis failed. In wellness, one study “found” a reduction in cat scratch fever after a wellness program.

    • November 24, 2018 at 10:13 am, Ted said:

      Good perspective, Al. Thanks!

  2. December 06, 2018 at 4:43 pm, Paul Ernsberger said:

    Regression to the mean strikes again! Effectiveness found in those heavier at baseline.