Workplace Wellness Programs Don’t Work Well. Why Some Studies Show Otherwise.

The gold standard of medical research, the randomized controlled trial, has been taking a bit of a beating lately.

An entire issue of the journal Social Science and Medicine was recently devoted to it, with many articles pointing to shortcomings. Others have argued that randomized controlled trials often can’t address the questions that patients and physicians most want answered. I recently wrote about the limitations of the method in studying effectiveness, which is what we care about in real-world situations.

But the randomized controlled trial remains a powerful tool. It’s still, perhaps, the best method for conducting explanatory research. In past articles, I have recounted numerous times when hypotheses from observational studies, those based solely on observations of particular groups, have failed to be confirmed by a controlled trial.

Perhaps the greatest strength of the randomized controlled trial is in combating what’s known as selection bias. That occurs when the groups being studied (intervention and control) already differ significantly because of how they were “selected” to be in the intervention or not. One of the most elegant examples of why we need such trials came recently in an examination of employer-sponsored wellness programs.

These programs are usually offered by employers to make their employees healthier. They can offer screening for a variety of reversible conditions; access to weight-loss programs or gyms; encouragement and support; and sometimes even chronic disease management. Many of the analyses of these programs have shown positive results.

Almost all of those analyses are observational, though. They look at programs in a company and compare people who participate with those who don’t. When those who participate do better, we tend to think that wellness programs are associated with better outcomes. Some of us start to believe they’re causing better outcomes.

The most common concern with such studies is that those who participate are different from those who don’t in ways unrelated to the program itself. Maybe those people participating were already healthier. Maybe they were richer, or didn’t drink too much, or were younger. All of these things could bias the study in some way.

The best of these observational studies try to control for these variables. Even so, we can never be sure that there aren’t unmeasured factors, known as confounders, that are changing the results.

Confounding is a classic problem of selection bias, so perhaps the best way to be sure that the two groups aren’t different is to take selection out of people’s hands by randomly assigning participation.
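To see why random assignment helps, here is a minimal simulation sketch. Everything in it is invented for illustration: the hidden "motivation" trait, the gym-visit numbers and the function name simulate are assumptions, not figures or methods from any study. The program is given no effect at all, yet people who choose to join still look better than those who don't, while a coin-flip assignment shows almost no gap.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Compare a self-selected comparison with a randomized one.

    Hypothetical setup: a hidden trait ("motivation") raises both gym use
    and the chance of joining the program. The program itself does nothing.
    """
    rng = random.Random(seed)
    joined, not_joined = [], []      # groups formed by self-selection
    assigned, control = [], []       # groups formed by coin flip

    for _ in range(n):
        motivation = rng.gauss(0, 1)  # unmeasured confounder
        # Gym visits depend only on motivation and noise, never on the program.
        gym_visits = max(0.0, 6 + 3 * motivation + rng.gauss(0, 2))

        # Self-selection: motivated people are more likely to join.
        if motivation + rng.gauss(0, 1) > 0:
            joined.append(gym_visits)
        else:
            not_joined.append(gym_visits)

        # Random assignment: membership is unrelated to motivation.
        if rng.random() < 0.5:
            assigned.append(gym_visits)
        else:
            control.append(gym_visits)

    return {
        "observational_gap": mean(joined) - mean(not_joined),
        "randomized_gap": mean(assigned) - mean(control),
    }


if __name__ == "__main__":
    for name, gap in simulate().items():
        print(f"{name}: {gap:+.2f} gym visits per year")
```

In this toy setup the observational gap is large and the randomized gap hovers near zero, even though both are computed from the same people with the same outcomes. The only thing that changed is who decided group membership.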

But that’s hard to do with a wellness program, especially since they are put into place for the entire company. It’s expensive and requires a lot of time, and there’s no clear funder who might want to sponsor it. That is, until recently.

This year, researchers published results from the Illinois Workplace Wellness Study, a large randomized controlled trial of a wellness program at the University of Illinois at Urbana-Champaign. Almost 5,000 employees volunteered to participate in the study.

More than 1,500 of these were randomly put in the control group, which basically received no services. About 3,300 were invited to receive a biometric health screening and an online health risk assessment. They were then offered a number of wellness activities, including classes on weight loss, exercise, tai chi, smoking cessation, financial wellness and more. They were even offered financial incentives of various amounts for completing screenings and participating in activities.

Of course, not everyone given an opportunity to engage in such activities will take it, but more than half of those offered the program participated. The researchers followed everyone, in both the control and intervention groups, for a year to see how the program affected their activities, their health, their productivity and their medical spending.

The results were disappointing. There seemed to be no causal effects.

Here’s the nerdy fun part, though. In addition to this analysis, the researchers also took the time to analyze the data as if it were an observational trial. In other words, they took the 3,300 who were offered the wellness program, then analyzed them the way a typical observational trial would, comparing those who participated with those who didn’t.

The results were very different from those of the controlled trial.

If we look only at the intervention group as an observational trial, it appears that people who didn’t make use of the program went to the campus gym 3.8 times per year, and those who participated in it went 7.4 times per year. Based on that, the program appears to be a success. But when the intervention group is compared with the control group as a randomized controlled trial, the differences disappear. Those in the control group went 5.9 times per year, and those in the intervention group went 5.8 times per year.

Researchers looked at whether people participated in a race, like a marathon, a 10-kilometer run or a five-kilometer run. The observational analysis, comparing nonparticipants with participants, showed a significant difference in running: 3.3 percent of people versus 9.2 percent. The randomized controlled trial, on the other hand, found 6.5 percent versus 6 percent.

Wellness programs sometimes claim to save money by reducing health care spending. The observational analysis supports this belief. It found that participants spent significantly less than nonparticipants on health care ($525 versus $657) and on hospital-related costs ($273 versus $387). The randomized controlled trial showed that the wellness program had little effect on spending compared with the control group in both overall spending ($576 versus $568) and hospital spending ($317 versus $297).

The researchers even looked at the percentage of people who left their job for any reason. In the observational analysis, 15.4 percent of nonparticipants did so compared with only 7.2 percent of participants. It appears from such an analysis that wellness programs are associated with retaining workers. But the randomized controlled trial showed that no such causal link exists, as 12 percent of the control group exited the job, compared with 10.8 percent of the intervention group.
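The two sets of numbers fit together arithmetically. The intervention group is just participants plus nonparticipants, so its average should be a weighted average of theirs, with the weight equal to the share who participated. A rough back-of-the-envelope sketch, assuming the rounded figures quoted above all come from the same people (the helper name implied_participation_share is made up here, and the study's own samples may differ by outcome):

```python
def implied_participation_share(nonparticipants, participants, intervention_group):
    """Solve  intervention = share * participants + (1 - share) * nonparticipants
    for the participation share."""
    return (intervention_group - nonparticipants) / (participants - nonparticipants)


# (nonparticipant average, participant average, whole intervention group average),
# taken from the rounded figures quoted in the article.
figures = {
    "gym visits per year": (3.8, 7.4, 5.8),
    "percent leaving job": (15.4, 7.2, 10.8),
}

shares = {
    name: implied_participation_share(*values) for name, values in figures.items()
}

for name, share in shares.items():
    print(f"{name}: implied participation share = {share:.2f}")
# prints 0.56 for both outcomes
```

Both outcomes imply that a bit more than half of the intervention group took part, which matches the participation described earlier. The participants and nonparticipants average out to roughly what the control group did anyway; the program mostly sorted people rather than changed them.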

Why such stark differences? “The most likely explanation is that participants differ from nonparticipants in very important ways,” said Julian Reif, one of the study’s principal investigators. “Therefore, when a wellness program is offered, the differences seen between those who take advantage of it and those who don’t are due to differences in the people rather than differences from the program.”

Often, the best we can do for an observational trial is to try to adjust — control, researchers say — for variables we can measure and that might also affect the results. These researchers did. In one analysis, they controlled for sex, age, race, salary and status as faculty or staff. They still found that the results of the observational analysis were significant for all the outcomes discussed above. In an even more heavily controlled analysis, they used machine learning to decide whether to control for even more variables, including (but not limited to) past health, smoking and drinking status; pre-intervention exercise; medication use; and sick days taken.

The differences were unchanged.
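Why might adjusting for measured variables fail to close the gap? A small sketch under invented assumptions: suppose spending depends on a measured variable (an age band) and on an unmeasured one (motivation), and the program has no effect. Comparing participants with nonparticipants within each age band is a simple form of controlling for age. It is not the researchers' method, and all names and numbers here are hypothetical.

```python
import random
from statistics import mean


def adjusted_gap(n=40000, seed=1):
    """Average participant-minus-nonparticipant spending gap within age bands.

    Hypothetical setup: age band is measured, motivation is not, and the
    program has no effect on spending.
    """
    rng = random.Random(seed)
    # age band -> (participant spending, nonparticipant spending)
    strata = {band: ([], []) for band in range(4)}

    for _ in range(n):
        age_band = rng.randrange(4)       # measured covariate
        motivation = rng.gauss(0, 1)      # unmeasured confounder
        joins = motivation - 0.3 * age_band + rng.gauss(0, 1) > 0
        # Spending rises with age and falls with motivation; no program term.
        spending = 600 + 40 * age_band - 60 * motivation + rng.gauss(0, 50)
        strata[age_band][0 if joins else 1].append(spending)

    # Compare like with like on age, then take a simple unweighted average
    # of the within-band gaps.
    gaps = [mean(joined) - mean(not_joined) for joined, not_joined in strata.values()]
    return mean(gaps)


if __name__ == "__main__":
    print(f"age-adjusted spending gap: {adjusted_gap():+.0f} dollars")
```

Even after holding age fixed, participants in this toy world appear to spend noticeably less, because the adjustment cannot touch the trait nobody measured. Adding more measured variables helps only to the extent that they capture what actually drives people to sign up.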

“If we had published only these observational analyses, the headline result could have been that even after controlling for a battery of confounding variables, participation in a wellness program was associated with a significant reduction in health care spending, an improvement in exercise, and a lower chance of ceasing employment,” said David Molitor, another principal investigator of the study.

All of that would have led us astray. The wellness program appeared to cause none of those things.

This doesn’t mean that we should never believe the results of observational trials. Many show significant results that stand the test of time. But in every case, we need to weigh the possibility that the people who were studied were selected in a way that made them very different from those who weren’t selected, or from the general population. This appears to be the case with those who participate in a wellness program.

Of course, not everything can be studied by controlled trials. Sometimes they are unethical or too expensive. Or the causal effects are too far downstream from the intervention to detect within the length of the study. “It’s important to note our results so far only cover the first year of the program,” said Damon Jones, another investigator of the study. “There is a chance we may see different patterns over the longer term, as we continue to collect data.”

Observational research can be the best way to study population-level effects. In those cases, many researchers are exploring new techniques, like the use of regression discontinuity, a technique to create quasi-randomized groups for analysis.

Still, randomized controlled trials remain much more reliable and useful than other types of studies. In our eagerness to point out their flaws, we shouldn’t overlook their benefits.

This article originally appeared on The Upshot (copyright 2018, The New York Times Company).

Aaron E. Carroll, MD, is a healthcare speaker and professor of pediatrics at Indiana University School of Medicine who blogs on health research and policy at The Incidental Economist and makes videos at Healthcare Triage.