For the better part of a decade, American public schools have been part of a grand experiment in "choice architecture" dressed up as simple, practical steps to spur healthy eating. But new research reveals the "Smarter Lunchrooms" program is based largely on junk science.
Smarter Lunchrooms, launched in 2010 with funding from the U.S. Department of Agriculture (USDA), is the brainchild of two scientists at Cornell University: Brian Wansink, director of the school's Food and Brand Lab, and David Just, director of the Cornell Center for Behavioral Economics.
The program is full of "common sense," TED-talk-ready, Malcolm Gladwell-esque insights into how school cafeterias can encourage students to select and eat more nutritious foods. Tactics include placing fruit before chips in cafeteria lines; offering pre-sliced rather than whole fruit; making non-fat white milk prominent in beverage displays; giving fruits and vegetables "creative names"; providing "fruit factoids" on cafeteria white-boards; hiding cafeteria cleaning supplies; and having lunchrooms "branded and decorated in a way that reflects the student body."
This "light touch" is the foundation upon which Wansink, a former executive director of the USDA's Center for Nutrition Policy and Promotion and a drafter of U.S. Dietary Guidelines, has earned ample speaking and consulting gigs and media coverage.
Recently, however, scientists have been challenging Wansink's work, pointing out errors and inconsistencies going back decades. For instance...
- The abstract to one of Wansink's published papers claims that schools implementing Smarter Lunchrooms tactics saw a 71 percent increase in apple sales, when the data given in an article table places the actual increase around 4-5 percent.
- One Wansink-led paper purports to address the sustainability of Smarter Lunchrooms interventions—yet uses data from an experiment that lasted just one day.
- One published paper gives three different figures (147, 115, and 113) for the number of participants, which should be the same throughout, and offers no explanation for the discrepancy.
A History of Inconsistency
In a new paper—which has not yet been peer-reviewed or published—Eric Robinson, a professor with the University of Liverpool's Institute of Psychology, Health, & Society, points to problems ranging from simple sloppiness to errors that seriously call into question the integrity of all of Wansink's work.
Earlier this year, Nicholas Brown, a PhD student at the University of Groningen, discovered that much of Wansink's work was lifted directly from Wansink's own previous work without citation or acknowledgement, a practice that's at least frowned upon in academia. And in at least one instance, two Wansink papers that purportedly rely on vastly different data sets yielded almost identical end results, down to the decimal points, but with enough slight differences to rule out simple clerical error.
"If the two studies in question are the same, then it would appear that there has been duplicate publication of the same results, which is not normally considered to be a good ethical practice," writes Brown. "On the other hand, if the two studies are not the same, then the exact match between the vast majority of their results represents quite a surprising coincidence."
In a January paper titled "Statistical heartburn: An attempt to digest four pizza publications from the Cornell Food and Brand Lab," researchers Tim van der Zee, Jordan Anaya, and Nicholas Brown analyze four articles from Wansink and his colleagues, finding "a remarkably high number of apparent errors and inconsistencies." These include:
- many instances in which the reported mean or standard deviation was impossible given the sample size stated in the same table. ("For example, with a sample size of 10 any mean reported to two decimal places must always have a zero in the second decimal place; yet, this table contains means of 2.25 and 3.92 for a sample size of 10.")
- inconsistent sample sizes and other numbers within and across articles that purportedly use the same data
In total, they found approximately 150 inconsistencies in reported statistics from the four papers.
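The sample-size argument in the first bullet can be checked mechanically. The following is a minimal Python sketch, not the critics' actual code: it assumes every underlying response is a whole number (for example, a rating on an integer scale), so the sum of the responses is an integer and the true mean must be that integer divided by the sample size. The function name `mean_is_possible` and its parameters are hypothetical, chosen for illustration.

```python
import math


def mean_is_possible(reported_mean, n, decimals=2):
    """Check whether a reported mean is achievable with n integer responses.

    Assumption: each response is a whole number, so the true mean is
    (some integer total) / n. The reported mean is consistent if at least
    one such value rounds to it at the given number of decimal places.
    """
    # Largest gap allowed between the true mean and its rounded report.
    half_unit = 0.5 * 10 ** (-decimals)
    base_total = math.floor(reported_mean * n)
    # Try the few integer totals closest to reported_mean * n.
    for total in range(base_total - 1, base_total + 3):
        if abs(total / n - reported_mean) <= half_unit + 1e-9:
            return True
    return False


# The two means quoted above, plus one that is achievable with n = 10.
for mean in (2.25, 3.92, 2.30):
    print(mean, mean_is_possible(mean, 10))
```

Under the whole-number assumption, both 2.25 and 3.92 come back as impossible for a sample of 10, while a mean such as 2.30 passes. If responses could take fractional values, the check would not apply.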
Those papers are all based on one data set, from a single field-research experiment involving diners at an Italian lunch buffet. "However, none of the articles mentions that they are based on the same data set as their predecessors, even though they were published over a period of many months," the van der Zee study notes. "We consider that this may constitute a breach of good publication ethics practice."
In a now-deleted blog post, Wansink characterized his Italian-buffet experiment as a "failed study which had null results." But when Wansink's first thesis (which he never reveals) didn't pan out, he went looking for ways to reverse-engineer the data he did have into a thesis. As he wrote in the post:
When [a visiting fellow] arrived, I gave her a data set of a self-funded, failed study which had null results (it was a one month study in an all-you-can-eat Italian restaurant buffet where we had charged some people ½ as much as others). I said, "This cost us a lot of time and our own money to collect. There's got to be something here we can salvage because it's a cool (rich & unique) data set."
I had three ideas for potential Plan B, C, & D directions (since Plan A had failed). . . . Every day she came back with puzzling new results, and every day we would scratch our heads, ask 'Why,' and come up with another way to reanalyze the data with yet another set of plausible hypotheses. Eventually we started discovering solutions that held up regardless of how we pressure-tested them. . . .
Wansink's colleagues—including Andrew Gelman, director of the Applied Statistics Center at Columbia University—quickly called him out on this cavalier attitude toward theory and apparent willingness to hack statistics into something publishable.
"Statistical tests are a messy business, our criteria are not stringent, the samples are small, something is bound to come up as significant if we look hard enough," noted neuroscientist Ana Todorovic. "Now, it's significant whether we looked at it or not–testing the data in many different ways is not the problem. The problem is not reporting all the other variables that were collected and all the other tests that were carried out. Because if we knew this was one result out of, say, 200 tests, then we would be less likely to give it much credence."
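Todorovic's point about 200 tests can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that every test uses the conventional 0.05 significance threshold, that the tests are independent, and that no real effect exists. None of these figures comes from Wansink's data, and the function names are hypothetical.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """Probability that at least one of num_tests independent tests comes up
    'significant' by chance alone, assuming no real effect exists."""
    return 1.0 - (1.0 - alpha) ** num_tests


def expected_false_positives(num_tests, alpha=0.05):
    """Average number of chance 'significant' results across num_tests tests."""
    return num_tests * alpha


for num_tests in (1, 20, 200):
    print(
        num_tests,
        round(chance_of_false_positive(num_tests), 4),
        expected_false_positives(num_tests),
    )
```

Under those assumptions, 200 tests would be expected to produce about 10 "significant" results by chance alone, and the odds of seeing at least one are close to certain. That is why a single reported finding means little without knowing how many other tests were run.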
Gelman called Wansink's work "run-of-the-mill, everyday, bread-and-butter junk science."
As Jesse Singal at New York asked in March: "What possible reason is there to trust [the Cornell Food and Brand Lab's] output at all, let alone for journalists to continue to publicize its findings?"
Or, we might add, for the federal government to continue funding its implementation?
Far from being a well-tested approach, the Smarter Lunchrooms program was implemented in schools across America before the first randomized controlled trials on its effectiveness had even begun. "The speed at which this intervention approach has been implemented in schools surprised me because of the limited and low quality evidence base supporting it," wrote Robinson in the most recent comprehensive critique of Wansink's work.
The first serious study testing the program's effectiveness was published just this year. At the end of nine weeks, students in Smarter Lunchroom cafeterias consumed an average of 0.10 more fruit units per day—the equivalent of about one or two bites of an apple. Wansink and company called it a "significant" increase in fruit consumption.
But "whether this increase is meaningful and has real world benefit is questionable," Robinson* writes.
Nonetheless, the USDA claims that the "strategies that the Smarter Lunchrooms Movement endorses have been studied and proven effective in a variety of schools across the nation." More than 29,000 U.S. public schools now employ Smarter Lunchrooms strategies, and the number of school food service directors trained on these tactics increased threefold in 2015 over the year before.
One study touted by the USDA even notes that since food service directors who belong to professional membership associations were more likely to know about the Smarter Lunchrooms program, policy makers and school districts should "consider allocating funds to encourage [directors] to engage more fully in professional association meetings and activities."
Earlier this year, Wansink responded to criticism by promising that he and his colleagues would review the data involved in questionable studies and make it publicly available. But this has not happened yet, and the blog post where Wansink made this offer has since been deleted.
A spokesman for the USDA told The Washington Post that while they had some concerns about the research coming out of Cornell, "it's important to remember that Smarter Lunchrooms strategies are based upon widely researched principles of behavioral economics, as well as a strong body of practice that supports their ongoing use."
Some might wonder what it hurts to put fruit in front of chips in school-cafeteria lines or ask lunchroom staff to consider aesthetics. For libertarians, one answer is obvious: we are shelling out all sorts of money and government resources to incentivize activity with no measurable benefit (all while creating new administrative positions and layers of bureaucracy to oversee these efforts).
Not only is there scant evidence to support the pricey program, the "Smarter Lunchrooms" boondoggle also distracts from figuring out how to really reform school cafeterias—something that folks on all sides should find concerning.
We might disagree on whether federal authorities should micromanage lunchroom menus or whether local school districts should have more control, and what dietary principles they should follow; whether the emphasis of school cafeterias should be fundraising or nutrition; or whether school meals need more funding. But confronting these challenges head-on is a hell of a lot better than a tepid consensus for feel-good fairytales about banana placement.
*Correction: Eric Robinson, not Anderson, is the author quoted here.
Photo Credit: USDA/Facebook