The Volokh Conspiracy

Mostly law professors | Sometimes contrarian | Often libertarian | Always independent

On Genetic Diversity and Vaccine Studies

Genetic diversity in medical research may be a good thing; using American racial categories is not a good way to achieve that diversity.


Last week, I expressed my outrage that Moderna had delayed its Covid-19 vaccine trials to ensure that it had "enough" diversity among its vaccine subjects, a concern I deemed unscientific, and a delay that cost thousands of lives. I got some pushback to the effect that there is evidence that people of different genetic backgrounds sometimes have different reactions to vaccines, and thus genetic diversity in research subjects is important.

Let's assume arguendo that such differences have enough medical significance sufficiently often that it is worthwhile to spend resources to ensure genetic diversity in vaccine research subjects (i.e., the volunteers who receive the vaccines during the research phase). The problem is that Moderna (and other researchers), following FDA regulations, is not looking for genetic diversity, but for diversity as defined by U.S. statistical categories: white, Asian, African American/black, Native American, and Hispanic. While regulations ensuring representation from each category may increase the genetic diversity of research subjects, they are very poorly designed to do so.

I noted in my previous post that "Asian American" as used in government statistics is an incoherent category, encompassing everyone from Caucasian Indians to East Asians to Austronesian Filipinos. Similarly, Hispanic is a linguistic/cultural category, not a genetic one; African American encompasses everyone from Ethiopians, who are more closely related genetically to Jews than to sub-Saharan Africans, to those sub-Saharans, and includes people with substantial Native American and European heritage; and white includes people with origins everywhere from Ireland to Yemen and Afghanistan.

So let's say a pharmaceutical company is looking to meet FDA guidelines, and also satisfy activists, by ensuring representation from American statistical groups. The company has its subsidiary in Ireland do its "white" testing locally. But wait! DNA researchers conclude that people from the Middle East can be divided into as many as four separate populations, and Europeans divided into as many as eight separate populations. Using Irish subjects provides "whites," but not a genetically diverse group of whites.

The company also hires the Mayo Clinic to find African-American subjects. Mayo, based in Rochester, Minnesota, sends its team to Minneapolis to recruit, and finds the right number of subjects. But most African Americans in Minneapolis are Somalis. Not only does this not reflect the broader population of Africans and African Americans, but Somalis are genetically more similar to people in Saudi Arabia than they are to people from other parts of Africa. So Mayo has added "diversity" to the study, but not much genetic diversity, and none that would pick up, say, whether people of West African origin have unique reactions to the vaccine.

Our hypothetical pharmaceutical company also has a subsidiary in Mumbai, India, which recruits "Asian" subjects for the study locally. But Indians are Caucasians, and have no particular genetic relationship with East Asians, who are themselves genetically diverse.

Finally, our company is based in Miami, and recruits its "Hispanic" subjects from the local Cuban-American population. Cubans have the highest percentage of European ancestry of all major American Hispanic groups, and the lowest amount of indigenous ancestry. Are they genetically representative of the Hispanic population? Only to the extent that they are mostly of Spanish origin, and Spain is the dominant genetic contributor to the overall Hispanic population. [Please note: In my previous post, I cited a study suggesting that the average American Hispanic is about 75% European in ancestry. I have been directed to other studies concluding that the percentage is more like 55%.]

Using these subjects would totally satisfy FDA "diversity in medical research" rules, even though they in fact provide limited genetic diversity, and fail to represent large swathes of both the American and general world population. The company would likely have achieved greater genetic diversity in its subjects just by trying to get a random sample of volunteers from the broad American population. So if genetic diversity is indeed important, FDA regulations should be rewritten to require genetic diversity, rather than using American racial categories as a very crude proxy.

That said, the FDA regulations did not come about because of scientific concerns about achieving genetic diversity in medical research, but from lobbying by activists who wanted to achieve "representation" in medical research. As Steven Epstein explains in his excellent book, Inclusion: The Politics of Difference in Medical Research, the push by activists for "diversity" in medical research had the following goals:

(1) Statistical representation: Groups should be included in medical research according to their approximate percentages of the U.S. population. [Note there is no good scientific reason for this to be proportional, as opposed to, say, equal numbers from each group.]

(2) Social visibility: Both researchers and research subjects should reflect American "diversity."

(3) Political voice: Researchers should ensure their research is addressing the needs of less-advantaged groups.

(4) Symbolic representation: Medical researchers must be allies in ensuring that society understands the medically-related problems facing various social groups.

Note that none of these goals are "ensuring that studies are scientifically valid," nor has the FDA ever really tried to explain why using American statistical categories is a sound scientific way to identify research subjects. For that matter, the FDA does not distinguish between clinical studies on disease and the effectiveness and safety of medications and vaccines, where American racial categories have little salience, and studies on sociological factors affecting public health, where they sometimes have salience (if used with proper caution). Faced with pressure to require attention to race and ethnicity in medical and scientific research, the FDA took the path of least resistance and simply required researchers to universally adopt the unscientific categories used by HHS and the rest of the government. (The FDA does allow, but does not require, companies to undertake more granular analysis.)

These rules make medical research a bit more expensive, which is itself a harm. In the Moderna case, when time was truly of the essence, they cost many lives for no scientific gain. More generally, as an editorial in Nature Genetics argued in 2004, the "use of race as a proxy is inhibiting scientists from doing their job of separating and identifying the real environmental and genetic causes of disease."

Twenty years ago, scientists predicted the use of race would subside in favor of much more scientifically salient DNA-based studies; instead, thanks in large part to the FDA (and NIH, which has similar regulations), "race" is more prevalent than ever. And many scientists believe that rather than advancing medical research, this use of race leads the medical profession to "(mis)attribute causality and thereby (mis)identify health care needs." George T.H. Ellison, et al., Racial Categories in Medicine: A Failure of Evidence-Based Practice?, PLoS Medicine, Sept. 2007, e287, at 1434, 35.