5 Questions: Alice Popejoy on race, ethnicity and ancestry in science

Alice Popejoy, a postdoctoral scholar who studies biomedical data sciences, speaks to the role — and pitfalls — of race, ethnicity and ancestry in research.

In clinical research, scientists often invoke race, ethnicity and ancestry to better understand underlying factors that contribute to disease, even when the connection is not quite clear. This approach is prevalent in clinical genetics, a field of study that harnesses genetic testing to understand aspects of a patient’s personal health. But while race- or ancestry-based information can play an important role in health research — such as ensuring a particular clinical study represents diverse populations — its use in science can be misguided, said Alice Popejoy, PhD, a postdoctoral scholar who studies the intersection of public health and genetics. Including race, ethnicity or ancestry in a scientific study can produce misleading results that present sociocultural factors, such as race, as a biological cause of certain diseases — when, in fact, environmental factors or actual biology, such as genetic mutations, may underlie the disease.

To better understand the use of race, ethnicity and ancestry in clinical genetics, Popejoy and her mentors Carlos Bustamante, PhD, professor of biomedical data science and of genetics, and Kelly Ormond, MS, professor of genetics, led a team of scientists to conduct a nationwide survey asking clinical genetics professionals and researchers about the importance of race, ethnicity and ancestry in their work, including questions about how they define the terms and use the concepts as variables in their research or clinical practice.

The study was published June 2 in the American Journal of Human Genetics.

In a conversation with science writer Hanae Armitage, Popejoy discussed the role of race, ethnicity and ancestry in research.

1. What is the difference between race, ethnicity and ancestry?

Popejoy: There really aren’t universal definitions of race, ethnicity and ancestry, which is likely part of the reason there’s so much confusion about what they mean and how they overlap, especially in science. Race, for instance, is more often used in a sociopolitical context — as a construct that’s broadly tied to societal hierarchies of power. 

Often with ethnicity, people tend to use the term in a more cultural or community-based context. Sometimes people will use race and ethnicity interchangeably, as some may feel more comfortable saying ethnicity because race can invoke discomfort related to racism. Ancestry is an interesting one, because people have really personal opinions about what ancestry means to them. For some, the term ancestry might invoke feelings about culture or heritage, and may even have spiritual undertones. But when I talk about ancestry, it’s in terms of genetics, and it’s derived from genomic analyses. We’re talking about the fact that we all inherit pieces of DNA, little bits of genetic material, from our parents who inherited it from their parents, and so on. 

2. Is there a role for race, ethnicity and ancestry in science, and if so, what is it?

Popejoy: That’s one of the most common questions I get asked in my field. If race and ethnicity are social or cultural constructs, why include them in our research at all, when genetic ancestry seems so much more scientific? And the answer is that unfortunately, in a society plagued by systemic and institutionalized racism, sociocultural identity has a very real impact on health, as not everyone has equal access to nutrition, education and health care. So, race is not biological, but it does have real biological effects in an unequal society.

In the case of ancestry, it’s a little more ambiguous; we’re still trying to figure out the most accurate and useful applications of genetic ancestry in research and medicine. In genomics, ancestry is often used to account for clusters of individuals in a sample population who are more closely related to one another, and their shared genomic background could impact the results of a genetic-association study. In clinical genetics, a person’s ancestral background is also important for tracking the representation of groups from different genetic ancestries in population databases; adequate representation of these groups ensures the results of a genetic test are interpreted appropriately. Let’s say you get a genetic test, the results come back and the doctor says, “I’m sorry. We see a genetic variant of unknown or uncertain significance, and we don’t know how to interpret this for you.” One of the reasons they might not be able to interpret that variant is that it’s never been seen in any of the databases, meaning the patient may be of a certain ancestral background that is not well represented. We don’t know whether it’s missing from the database because it’s really rare and likely to be causing the patient’s condition, or if it is somewhat common and benign, or not disease-causing, in that person’s ancestral population, but the population has not been adequately sampled, as most of the genetic research that’s been done has been in people of European ancestry. 

Alice Popejoy is the lead author of a study on how clinical genetics professionals and researchers consider the concepts of race, ethnicity and ancestry in their work.

3. What are some of the pitfalls of using race, ethnicity and ancestry in clinical genomics, or scientific research overall?

Popejoy: I recently wrote a review of diversity in populations represented in the literature on pharmacogenomics, that is, using genetics to individualize a therapeutic approach. Among the studies that described the demographics of their study population (most did not, which is also problematic), the vast majority used old-school, binary terms that compared Black people with white people, even if masked by labels such as European American versus African American or European descent versus African descent. When we design studies based on hypotheses steeped in the Black-white narrative that is as old as this country, it does a disservice to Black individuals, and to science as a whole, by reifying biological racism. Designing a study that sets out to find some genetic difference between Black and white people reinforces the false idea that there are innate genetic differences between racial groups (other than a sliver of genes that code for pigmentation and hair texture) that convey health benefits to people of one group but not another. For example, it’s not just African Americans or even people of African ancestry who are at risk of developing sickle cell anemia. People with ancestry from other parts of the world where malaria has been endemic, such as India, the Middle East and the Mediterranean, also have an elevated risk of developing the disease. Too often, scientists and the general public oversimplify these nuances by trying to fit people into boxes.

What’s most important is that researchers understand exactly why it is they’re using race, ethnicity or ancestry in their study. If they’re using something like race as a variable to inform ancestral population structure, that isn’t going to yield accurate results because it’s equating something that’s discrete and based in a social-cultural realm with something that’s genetic and exists on a spectrum of diversity. 

4. What did your study reveal about the understanding of race, ethnicity and ancestry in the context of research?

Popejoy: We wanted to conduct a survey to understand where the field is with respect to people’s understanding of race, ethnicity and ancestry, and how those variables are used in research and clinical genetics. We issued the survey to almost 450 clinical genetics professionals and researchers around the world, and we found that, by and large, there is no consistent understanding of the differences between race, ethnicity and ancestry. The one exception is that scientists generally agreed that ancestry was “well” or “very well” described by the term “genetic lineage group.” What was really interesting, though, was the fact that across the board people did not have a consistent understanding of what the terms meant or how to distinguish among them, yet the majority felt that all three were at least somewhat important for their work. Our findings revealed there is much more work to be done in terms of research on the utility of diversity measures, and in terms of educating and informing the research and clinical genetics professions about the use of race, ethnicity and ancestry.

5. Where do we go from here?

Popejoy: As a community of biomedical researchers and health care providers, we need to recognize and understand the historic practices and biases, inherited from the foundations of our fields, that misuse race, ethnicity or ancestry. Doing so is necessary to ensure both scientific accuracy and clinical validity, and to make sure that our work does not reinforce and perpetuate racism. We must think critically about why we’re incorporating race, ethnicity or ancestry into a study and ask what benefit it provides and whether that variable is correctly aligned with the goals of the study.
