A Stanford study shows that using genomes from a diverse pool of people improves the ability to predict an individual’s risk of having high cholesterol.
December 8, 2021 - By Tracie White
Including data from people of diverse ancestries substantially improves certain genetic risk predictions in all populations, according to a study that examined the genes responsible for regulating blood cholesterol levels.
Researchers have noted that a “polygenic risk score,” a DNA-based tool that enhances our ability to predict the risk of common diseases, works well only for people of northern European ancestry, primarily because that’s where much of the data has been collected.
“The biggest weakness of using these risk scores in the clinic is that they have the potential to exacerbate health inequities because they perform much better in whites and perform the worst in African Americans,” said Themistocles “Tim” Assimes, MD, PhD, an associate professor of cardiovascular medicine at Stanford University School of Medicine.
Assimes is a co-senior author of the study with Cristen Willer, PhD, a professor of internal medicine, human genetics, and computational medicine and bioinformatics at the University of Michigan. They and a team of more than 600 co-authors published their findings in Nature Dec. 8.
“We find that diversifying the populations under study, rather than simply increasing sample size, is the most effective approach to creating polygenic risk scores that work equally well in predicting high cholesterol in all populations,” said Assimes, who is also a co-director of VA Palo Alto Epidemiology and Informatics Center for Genomics. “We hope that the same is observed for other health traits, but this remains to be seen.”
The study, which examined the genetic variants associated with cholesterol levels in more than 1.65 million people, also showed that including population samples from a wide variety of ancestry groups helped researchers find the critical gene variants involved with bad cholesterol more quickly.
“The message couldn’t be clearer: To have a fuller understanding of the effects of genomic variation on disease, we simply must include as many diverse groups as possible,” said Charles Rotimi, PhD, a co-author on the paper and scientific director of the National Human Genome Research Institute.
The researchers combined a huge amount of data through a meta-analysis organized by the Global Lipids Genetics Consortium, which brought together genome-wide association data from more than 200 primary studies around the globe. Those analyses were performed by the many co-authors of the study.
The Million Veteran Program, one of the world’s most diverse biobanks with approximately 850,000 U.S. veterans, was instrumental in increasing the diversity of the study, said Assimes, a senior investigator for the program. About one-quarter of the veterans are either Black or Hispanic.
Needle in a haystack
Determining disease risk is complicated. Diseases caused by single-gene mutations, such as cystic fibrosis or sickle cell anemia, are the exception. For most common illnesses, genetics, environmental exposures and lifestyle factors all come into play. The role of genetics in causing most diseases, like coronary heart disease, usually hovers around 50% and is not linked to one mutation but to hundreds and likely thousands of them, each with subtle effects on risk, Assimes said. Finding all those mutations isn’t an easy job.
In general, the bigger the sample size for narrowing down the hunt for the right variants, the better. That has been the philosophy behind years of genome-wide association studies that have identified thousands of variants for a variety of diseases.
For this study, researchers accumulated data from 201 previous genome-wide association studies of lipids, fat-like substances found in the blood and body tissues that come in two major forms: cholesterol and triglycerides. Elevated lipid levels are risk factors for many heart conditions.
Their sample included about 1.65 million individuals from five ancestral groups: African, East Asian, European, Hispanic and South Asian. About 1.3 million of the genomes were from people with European ancestry, with the remaining 350,000 from non-European ancestries.
Working with an expanded database allowed the researchers to more accurately pinpoint the genomic variants that are strongly linked to blood lipid levels. This is critical not only for predicting levels of blood cholesterol using polygenic risk scores but also for understanding the molecular causes of atherosclerotic heart disease and locating new drug targets, Assimes said.
The researchers created risk scores for high cholesterol using data from each of the different ancestral groups separately, then all together. Results showed the risk score that included the diverse genomic data was equally predictive, for all ancestries, of whether a person might one day have high cholesterol levels.
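As a rough illustration of what a polygenic risk score computes, the sketch below adds up a person's risk-allele counts, each weighted by an effect size. It is a minimal sketch, not the method used in the study: the variant names, effect sizes and genotype are invented for illustration, and treating a missing variant as zero copies is a simplifying assumption.

```python
# Hypothetical per-variant effect sizes (weights), of the kind a genome-wide
# association study might estimate. These values are invented for illustration.
EFFECT_SIZES = {"variant_a": 0.12, "variant_b": -0.05, "variant_c": 0.30}


def polygenic_risk_score(genotype, effect_sizes):
    """Return the sum of (risk-allele count x effect size) over all scored variants."""
    score = 0.0
    for variant, weight in effect_sizes.items():
        # Assumption: a variant missing from the genotype counts as zero copies.
        allele_count = genotype.get(variant, 0)
        if allele_count not in (0, 1, 2):
            raise ValueError(f"{variant}: allele count must be 0, 1 or 2")
        score += allele_count * weight
    return score


# Hypothetical person: two copies of variant_a, one of variant_b, none of variant_c.
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}
score = polygenic_risk_score(person, EFFECT_SIZES)
print(f"Polygenic risk score: {score:.2f}")
```

In this simplified picture, effect sizes estimated mostly from one ancestry group may be poor weights for people from other groups, which is the gap the diverse data set is meant to close.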
“Even though the percentages of diverse populations were still low, it made a big difference in the results,” said Shoa Clarke, MD, PhD, instructor of cardiovascular medicine at Stanford and a co-author of the study.
The work was supported by a Veterans Affairs grant (I01-BX003362-03A1).
About Stanford Medicine
Stanford Medicine is an integrated academic health system comprising the Stanford School of Medicine and adult and pediatric health care delivery systems. Together, they harness the full potential of biomedicine through collaborative research, education and clinical care for patients. For more information, please visit med.stanford.edu.