Automating genetic analysis helps keep up with rapid discovery of new diseases
Stanford researchers are devising ways to have computers help perform some of the intensive genetic analysis now performed manually when scientists study a patient's genome to diagnose a disease.
When Shayla Haddock was born in 1997, her parents immediately realized something was wrong. The sixth of seven children, Shayla had unusual facial features. She had club feet and shorter-than-normal limbs. She was smaller than most newborns. Hearing tests showed she was deaf.
As her parents, Cheryl and Levko Siloti, searched for answers about her condition, they worried: Had some preventable event during Cheryl’s pregnancy caused Shayla’s symptoms? Could identifying her diagnosis improve her treatment options? If Shayla’s siblings wanted to become parents someday, would their children be at risk for the same illness?
“It was kind of an emotional roller coaster,” Cheryl Siloti said. Over the years, doctors suggested many diagnoses for Shayla, but medical tests repeatedly disproved their theories. “We would get these possibilities and then hear ‘Nope, that’s not the answer.’”
The Stockton, California, family’s quest for answers illustrates the challenges of diagnosing rare genetic diseases, and shows why scientists at the Stanford University School of Medicine are devising new approaches to help.
As much as Shayla’s parents longed for a diagnosis, they almost didn’t get one. On Aug. 10, 2012, only two weeks after Shayla’s doctors at Lucile Packard Children’s Hospital Stanford concluded that they could not match her genetic patterns and symptoms to a disease, a scientific report was published describing a newly discovered link between a genetic defect and a rare disease. That report would have allowed them to diagnose her. But at the time, genetic-testing results were not routinely re-analyzed to take into account new knowledge. The family and doctors remained unaware that the answer was out there.
Last year, as part of a scientific study, Shayla’s parents agreed to have her genome re-analyzed. This time, Stanford computer scientists used new computational tools they had developed to compare Shayla’s gene sequences to the scientific literature. They found the 2012 scientific report and predicted that Shayla had a rare genetic disease called Wiedemann-Steiner syndrome, which her doctors confirmed.
“With each passing month, more of the world’s genetic diversity is represented in scientific databases, and each time more information is there, it’s easier to interpret the next thing you see,” said Jon Bernstein, MD, Shayla’s clinical geneticist at Packard Children’s and an author of the new report, which was published online July 21 in Genetics in Medicine. Ten percent of the patients in the study — four individuals, including Shayla, out of 40 who did not receive diagnoses after their first genetic analysis — were diagnosed with various rare diseases based on recent discoveries, even though the initial analyses had been conducted an average of only 20 months earlier.
These “near misses” highlight a big challenge in the realm of precision health: Although the time, cost and effort involved in obtaining individuals’ genetic sequences have dropped dramatically in recent years, it still requires about 20 to 40 hours of work by trained experts to match a patient’s rare mutations to information in the scientific literature that might reveal a diagnosis. Among patients suspected of having a rare genetic disease, 75 percent aren’t diagnosed the first time they have their DNA analyzed. And yet the knowledge base is growing fast. Each year, researchers discover the cause of about 250 genetic diseases and also find 9,200 links between specific gene variants and known diseases.
Too many to diagnose by hand
“Our study demonstrates that reanalysis of patients’ gene-testing results is useful because there’s a steady rate of discovery,” said Bernstein, who is also an associate professor of pediatrics at the School of Medicine.
“But there is no way we’ll have enough manpower to continue to do all the analysis manually, as clinicians and scientists have done in the past,” said Gill Bejerano, PhD, senior author of the study and associate professor of developmental biology, of computer science and of pediatrics.
Bejerano led the computer scientists who devised the automated approach used in the new research. An estimated 25 million Americans may have some form of rare genetic disease, he noted, which is too many to diagnose by hand. “Rather than continuing to invest dozens of hours in each patient’s analysis, our team thought it made more sense to spend that time building computer science tools that can do much of the work for us,” he said.
In the new study, the scientists tested whether automated comparisons between undiagnosed patients’ genomes and existing gene databases could accelerate diagnosis. The approach worked.
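The idea of re-checking an undiagnosed patient against a growing knowledge base can be illustrated with a minimal sketch. This is not the Stanford team's tool. The record type `GeneDiseaseLink`, the function `new_matches`, and all gene and disease names are hypothetical, and the dates are placeholders chosen to echo the two-week gap described above. The sketch assumes the patient's rare variants have already been reduced to a set of candidate genes.

```python
from datetime import date
from typing import NamedTuple


class GeneDiseaseLink(NamedTuple):
    """One published gene-disease association (hypothetical record type)."""
    gene: str
    disease: str
    published: date


def new_matches(candidate_genes, knowledge_base, last_analyzed):
    """Return links that involve the patient's candidate genes and were
    published after the patient's most recent analysis."""
    return [
        link
        for link in knowledge_base
        if link.gene in candidate_genes and link.published > last_analyzed
    ]


# Placeholder data for illustration only; gene and disease names are invented.
knowledge_base = [
    GeneDiseaseLink("GENE_A", "Disease A", date(2010, 3, 1)),
    GeneDiseaseLink("GENE_B", "Disease B", date(2012, 8, 10)),
    GeneDiseaseLink("GENE_C", "Disease C", date(2013, 1, 15)),
]
candidate_genes = {"GENE_B", "GENE_D"}  # genes carrying the patient's rare variants
last_analyzed = date(2012, 7, 27)       # date of the initial, inconclusive analysis

matches = new_matches(candidate_genes, knowledge_base, last_analyzed)
for link in matches:
    print(f"{link.gene}: {link.disease} (published {link.published.isoformat()})")
```

Running the same check periodically, with an updated knowledge base, is what lets a later discovery surface for a patient whose first analysis came up empty.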
“The genome is ultimately a programming language,” Bejerano said. “We really would like to use machine learning and other approaches to build computer systems that leave as little as possible work for the human expert. A computer is going to be weaker than a human at doing this, but we think we can take the process 80 to 90 percent of the way by computer and provide a huge time savings for the human in the loop.”
Comparing genes of patients, parents
Another key finding from the new research, according to Bernstein and Bejerano, is that comparing patients’ gene sequences to those of their parents greatly speeds the diagnostic process. Such comparisons help turn up new disease-causing mutations that occurred in the patients but are not present in their parents. “These things stand out more easily if you have the parents’ data in front of you,” Bernstein said.
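The parent-child comparison can be sketched as a simple set difference: keep the variants seen in the child that appear in neither parent. The `Variant` type, the function `de_novo_candidates`, and the toy data are hypothetical, and the sketch ignores real-world complications such as sequencing errors and coverage gaps.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """A simplified variant: chromosome, position, reference and alternate alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def de_novo_candidates(child, mother, father):
    """Return the child's variants that appear in neither parent, sorted."""
    inherited = set(mother) | set(father)
    return sorted(set(child) - inherited)


# Made-up toy data for illustration only (not from the study).
mother = [Variant("chr1", 1000, "A", "G"), Variant("chr2", 500, "C", "T")]
father = [Variant("chr1", 1000, "A", "G"), Variant("chr3", 42, "G", "A")]
child = [
    Variant("chr1", 1000, "A", "G"),  # shared with both parents
    Variant("chr3", 42, "G", "A"),    # inherited from the father
    Variant("chr11", 777, "C", "T"),  # in neither parent: de novo candidate
]

candidates = de_novo_candidates(child, mother, father)
print(candidates)
```

Filtering this way shrinks the list an expert has to review, which is one reason having the parents' data makes spontaneous mutations easier to spot.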
In Shayla’s case, her diagnosis brought her family the answers they’d long been seeking. She doesn’t share her disease-causing mutation with her parents; instead, it occurred spontaneously in her. It wasn’t preventable, nor is there any expectation it would affect her siblings’ children. “It really relieves a lot of worry to know that,” Siloti said.
The diagnosis also has helped the Silotis find other families whose children have the same diagnosis. They share stories on a Facebook group and feel they’ve found a new sense of support and community. “We’ve always believed that knowledge is power,” Siloti said. “It is wonderful to have some answers, especially after such a long search.”
The co-lead authors of the new study are research scientists Aaron Wenger, PhD, and Harendra Guturu, PhD. The research was funded by Stanford’s Department of Pediatrics, the Stanford Discovery Fund, the Defense Advanced Research Projects Agency and the National Institutes of Health (grant U01MH105949).
Why it’s hard to diagnose a rare genetic disease, by the numbers:
- Size of the human genome: 3 billion bases, each a single letter in the “words” that make up our genes and the genomic regions that control them.
- Number of Americans estimated to have some form of rare genetic disease: 25 million
- Annual number of single-gene diseases whose cause is newly identified: 250
- Annual number of genetic changes newly linked to existing diseases: 9,200
- Time required to analyze one person’s genetic information when a rare genetic disease is suspected: 20 to 40 hours
- Proportion of patients suspected of having a rare, single-gene disease who get a diagnosis the first time their genetic sequence is analyzed: 25 percent
- Typical frequency of re-analysis to help undiagnosed patients be matched to new discoveries: It has been rare, but is expected to become more common thanks to new, automated techniques for gene analysis.
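To put the figures above in perspective, the arithmetic can be worked through in a few lines. This is a back-of-the-envelope sketch, not a calculation from the study. It assumes every patient in a group receives one full manual analysis and that the 25 percent first-pass diagnosis rate applies uniformly. The function name `manual_workload` is illustrative.

```python
HOURS_PER_PATIENT = (20, 40)       # manual analysis time per patient, from the list above
FIRST_PASS_DIAGNOSED_PERCENT = 25  # share diagnosed on the first analysis, from the list above


def manual_workload(n_patients):
    """Estimate expert hours for one manual analysis per patient, and how
    many patients would remain undiagnosed afterward (simplified model)."""
    low, high = HOURS_PER_PATIENT
    undiagnosed = n_patients * (100 - FIRST_PASS_DIAGNOSED_PERCENT) // 100
    return {
        "hours_low": n_patients * low,
        "hours_high": n_patients * high,
        "still_undiagnosed": undiagnosed,
    }


# Example: a hypothetical group of 1,000 patients.
estimate = manual_workload(1_000)
print(estimate)
```

Under these assumptions, 1,000 patients would require 20,000 to 40,000 expert hours, and 750 of them would still be waiting for a diagnosis and be candidates for later re-analysis.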