Identification of rare-disease genes using blood transcriptome sequencing and large control cohorts.
A toolkit for genetics providers in follow-up of patients with non-diagnostic exome sequencing.
Journal of genetic counseling
2019; 28 (2): 213–28
It is estimated that 350 million individuals worldwide suffer from rare diseases, which are predominantly caused by mutation in a single gene [1]. The current molecular diagnostic rate is estimated at 50%, with whole-exome sequencing (WES) among the most successful approaches [2-5]. For patients in whom WES is uninformative, RNA sequencing (RNA-seq) has shown diagnostic utility in specific tissues and diseases [6-8]. This includes muscle biopsies from patients with undiagnosed rare muscle disorders [6,9], and cultured fibroblasts from patients with mitochondrial disorders [7]. However, for many individuals, biopsies are not performed for clinical care, and tissues are difficult to access. We sought to assess the utility of RNA-seq from blood as a diagnostic tool for rare diseases of different pathophysiologies. We generated whole-blood RNA-seq from 94 individuals with undiagnosed rare diseases spanning 16 diverse disease categories. We developed a robust approach to compare data from these individuals with large sets of RNA-seq data for controls (n = 1,594 unrelated controls and n = 49 family members) and demonstrated the impacts of expression, splicing, gene and variant filtering strategies on disease gene identification. Across our cohort, we observed that RNA-seq yields a 7.5% diagnostic rate, and an additional 16.7% with improved candidate gene resolution.
View details for DOI 10.1038/s41591-019-0457-8
View details for PubMedID 31160820
Diagnosing rare diseases after the exome.
Cold Spring Harbor molecular case studies
2018; 4 (6)
There are approximately 7,000 rare diseases affecting 25-30 million Americans, with 80% estimated to have a genetic basis. This presents a challenge for genetics practitioners to determine appropriate testing, make accurate diagnoses, and conduct up-to-date patient management. Exome sequencing (ES) is a comprehensive diagnostic approach, but only 25%-41% of the patients receive a molecular diagnosis. The remaining three-fifths to three-quarters of patients undergoing ES remain undiagnosed. The Stanford Center for Undiagnosed Diseases (CUD), a clinical site of the Undiagnosed Diseases Network, evaluates patients with undiagnosed and rare diseases using a combination of methods including ES. Frequently these patients have non-diagnostic ES results, but strategic follow-up techniques identify diagnoses in a subset. We present techniques used at the CUD that can be adopted by genetics providers in clinical follow-up of cases where ES is non-diagnostic. Solved case examples illustrate different types of non-diagnostic results and the additional techniques that led to a diagnosis. Frequent approaches include segregation analysis, data reanalysis, genome sequencing, additional variant identification, careful phenotype-disease correlation, confirmatory testing, and case matching. We also discuss prioritization of cases for additional analyses.
View details for PubMedID 30964584
Biallelic Mutations in ATP5F1D, which Encodes a Subunit of ATP Synthase, Cause a Metabolic Disorder
AMERICAN JOURNAL OF HUMAN GENETICS
2018; 102 (3): 494–504
High-throughput sequencing has ushered in a diversity of approaches for identifying genetic variants and understanding genome structure and function. When applied to individuals with rare genetic diseases, these approaches have greatly accelerated gene discovery and patient diagnosis. Over the past decade, exome sequencing has emerged as a comprehensive and cost-effective approach to identify pathogenic variants in the protein-coding regions of the genome. For individuals in whom exome sequencing fails to identify a pathogenic variant, we discuss recent advances that are helping to reduce the diagnostic gap.
View details for PubMedID 30559314
Whole transcriptome sequencing in blood provides a diagnosis of spinal muscular atrophy with progressive myoclonic epilepsy (SMA-PME).
ATP synthase, H+ transporting, mitochondrial F1 complex, δ subunit (ATP5F1D; formerly ATP5D) is a subunit of mitochondrial ATP synthase and plays an important role in coupling proton translocation and ATP production. Here, we describe two individuals, each with homozygous missense variants in ATP5F1D, who presented with episodic lethargy, metabolic acidosis, 3-methylglutaconic aciduria, and hyperammonemia. Subject 1, homozygous for c.245C>T (p.Pro82Leu), presented with recurrent metabolic decompensation starting in the neonatal period, and subject 2, homozygous for c.317T>G (p.Val106Gly), presented with acute encephalopathy in childhood. Cultured skin fibroblasts from these individuals exhibited impaired assembly of F1FO ATP synthase and subsequent reduced complex V activity. Cells from subject 1 also exhibited a significant decrease in mitochondrial cristae. Knockdown of Drosophila ATPsynδ, the ATP5F1D homolog, in developing eyes and brains caused a near complete loss of the fly head, a phenotype that was fully rescued by wild-type human ATP5F1D. In contrast, expression of the ATP5F1D c.245C>T and c.317T>G variants rescued the head-size phenotype but recapitulated the eye and antennae defects seen in other genetic models of mitochondrial oxidative phosphorylation deficiency. Our data establish c.245C>T (p.Pro82Leu) and c.317T>G (p.Val106Gly) in ATP5F1D as pathogenic variants leading to a Mendelian mitochondrial disease featuring episodic metabolic decompensation.
View details for PubMedID 29478781
Long-read genome sequencing identifies causal structural variation in a Mendelian disease.
Genetics in medicine : official journal of the American College of Medical Genetics
At least 15% of disease-causing mutations affect mRNA splicing. Many splicing mutations are missed in a clinical setting due to limitations of in silico prediction algorithms or their location in noncoding regions. Whole-transcriptome sequencing is a promising new tool to identify these mutations; however, it will be a challenge to obtain disease-relevant tissue for RNA. Here, we describe an individual with a sporadic atypical spinal muscular atrophy, in whom clinical DNA sequencing reported one pathogenic ASAH1 mutation (c.458A>G; p.Tyr153Cys). Transcriptome sequencing on patient leukocytes identified a highly significant and atypical ASAH1 isoform not explained by c.458A>G (p < 10^-16). Subsequent Sanger sequencing identified the splice mutation responsible for the isoform (c.504A>C; p.Lys168Asn) and provided a molecular diagnosis of autosomal-recessive spinal muscular atrophy with progressive myoclonic epilepsy. Our findings demonstrate the utility of RNA sequencing from blood to identify splice-impacting disease mutations for nonhematological conditions, providing a diagnosis for these otherwise unsolved patients.
View details for DOI 10.1002/humu.23211
View details for PubMedID 28251733
Genetic effects on gene expression across human tissues.
2017; 550 (7675): 204–13
Purpose: Current clinical genomics assays primarily utilize short-read sequencing (SRS), but SRS has limited ability to evaluate repetitive regions and structural variants. Long-read sequencing (LRS) has complementary strengths, and we aimed to determine whether LRS could offer a means to identify overlooked genetic variation in patients undiagnosed by SRS. Methods: We performed low-coverage genome LRS to identify structural variants in a patient who presented with multiple neoplasia and cardiac myxomata, in whom the results of targeted clinical testing and genome SRS were negative. Results: This LRS approach yielded 6,971 deletions and 6,821 insertions > 50 bp. Filtering for variants that are absent in an unrelated control and overlap a disease gene coding exon identified three deletions and three insertions. One of these, a heterozygous 2,184 bp deletion, overlaps the first coding exon of PRKAR1A, which is implicated in autosomal dominant Carney complex. RNA sequencing demonstrated decreased PRKAR1A expression. The deletion was classified as pathogenic based on guidelines for interpretation of sequence variants. Conclusion: This first successful application of genome LRS to identify a pathogenic variant in a patient suggests that LRS has significant potential for the identification of disease-causing structural variation. Larger studies will ultimately be required to evaluate the potential clinical utility of LRS. Genetics in Medicine advance online publication, 22 June 2017; doi:10.1038/gim.2017.86.
View details for PubMedID 28640241
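The filtering step described in the long-read sequencing abstract (keep variants longer than 50 bp that are absent in an unrelated control and overlap a disease gene coding exon) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `StructuralVariant` record, the `filter_candidates` function, the exact-match definition of "absent in the control", and the toy coordinates are hypothetical and are not taken from the study, whose actual pipeline is not described in this listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StructuralVariant:
    """Hypothetical minimal record for a called structural variant."""
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    kind: str    # "DEL" or "INS"
    length: int  # variant size in bp


def overlaps(sv, exon):
    """True if the variant interval intersects the (chrom, start, end) exon."""
    chrom, exon_start, exon_end = exon
    return sv.chrom == chrom and sv.start < exon_end and exon_start < sv.end


def filter_candidates(patient_svs, control_svs, coding_exons, min_length=50):
    """Keep variants > min_length bp, absent in the control, overlapping an exon.

    Assumption: 'absent in the control' is approximated by exact record
    matching; real callers usually need a tolerance on breakpoints.
    """
    control = set(control_svs)
    return [
        sv for sv in patient_svs
        if sv.length > min_length
        and sv not in control
        and any(overlaps(sv, exon) for exon in coding_exons)
    ]


# Toy data with made-up coordinates, for illustration only.
exons = [("chrA", 1000, 1200)]
kept = StructuralVariant("chrA", 900, 1100, "DEL", 200)
shared = StructuralVariant("chrA", 1150, 1400, "DEL", 250)
too_small = StructuralVariant("chrA", 1010, 1040, "DEL", 30)
intergenic = StructuralVariant("chrA", 5000, 5300, "DEL", 300)

candidates = filter_candidates(
    [kept, shared, too_small, intergenic], [shared], exons
)
```

Each condition removes a different class of call, so the order of the checks does not change the result; it only affects how much work the exon-overlap test has to do.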
Directed evolution using dCas9-targeted somatic hypermutation in mammalian cells.
Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.
View details for PubMedID 29022597
The Extent of mRNA Editing Is Limited in Chicken Liver and Adipose, but Impacted by Tissular Context, Genotype, Age, and Feeding as Exemplified with a Conserved Edited Site in COG3
G3-GENES GENOMES GENETICS
2016; 6 (2): 321-335
An Efficient Multiple-Testing Adjustment for eQTL Studies that Accounts for Linkage Disequilibrium between Variants.
American journal of human genetics
2016; 98 (1): 216–24
Engineering and study of protein function by directed evolution has been limited by the technical requirement to use global mutagenesis or introduce DNA libraries. Here, we develop CRISPR-X, a strategy to repurpose the somatic hypermutation machinery for protein engineering in situ. Using catalytically inactive dCas9 to recruit variants of cytidine deaminase (AID) with MS2-modified sgRNAs, we can specifically mutagenize endogenous targets with limited off-target damage. This generates diverse libraries of localized point mutations and can target multiple genomic locations simultaneously. We mutagenize GFP and select for spectrum-shifted variants, including EGFP. Additionally, we mutate the target of the cancer therapeutic bortezomib, PSMB5, and identify known and novel mutations that confer bortezomib resistance. Finally, using a hyperactive AID variant, we mutagenize loci both upstream and downstream of transcriptional start sites. These experiments illustrate a powerful approach to create complex libraries of genetic variants in native context, which is broadly applicable to investigate and improve protein function.
View details for DOI 10.1038/nmeth.4038
View details for PubMedID 27798611
Genome-Wide Characterization of RNA Editing in Chicken Embryos Reveals Common Features among Vertebrates
2015; 10 (5)
Methods for multiple-testing correction in local expression quantitative trait locus (cis-eQTL) studies are a trade-off between statistical power and computational efficiency. Bonferroni correction, though computationally trivial, is overly conservative and fails to account for linkage disequilibrium between variants. Permutation-based methods are more powerful, though computationally far more intensive. We present an alternative correction method called eigenMT, which runs over 500 times faster than permutations and has adjusted p values that closely approximate empirical ones. To achieve this speed while also maintaining the accuracy of permutation-based methods, we estimate the effective number of independent variants tested for association with a particular gene, termed Meff, by using the eigenvalue decomposition of the genotype correlation matrix. We employ a regularized estimator of the correlation matrix to ensure Meff is robust and yields adjusted p values that closely approximate p values from permutations. Finally, using a common genotype matrix, we show that eigenMT can be applied with even greater efficiency to studies across tissues or conditions. Our method provides a simpler, more efficient approach to multiple-testing correction than existing methods and fits within existing pipelines for eQTL discovery.
View details for PubMedID 26749306
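The core idea in the eigenMT abstract (estimate the effective number of independent tests, Meff, from the eigenvalues of the genotype correlation matrix, then correct the best p value for a gene by Meff rather than by the raw variant count) can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the eigenMT implementation: the 0.99 variance-explained cutoff, the Bonferroni-style `adjusted_p_value`, and all function names are assumptions, the regularized correlation estimator mentioned in the abstract is omitted, and the small Jacobi routine merely stands in for a library eigenvalue solver.

```python
import math


def symmetric_eigenvalues(matrix, max_sweeps=100, tol=1e-18):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    a = [list(row) for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Angle that zeroes a[p][q], folded into [-pi/4, pi/4].
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                if theta > math.pi / 4:
                    theta -= math.pi / 2
                elif theta < -math.pi / 4:
                    theta += math.pi / 2
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def effective_number_of_tests(corr, var_explained=0.99):
    """Smallest number of leading eigenvalues reaching the variance cutoff.

    Assumption: the 0.99 cutoff is illustrative, and corr is used as given
    (no regularization is applied in this sketch).
    """
    eigenvalues = [max(ev, 0.0) for ev in symmetric_eigenvalues(corr)]
    target = var_explained * sum(eigenvalues)
    running = 0.0
    for count, ev in enumerate(eigenvalues, start=1):
        running += ev
        if running >= target:
            return count
    return len(eigenvalues)


def adjusted_p_value(min_p, meff):
    """Bonferroni-style correction using Meff instead of the variant count."""
    return min(1.0, min_p * meff)


# Toy correlation matrices: four independent variants, four variants in
# perfect linkage disequilibrium, and two perfectly correlated pairs.
independent = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
perfect_ld = [[1.0] * 4 for _ in range(4)]
two_blocks = [[1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
```

In the toy cases the variant count is four in each matrix, but the effective number of tests shrinks as correlation between variants grows, which is why a Meff-based correction is less conservative than a plain Bonferroni correction when linkage disequilibrium is strong.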
Third Report on Chicken Genes and Chromosomes 2015.
Cytogenetic and genome research
2015; 145 (2): 78-179
Transcriptome-wide investigation of genomic imprinting in chicken
NUCLEIC ACIDS RESEARCH
2014; 42 (6): 3768-3782
RNA editing results in a post-transcriptional nucleotide change in the RNA sequence that creates an alternative nucleotide not present in the DNA sequence. This leads to a diversification of transcription products with potential functional consequences. Two nucleotide substitutions are mainly described in animals, from adenosine to inosine (A-to-I) and from cytidine to uridine (C-to-U). This phenomenon is described in more detail in mammals, notably since the availability of next generation sequencing technologies allowing whole genome screening of RNA-DNA differences. The number of studies recording RNA editing in other vertebrates like chicken is still limited. We chose to use high throughput sequencing technologies to search for RNA editing in chicken, and to extend the knowledge of its conservation among vertebrates. We performed sequencing of RNA and DNA from 8 embryos. Being aware of common pitfalls inherent to sequence analyses that lead to false positive discovery, we stringently filtered our datasets and found fewer than 40 reliable candidates. Conservation of particular sites of RNA editing was attested by the presence of 3 edited sites previously detected in mammals. We then characterized editing levels for selected candidates in several tissues and at different time points, from 4.5 days of embryonic development to adults, and observed a clear tissue-specificity and a gradual increase of editing level with time. By characterizing the RNA editing landscape in chicken, our results highlight the extent of evolutionary conservation of this phenomenon within vertebrates, attest to its tissue and stage specificity and provide support for the absence of non A-to-I events from the chicken transcriptome.
View details for DOI 10.1371/journal.pone.0126776
View details for Web of Science ID 000355319400019
View details for PubMedID 26024316
Epigenetics and phenotypic variability: some interesting insights from birds
GENETICS SELECTION EVOLUTION
Genomic imprinting is an epigenetic mechanism by which alleles of some specific genes are expressed in a parent-of-origin manner. It has been observed in mammals and marsupials, but not in birds. Until now, only a few genes orthologous to mammalian imprinted ones have been analyzed in chicken, and none showed any evidence of imprinting in this species. However, several published observations such as imprinted-like QTL in poultry or reciprocal effects keep the question open. Our main objective was thus to screen the entire chicken genome for parental-allele-specific differential expression on whole embryonic transcriptomes, using high-throughput sequencing. To identify the parental origin of each observed haplotype, two chicken experimental populations were used, as inbred and as genetically distant as possible. Two families were produced from two reciprocal crosses. Transcripts from 20 embryos were sequenced using NGS technology, producing ∼200 Gb of sequences. This allowed the detection of 79 potentially imprinted SNPs, through an analysis method that we validated by detecting imprinting from mouse data already published. However, out of 23 candidates tested by pyrosequencing, none could be confirmed. These results agree, without a priori assumptions, with previous statements and phylogenetic considerations supporting the absence of genomic imprinting in chicken.
View details for DOI 10.1093/nar/gkt1390
View details for Web of Science ID 000334758600032
View details for PubMedID 24452801
Fine mapping of complex traits in non-model species: using next generation sequencing and advanced intercross lines in Japanese quail
Little is known about epigenetic mechanisms in birds with the exception of the phenomenon of dosage compensation of sex chromosomes, although such mechanisms could be involved in the phenotypic variability of birds, as in several livestock species. This paper reviews the literature on epigenetic mechanisms that could contribute significantly to trait variability in birds, and compares the results to the existing knowledge of epigenetic mechanisms in mammals. The main issues addressed in this paper are: (1) Does genomic imprinting exist in birds? (2) How does the embryonic environment influence the adult phenotype in avian species? (3) Does the embryonic environment have an impact on phenotypic variability across several successive generations? The potential for epigenetic studies to improve the performance of individual animals through the implementation of limited changes in breeding conditions or the addition of new parameters in selection models is still an open question.
View details for DOI 10.1186/1297-9686-45-16
View details for Web of Science ID 000320957100001
View details for PubMedID 23758635
As for other non-model species, genetic analyses in quail will benefit greatly from a higher marker density, now attainable thanks to the evolution of sequencing and genotyping technologies. Our objective was to obtain the first genome-wide panel of Japanese quail SNPs (Single Nucleotide Polymorphisms) and to use it for the fine mapping of a QTL for a fear-related behaviour, namely tonic immobility, previously localized on Coturnix japonica chromosome 1. To this aim, two reduced representations of the genome were analysed through high-throughput 454 sequencing: AFLP (Amplified Fragment Length Polymorphism) fragments as representatives of genomic DNA, and EST (Expressed Sequence Tag) as representatives of the transcriptome. The sequencing runs produced 399,189 and 1,106,762 sequence reads from cDNA and genomic fragments, respectively. They covered over 434 Mb of sequence in total and allowed us to detect 17,433 putative SNPs. Among them, 384 were used to genotype two Advanced Intercross Lines (AIL) obtained from three quail lines differing in duration of tonic immobility. Despite the absence of genotyping for founder individuals in the analysis, the previously identified candidate region on chromosome 1 was refined and led to the identification of a candidate gene. These data confirm the efficiency of transcript and AFLP-sequencing for SNP discovery in a non-model species, and its application to the fine mapping of a complex trait. Our results reveal a significant association of duration of tonic immobility with a genomic region comprising the DMD (dystrophin) gene. Further characterization of this candidate gene is needed to decipher its putative role in tonic immobility in Coturnix.
View details for DOI 10.1186/1471-2164-13-551
View details for Web of Science ID 000312956800001
View details for PubMedID 23066875