Path-scan: a reporting tool for identifying clinically actionable variants.
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing
2014; 19: 229-240
The American College of Medical Genetics and Genomics (ACMG) recently released guidelines regarding the reporting of incidental findings in sequencing data. Given the availability of Direct to Consumer (DTC) genetic testing and the falling cost of whole exome and genome sequencing, individuals will increasingly have the opportunity to analyze their own genomic data. We have developed a web-based tool, PATH-SCAN, which annotates individual genomes and exomes for ClinVar designated pathogenic variants found within the genes from the ACMG guidelines. Because mutations in these genes predispose individuals to conditions with actionable outcomes, our tool will allow individuals or researchers to identify potential risk variants in order to consult physicians or genetic counselors for further evaluation. Moreover, our tool allows individuals to anonymously submit their pathogenic burden, so that we can crowd source the collection of quantitative information regarding the frequency of these variants. We tested our tool on 1092 publicly available genomes from the 1000 Genomes project, 163 genomes from the Personal Genome Project, and 15 genomes from a clinical genome sequencing research project. Excluding the most commonly seen variant in 1000 Genomes, about 20% of all genomes analyzed had a ClinVar designated pathogenic variant that required further evaluation.
View details for PubMedID 24297550
Genome-wide profiling of human cap-independent translation-enhancing elements
2013; 10 (8): 747-?
We report an in vitro selection strategy to identify RNA sequences that mediate cap-independent initiation of translation. This method entails mRNA display of trillions of genomic fragments, selection for initiation of translation and high-throughput deep sequencing. We identified >12,000 translation-enhancing elements (TEEs) in the human genome, generated a high-resolution map of human TEE-bearing regions (TBRs), and validated the function of a subset of sequences in vitro and in cultured cells.
View details for DOI 10.1038/nmeth.2522
View details for Web of Science ID 000322453600023
View details for PubMedID 23770754
Systematic functional regulatory assessment of disease-associated variants.
Proceedings of the National Academy of Sciences of the United States of America
2013; 110 (23): 9607-9612
Genome-wide association studies have discovered many genetic loci associated with disease traits, but the functional molecular basis of these associations is often unresolved. Genome-wide regulatory and gene expression profiles measured across individuals and diseases reflect downstream effects of genetic variation and may allow for functional assessment of disease-associated loci. Here, we present a unique approach for systematic integration of genetic disease associations, transcription factor binding among individuals, and gene expression data to assess the functional consequences of variants associated with hundreds of human diseases. In an analysis of genome-wide binding profiles of NF?B, we find that disease-associated SNPs are enriched in NF?B binding regions overall, and specifically for inflammatory-mediated diseases, such as asthma, rheumatoid arthritis, and coronary artery disease. Using genome-wide variation in transcription factor-binding data, we find that NF?B binding is often correlated with disease-associated variants in a genotype-specific and allele-specific manner. Furthermore, we show that this binding variation is often related to expression of nearby genes, which are also found to have altered expression in independent profiling of the variant-associated disease condition. Thus, using this integrative approach, we provide a unique means to assign putative function to many disease-associated SNPs.
View details for DOI 10.1073/pnas.1219099110
View details for PubMedID 23690573
Performance of computational tools in evaluating the functional impact of laboratory-induced amino acid mutations
2012; 28 (16): 2093-2096
Site-directed mutagenesis is frequently used by scientists to investigate the functional impact of amino acid mutations in the laboratory. Over 10,000 such laboratory-induced mutations have been reported in the UniProt database along with the outcomes of functional assays. Here, we explore the performance of state-of-the-art computational tools (Condel, PolyPhen-2 and SIFT) in correctly annotating the function-altering potential of 10,913 laboratory-induced mutations from 2372 proteins. We find that computational tools are very successful in diagnosing laboratory-induced mutations that elicit significant functional change in the laboratory (up to 92% accuracy). But, these tools consistently fail in correctly annotating laboratory-induced mutations that show no functional impact in the laboratory assays. Therefore, the overall accuracy of computational tools for laboratory-induced mutations is much lower than that observed for the naturally occurring human variants. We tested and rejected the possibilities that the preponderance of changes to alanine and the presence of multiple base-pair mutations in the laboratory were the reasons for the observed discordance between the performance of computational tools for natural and laboratory mutations. Instead, we discover that the laboratory-induced mutations occur predominately at the highly conserved positions in proteins, where the computational tools have the lowest accuracy of correct prediction for variants that do not impact function (neutral). Therefore, the comparisons of experimental-profiling results with those from computational predictions need to be sensitive to the evolutionary conservation of the positions harboring the amino acid change.
View details for DOI 10.1093/bioinformatics/bts336
View details for Web of Science ID 000307501100001
View details for PubMedID 22685075