Education & Certifications
Bachelor of Science, Santa Clara University, Biology (2009)
Bachelor of Science, Santa Clara University, Bioengineering (2009)
Please visit my personal page at:
We describe an approach for targeted genome resequencing, called oligonucleotide-selective sequencing (OS-Seq), in which we modify the immobilized lawn of oligonucleotide primers of a next-generation DNA sequencer to function as both a capture and sequencing substrate. We apply OS-Seq to resequence the exons of either 10 or 344 cancer genes from human DNA samples. In our assessment of capture performance, >87% of the captured sequence originated from the intended target region with sequencing coverage falling within a tenfold range for a majority of all targets. Single nucleotide variants (SNVs) called from OS-Seq data agreed with >95% of variants obtained from whole-genome sequencing of the same individual. We also demonstrate mutation discovery from a colorectal cancer tumor sample matched with normal tissue. Overall, we show the robust performance and utility of OS-Seq for the resequencing analysis of human germline and cancer genomes.
View details for DOI 10.1038/nbt.1996
View details for Web of Science ID 000296801300024
View details for PubMedID 22020387
With next-generation DNA sequencing technologies, one can interrogate a specific genomic region of interest at very high depth of coverage and identify less prevalent, rare mutations in heterogeneous clinical samples. However, the mutation detection levels are limited by the error rate of the sequencing technology as well as by the availability of variant-calling algorithms with high statistical power and low false positive rates. We demonstrate that we can robustly detect mutations at 0.1% fractional representation. This represents accurate detection of one mutant per every 1000 wild-type alleles. To achieve this sensitive level of mutation detection, we integrate a high-accuracy indexing strategy and reference replication for estimating sequencing error variance. We employ a statistical model to estimate the error rate at each position of the reference and to quantify the fraction of variant base in the sample. In validation on a known 0.1% sample fraction admixture of two synthetic DNA samples, our method is highly specific (99%) and sensitive (100%). As a clinical application of this method, we analyzed nine clinical samples of H1N1 influenza A and detected an oseltamivir (antiviral therapy) resistance mutation in the H1N1 neuraminidase gene at a sample fraction of 0.18%.
View details for DOI 10.1093/nar/gkr861
View details for Web of Science ID 000298733500002
View details for PubMedID 22013163
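To make the rare-variant abstract above more concrete, here is a minimal sketch of why detection at a 0.1% fraction depends on the per-position error rate. It is not the authors' statistical model: it swaps in a plain one-sided binomial test, and the read depth, error rate, and function names are hypothetical values chosen only for illustration.

```python
import math

def variant_fraction(variant_reads, depth):
    """Fraction of reads at one position that carry the variant base."""
    return variant_reads / depth

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)

# Hypothetical inputs (assumptions, not values from the paper):
depth = 100_000        # reads covering the position
error_rate = 0.0002    # assumed sequencing error rate at this position
variant_reads = 100    # 100 / 100,000 = 0.1%, i.e. one mutant per 1000 alleles

fraction = variant_fraction(variant_reads, depth)
p_value = binomial_upper_tail(variant_reads, depth, error_rate)
print(f"variant fraction = {fraction:.4%}, p-value under error-only model = {p_value:.3g}")
```

Under these assumptions the expected number of error reads is 20, so 100 variant reads is very unlikely to arise from sequencing error alone. If the assumed error rate were close to 0.1%, the same count would no longer stand out, which is why a position-specific error estimate matters.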
We have developed an integrated strategy for targeted resequencing and analysis of gene subsets from the human exome for variants. Our capture technology is geared towards resequencing gene subsets substantially larger than can be done efficiently with simplex or multiplex PCR but smaller in scale than exome sequencing. We describe all the steps from the initial capture assay to single nucleotide variant (SNV) discovery. The capture methodology uses in-solution 80-mer oligonucleotides. To provide optimal flexibility in choosing human gene targets, we designed an in silico set of oligonucleotides, the Human OligoExome, that covers the gene exons annotated by the Consensus Coding Sequence (CCDS) Project. This resource is openly available as an Internet-accessible database where one can download capture oligonucleotide sequences for any CCDS gene and design custom capture assays. Using this resource, we demonstrated the flexibility of this assay by custom designing capture assays ranging from 10 to over 100 gene targets with total capture sizes from over 100 kilobases to nearly one megabase. We established a method to reduce capture variability and incorporated indexing schemes to increase sample throughput. Our approach has multiple applications that include but are not limited to population-targeted resequencing studies of specific gene subsets, validation of variants discovered in whole-genome sequencing surveys and possible diagnostic analysis of disease gene subsets. We also present a cost analysis demonstrating its cost-effectiveness for large population studies.
View details for DOI 10.1371/journal.pone.0021088
View details for Web of Science ID 000292291800008
View details for PubMedID 21738606
Intra- and interspecific variation in flower color is a hallmark of angiosperm diversity. The evolutionary forces underlying the variety of flower colors can be nearly as diverse as the colors themselves. In addition to pollinator preferences, non-pollinator agents of selection can have a major influence on the evolution of flower color polymorphisms, especially when the pigments in question are also expressed in vegetative tissues. In such cases, identifying the target(s) of selection starts with determining the biochemical and molecular basis for the flower color variation and examining any pleiotropic effects manifested in vegetative tissues. Herein, we describe a widespread purple-white flower color polymorphism in the mustard Parrya nudicaulis spanning Alaska. The frequency of white-flowered individuals increases with increasing growing-season temperature, consistent with the role of anthocyanin pigments in stress tolerance. White petals fail to produce the stress-responsive flavonoid intermediates in the anthocyanin biosynthetic pathway (ABP), suggesting an early pathway blockage. Petal cDNA sequences did not reveal blockages in any of the eight enzyme-coding genes in white-flowered individuals, nor any color-differentiating SNPs. A qRT-PCR analysis of white petals identified a 24-fold reduction in chalcone synthase (CHS) at the threshold of the ABP, but no change in CHS expression in leaves and sepals. This arctic species has avoided the deleterious effects associated with the loss of flavonoid intermediates in vegetative tissues by decoupling CHS expression in petals and leaves, yet the correlation of flower color and climate suggests that the loss of flavonoids in the petals alone may affect the tolerance of white-flowered individuals to colder environments.
View details for DOI 10.1371/journal.pone.0018230
View details for Web of Science ID 000289238700005
View details for PubMedID 21490971
Critical to conservation efforts and other investigations at low taxonomic levels, DNA sequence data offer important insights into the distinctiveness, biogeographic partitioning and evolutionary histories of species. The resolving power of DNA sequences is often limited by insufficient variability at the intraspecific level. This is particularly true of studies involving plant organelles, as the conservative mutation rate of chloroplasts and mitochondria makes it difficult to detect polymorphisms necessary to track genealogical relationships among individuals, populations and closely related taxa, through space and time. Massively parallel sequencing (MPS) makes it possible to acquire entire organelle genome sequences to identify cryptic variation that would be difficult to detect otherwise. We are using MPS to evaluate intraspecific chloroplast-level divergence across biogeographic boundaries in narrowly endemic and widespread species of Pinus. We focus on one of the world's rarest pines, Torrey pine (Pinus torreyana), due to its conservation interest and because it provides a marked contrast to more widespread pine species. Detailed analysis of nearly 90% (approximately 105,000 bp each) of these chloroplast genomes shows that mainland and island populations of Torrey pine differ at five sites in their plastome, with the differences fixed between populations. This is an exceptionally low level of divergence (1 polymorphism per approximately 21 kb), yet it is comparable to intraspecific divergence present in widespread pine species and species complexes. Population-level organelle genome sequencing offers new vistas into the timing and magnitude of divergence within species, and is certain to provide greater insight into pollen dispersal, migration patterns and evolutionary dynamics in plants.
View details for DOI 10.1111/j.1365-294X.2009.04474.x
View details for Web of Science ID 000275645700010
View details for PubMedID 20331774
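The divergence figure quoted in the Torrey pine abstract above follows directly from the two numbers it reports. The short sketch below reproduces that arithmetic; the function name is hypothetical, and both inputs are the approximate values stated in the abstract, not exact counts.

```python
def bp_per_polymorphism(aligned_bp, differing_sites):
    """Average number of base pairs per fixed difference between two populations."""
    return aligned_bp / differing_sites

aligned_bp = 105_000     # approximate chloroplast sequence analyzed per genome
fixed_differences = 5    # sites that differ between mainland and island populations

kb = bp_per_polymorphism(aligned_bp, fixed_differences) / 1000
print(f"about 1 polymorphism per {kb:g} kb")  # about 1 polymorphism per 21 kb
```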