Cell-Type Specific Features of Circular RNA Expression
Statistical properties of an early stopping rule for resampling-based multiple testing
2012; 99 (4): 973-980
Circular RNAs Are the Predominant Transcript Isoform from Hundreds of Human Genes in Diverse Cell Types
2012; 7 (2)
Improved Discovery of Molecular Interactions in Genome-Scale Data with Adaptive Model-Based Normalization
2013; 8 (1)
Most human pre-mRNAs are spliced into linear molecules that retain the exon order defined by the genomic sequence. By deep sequencing of RNA from a variety of normal and malignant human cells, we found RNA transcripts from many human genes in which the exons were arranged in a non-canonical order. Statistical estimates and biochemical assays provided strong evidence that a substantial fraction of the spliced transcripts from hundreds of genes are circular RNAs. Our results suggest that a non-canonical mode of RNA splicing, resulting in a circular RNA isoform, is a general feature of the gene expression program in human cells.
View details for DOI 10.1371/journal.pone.0030733
View details for Web of Science ID 000301977500016
View details for PubMedID 22319583
ESRRA-C11orf20 Is a Recurrent Gene Fusion in Serous Ovarian Carcinoma
2011; 9 (9)
High-throughput molecular-interaction studies using immunoprecipitations (IP) or affinity purifications are powerful and widely used in biology research. One of many important applications of this method is to identify the set of RNAs that interact with a particular RNA-binding protein (RBP). Here, the unique statistical challenge presented is to delineate a specific set of RNAs that are enriched in one sample relative to another, typically a specific IP compared to a non-specific control to model background. The choice of normalization procedure critically impacts the number of RNAs that will be identified as interacting with an RBP at a given significance threshold, yet existing normalization methods make assumptions that are often fundamentally inaccurate when applied to IP enrichment data. In this paper, we present a new normalization methodology that is specifically designed for identifying enriched RNA or DNA sequences in an IP. The normalization (called adaptive or AD normalization) uses a basic model of the IP experiment and is not a variant of mean, quantile, or other methodology previously proposed. The approach is evaluated statistically and tested with simulated and empirical data. The adaptive (AD) normalization method results in a greatly increased range in the number of enriched RNAs identified, fewer false positives, and overall better concordance with independent biological evidence, for the RBPs we analyzed, compared to median normalization. The approach is also applicable to the study of pairwise RNA, DNA and protein interactions such as the analysis of transcription factors via chromatin immunoprecipitation (ChIP) or any other experiments where samples from two conditions, one of which contains an enriched subset of the other, are studied.
View details for DOI 10.1371/journal.pone.0053930
View details for Web of Science ID 000314019100038
View details for PubMedID 23349766
Statistical Modeling of RNA-Seq Data
2011; 26 (1): 62-83
Proteome-Wide Search Reveals Unexpected RNA-Binding Proteins in Saccharomyces cerevisiae
2010; 5 (9)
Every year, ovarian cancer kills approximately 14,000 women in the United States and more than 140,000 women worldwide. Most of these deaths are caused by tumors of the serous histological type, which is rarely diagnosed before it has disseminated. By deep paired-end sequencing of mRNA from serous ovarian cancers, followed by deep sequencing of the corresponding genomic region, we identified a recurrent fusion transcript. The fusion transcript joins the 5' exons of ESRRA, encoding a ligand-independent member of the nuclear-hormone receptor superfamily, to the 3' exons of C11orf20, a conserved but uncharacterized gene located immediately upstream of ESRRA in the reference genome. To estimate the prevalence of the fusion, we tested 67 cases of serous ovarian cancer by RT-PCR and sequencing and confirmed its presence in 10 of these. Targeted resequencing of the corresponding genomic region from two fusion-positive tumor samples identified a nearly clonal chromosomal rearrangement positioning ESRRA upstream of C11orf20 in one tumor, and evidence of local copy number variation in the ESRRA locus in the second tumor. We hypothesize that the recurrent novel fusion transcript may play a role in pathogenesis of a substantial fraction of serous ovarian cancers and could provide a molecular marker for detection of the cancer. Gene fusions involving adjacent or nearby genes can readily escape detection but may play important roles in the development and progression of cancer.
View details for DOI 10.1371/journal.pbio.1001156
View details for Web of Science ID 000295372800012
View details for PubMedID 21949640
The vast landscape of RNA-protein interactions at the heart of post-transcriptional regulation remains largely unexplored. Indeed, it is likely that, even in yeast, a substantial fraction of the regulatory RNA-binding proteins (RBPs) remain to be discovered. Systematic experimental methods can play a key role in discovering these RBPs; most of the known yeast RBPs lack RNA-binding domains that might enable this activity to be predicted. We describe here a proteome-wide approach to identify RNA-protein interactions based on in vitro binding of RNA samples to yeast protein microarrays that represent over 80% of the yeast proteome. We used this procedure to screen for novel RBPs and RNA-protein interactions. A complementary mass spectrometry technique also identified proteins that associate with yeast mRNAs. Both the protein microarray and mass spectrometry methods successfully identify previously annotated RBPs, suggesting that other proteins identified in these assays might be novel RBPs. Of 35 putative novel RBPs identified by either or both of these methods, 12, including 75% of the eight most highly ranked candidates, reproducibly associated with specific cellular RNAs. Surprisingly, most of the 12 newly discovered RBPs were enzymes. Functional characteristics of the RNA targets of some of the novel RBPs suggest coordinated post-transcriptional regulation of subunits of protein complexes and a possible link between mRNA trafficking and vesicle transport. Our results suggest that many more RBPs still remain to be identified and provide a set of candidates for further investigation.
View details for DOI 10.1371/journal.pone.0012671
View details for Web of Science ID 000281687300015
View details for PubMedID 20844764