Professional Education

  • Bachelor of Arts, Harvard University (2008)
  • Doctor of Philosophy, Harvard University (2012)

All Publications

  • Identification of significantly mutated regions across cancer types highlights a rich landscape of functional molecular alterations NATURE GENETICS Araya, C. L., Cenik, C., Reuter, J. A., Kiss, G., Pande, V. S., Snyder, M. P., Greenleaf, W. J. 2016; 48 (2): 117-125

    View details for DOI 10.1038/ng.3471

    View details for Web of Science ID 000369043900008

  • Integrative analysis of RNA, translation, and protein levels reveals distinct regulatory variation across humans GENOME RESEARCH Cenik, C., Cenik, E. S., Byeon, G. W., Grubert, F., Candille, S. I., Spacek, D., Alsallakh, B., Tilgner, H., Araya, C. L., Tang, H., Ricci, E., Snyder, M. P. 2015; 25 (11): 1610-1621


    Elucidating the consequences of genetic differences between humans is essential for understanding phenotypic diversity and personalized medicine. Although variation in RNA levels, transcription factor binding, and chromatin have been explored, little is known about global variation in translation and its genetic determinants. We used ribosome profiling, RNA sequencing, and mass spectrometry to perform an integrated analysis in lymphoblastoid cell lines from a diverse group of individuals. We find significant differences in RNA, translation, and protein levels suggesting diverse mechanisms of personalized gene expression control. Combined analysis of RNA expression and ribosome occupancy improves the identification of individual protein level differences. Finally, we identify genetic differences that specifically modulate ribosome occupancy--many of these differences lie close to start codons and upstream ORFs. Our results reveal a new level of gene expression variation among humans and indicate that genetic variants can cause changes in protein levels through effects on translation.

    View details for DOI 10.1101/gr.193342.115

    View details for Web of Science ID 000364355600003

    View details for PubMedID 26297486

    View details for PubMedCentralID PMC4617958

  • ASPeak: an abundance sensitive peak detection algorithm for RIP-Seq. Bioinformatics Kucukural, A., Ozadam, H., Singh, G., Moore, M. J., Cenik, C. 2013; 29 (19): 2485-2486


    Unlike DNA, RNA abundances can vary over several orders of magnitude. Thus, identification of RNA-protein binding sites from high-throughput sequencing data presents unique challenges. Although peak identification in ChIP-Seq data has been extensively explored, there are few bioinformatics tools tailored for peak calling on analogous datasets for RNA-binding proteins. Here we describe ASPeak (abundance sensitive peak detection algorithm), an implementation of an algorithm that we previously applied to detect peaks in exon junction complex RNA immunoprecipitation in tandem experiments. Our peak detection algorithm yields stringent and robust target sets enabling sensitive motif finding and downstream functional analyses. ASPeak is implemented in Perl as a complete pipeline that takes bedGraph files as input. ASPeak implementation is freely available at under the GNU General Public License. ASPeak can be run on a personal computer, yet is designed to be easily parallelizable. ASPeak can also run on high performance computing clusters providing efficient speedup. The documentation and user manual can be obtained from or

    View details for DOI 10.1093/bioinformatics/btt428

    View details for PubMedID 23929032

  • The Cellular EJC Interactome Reveals Higher-Order mRNP Structure and an EJC-SR Protein Nexus CELL Singh, G., Kucukural, A., Cenik, C., Leszyk, J. D., Shaffer, S. A., Weng, Z., Moore, M. J. 2012; 151 (4): 750-764


    In addition to sculpting eukaryotic transcripts by removing introns, pre-mRNA splicing greatly impacts protein composition of the emerging mRNP. The exon junction complex (EJC), deposited upstream of exon-exon junctions after splicing, is a major constituent of spliced mRNPs. Here, we report comprehensive analysis of the endogenous human EJC protein and RNA interactomes. We confirm that the major "canonical" EJC occupancy site in vivo lies 24 nucleotides upstream of exon junctions and that the majority of exon junctions carry an EJC. Unexpectedly, we find that endogenous EJCs multimerize with one another and with numerous SR proteins to form megadalton sized complexes in which SR proteins are super-stoichiometric to EJC core factors. This tight physical association may explain known functional parallels between EJCs and SR proteins. Further, their protection of long mRNA stretches from nuclease digestion suggests that endogenous EJCs and SR proteins cooperate to promote mRNA packaging and compaction.

    View details for DOI 10.1016/j.cell.2012.10.007

    View details for Web of Science ID 000310921200008

    View details for PubMedID 23084401

  • Genome Analysis Reveals Interplay between 5' UTR Introns and Nuclear mRNA Export for Secretory and Mitochondrial Genes PLOS GENETICS Cenik, C., Chua, H. N., Zhang, H., Tarnawsky, S. P., Akef, A., Derti, A., Tasan, M., Moore, M. J., Palazzo, A. F., Roth, F. P. 2011; 7 (4)


    In higher eukaryotes, messenger RNAs (mRNAs) are exported from the nucleus to the cytoplasm via factors deposited near the 5' end of the transcript during splicing. The signal sequence coding region (SSCR) can support an alternative mRNA export (ALREX) pathway that does not require splicing. However, most SSCR-containing genes also have introns, so the interplay between these export mechanisms remains unclear. Here we support a model in which the furthest upstream element in a given transcript, be it an intron or an ALREX-promoting SSCR, dictates the mRNA export pathway used. We also experimentally demonstrate that nuclear-encoded mitochondrial genes can use the ALREX pathway. Thus, ALREX can also be supported by nucleotide signals within mitochondrial-targeting sequence coding regions (MSCRs). Finally, we identified and experimentally verified novel motifs associated with the ALREX pathway that are shared by both SSCRs and MSCRs. Our results show strong correlation between 5' untranslated region (5'UTR) intron presence/absence and sequence features at the beginning of the coding region. They also suggest that genes encoding secretory and mitochondrial proteins share a common regulatory mechanism at the level of mRNA export.

    View details for DOI 10.1371/journal.pgen.1001366

    View details for Web of Science ID 000289977000021

    View details for PubMedID 21533221

  • Identification of Neuronal RNA Targets of TDP-43-containing Ribonucleoprotein Complexes JOURNAL OF BIOLOGICAL CHEMISTRY Sephton, C. F., Cenik, C., Kucukural, A., Dammer, E. B., Cenik, B., Han, Y., Dewey, C. M., Roth, F. P., Herz, J., Peng, J., Moore, M. J., Yu, G. 2011; 286 (2): 1204-1215


    TAR DNA-binding protein 43 (TDP-43) is associated with a spectrum of neurodegenerative diseases. Although TDP-43 resembles heterogeneous nuclear ribonucleoproteins, its RNA targets and physiological protein partners remain unknown. Here we identify RNA targets of TDP-43 from cortical neurons by RNA immunoprecipitation followed by deep sequencing (RIP-seq). The canonical TDP-43 binding site (TG)(n) is 55.1-fold enriched, and moreover, a variant with adenine in the middle, (TG)(n)TA(TG)(m), is highly abundant among reads in our TDP-43 RIP-seq library. TDP-43 RNA targets can be divided into three different groups: those primarily binding in introns, in exons, and across both introns and exons. TDP-43 RNA targets are particularly enriched for Gene Ontology terms related to synaptic function, RNA metabolism, and neuronal development. Furthermore, TDP-43 binds to a number of RNAs encoding for proteins implicated in neurodegeneration, including TDP-43 itself, FUS/TLS, progranulin, Tau, and ataxin 1 and -2. We also identify 25 proteins that co-purify with TDP-43 from rodent brain nuclear extracts. Prominent among them are nuclear proteins involved in pre-mRNA splicing and RNA stability and transport. Also notable are two neuron-enriched proteins, methyl CpG-binding protein 2 and polypyrimidine tract-binding protein 2 (PTBP2). A PTBP2 consensus RNA binding motif is enriched in the TDP-43 RIP-seq library, suggesting that PTBP2 may co-regulate TDP-43 RNA targets. This work thus reveals the protein and RNA components of the TDP-43-containing ribonucleoprotein complexes and provides a framework for understanding how dysregulation of TDP-43 in RNA metabolism contributes to neurodegeneration.

    View details for DOI 10.1074/jbc.M110.190884

    View details for Web of Science ID 000286005000032

    View details for PubMedID 21051541

  • Genome-wide functional analysis of human 5' untranslated region introns GENOME BIOLOGY Cenik, C., Derti, A., Mellor, J. C., Berriz, G. F., Roth, F. P. 2010; 11 (3)


    Approximately 35% of human genes contain introns within the 5' untranslated region (UTR). Introns in 5'UTRs differ from those in coding regions and 3'UTRs with respect to nucleotide composition, length distribution and density. Despite their presumed impact on gene regulation, the evolution and possible functions of 5'UTR introns remain largely unexplored. We performed a genome-scale computational analysis of 5'UTR introns in humans. We discovered that the most highly expressed genes tended to have short 5'UTR introns rather than having long 5'UTR introns or lacking 5'UTR introns entirely. Although we found no correlation in 5'UTR intron presence or length with variance in expression across tissues, which might have indicated a broad role in expression-regulation, we observed an uneven distribution of 5'UTR introns amongst genes in specific functional categories. In particular, genes with regulatory roles were surprisingly enriched in having 5'UTR introns. Finally, we analyzed the evolution of 5'UTR introns in non-receptor protein tyrosine kinases (NRTK), and identified a conserved DNA motif enriched within the 5'UTR introns of human NRTKs. Our results suggest that human 5'UTR introns enhance the expression of some genes in a length-dependent manner. While many 5'UTR introns are likely to be evolving neutrally, their relationship with gene expression and overrepresentation among regulatory genes, taken together, suggest that complex evolutionary forces are acting on this distinct class of introns.

    View details for DOI 10.1186/gb-2010-11-3-r29

    View details for Web of Science ID 000277309100009

    View details for PubMedID 20222956

  • A common class of transcripts with 5'-intron depletion, distinct early coding sequence features, and N-1-methyladenosine modification RNA Cenik, C., Chua, H. N., Singh, G., Akef, A., Snyder, M. P., Palazzo, A. F., Moore, M. J., Roth, F. P. 2017; 23 (3): 270-283
  • Identification of significantly mutated regions across cancer types highlights a rich landscape of functional molecular alterations. Nature genetics Araya, C. L., Cenik, C., Reuter, J. A., Kiss, G., Pande, V. S., Snyder, M. P., Greenleaf, W. J. 2016; 48 (2): 117-125


    Cancer sequencing studies have primarily identified cancer driver genes by the accumulation of protein-altering mutations. An improved method would be annotation independent, sensitive to unknown distributions of functions within proteins and inclusive of noncoding drivers. We employed density-based clustering methods in 21 tumor types to detect variably sized significantly mutated regions (SMRs). SMRs reveal recurrent alterations across a spectrum of coding and noncoding elements, including transcription factor binding sites and untranslated regions mutated in up to ∼15% of specific tumor types. SMRs demonstrate spatial clustering of alterations in molecular domains and at interfaces, often with associated changes in signaling. Mutation frequencies in SMRs demonstrate that distinct protein regions are differentially mutated across tumor types, as exemplified by a linker region of PIK3CA in which biophysical simulations suggest that mutations affect regulatory interactions. The functional diversity of SMRs underscores both the varied mechanisms of oncogenic misregulation and the advantage of functionally agnostic driver identification.

    View details for DOI 10.1038/ng.3471

    View details for PubMedID 26691984

  • An optimized kit-free method for making strand-specific deep sequencing libraries from RNA fragments. Nucleic acids research Heyer, E. E., Ozadam, H., Ricci, E. P., Cenik, C., Moore, M. J. 2015; 43 (1)


    Deep sequencing of strand-specific cDNA libraries is now a ubiquitous tool for identifying and quantifying RNAs in diverse sample types. The accuracy of conclusions drawn from these analyses depends on precise and quantitative conversion of the RNA sample into a DNA library suitable for sequencing. Here, we describe an optimized method of preparing strand-specific RNA deep sequencing libraries from small RNAs and variably sized RNA fragments obtained from ribonucleoprotein particle footprinting experiments or fragmentation of long RNAs. Our approach works across a wide range of input amounts (400 pg to 200 ng), is easy to follow and produces a library in 2-3 days at relatively low reagent cost, all while giving the user complete control over every step. Because all enzymatic reactions were optimized and driven to apparent completion, sequence diversity and species abundance in the input sample are well preserved.

    View details for DOI 10.1093/nar/gku1235

    View details for PubMedID 25505164

  • Staufen1 senses overall transcript secondary structure to regulate translation. Nature structural & molecular biology Ricci, E. P., Kucukural, A., Cenik, C., Mercier, B. C., Singh, G., Heyer, E. E., Ashar-Patel, A., Peng, L., Moore, M. J. 2014; 21 (1): 26-35


    Human Staufen1 (Stau1) is a double-stranded RNA (dsRNA)-binding protein implicated in multiple post-transcriptional gene-regulatory processes. Here we combined RNA immunoprecipitation in tandem (RIPiT) with RNase footprinting, formaldehyde cross-linking, sonication-mediated RNA fragmentation and deep sequencing to map Staufen1-binding sites transcriptome wide. We find that Stau1 binds complex secondary structures containing multiple short helices, many of which are formed by inverted Alu elements in annotated 3' untranslated regions (UTRs) or in 'strongly distal' 3' UTRs. Stau1 also interacts with actively translating ribosomes and with mRNA coding sequences (CDSs) and 3' UTRs in proportion to their GC content and propensity to form internal secondary structure. On mRNAs with high CDS GC content, higher Stau1 levels lead to greater ribosome densities, thus suggesting a general role for Stau1 in modulating translation elongation through structured CDS regions. Our results also indicate that Stau1 regulates translation of transcription-regulatory proteins.

    View details for DOI 10.1038/nsmb.2739

    View details for PubMedID 24336223

  • RanBP2/Nup358 Potentiates the Translation of a Subset of mRNAs Encoding Secretory Proteins PLOS BIOLOGY Mahadevan, K., Zhang, H., Akef, A., Cui, X. A., Gueroussov, S., Cenik, C., Roth, F. P., Palazzo, A. F. 2013; 11 (4)


    In higher eukaryotes, most mRNAs that encode secreted or membrane-bound proteins contain elements that promote an alternative mRNA nuclear export (ALREX) pathway. Here we report that ALREX-promoting elements also potentiate translation in the presence of upstream nuclear factors. These RNA elements interact directly with, and likely co-evolved with, the zinc finger repeats of RanBP2/Nup358, which is present on the cytoplasmic face of the nuclear pore. Finally we show that RanBP2/Nup358 is not only required for the stimulation of translation by ALREX-promoting elements, but is also required for the efficient global synthesis of proteins targeted to the endoplasmic reticulum (ER) and likely the mitochondria. Thus upon the completion of export, mRNAs containing ALREX-elements likely interact with RanBP2/Nup358, and this step is required for the efficient translation of these mRNAs in the cytoplasm. ALREX-elements thus act as nucleotide platforms to coordinate various steps of post-transcriptional regulation for the majority of mRNAs that encode secreted proteins.

    View details for DOI 10.1371/journal.pbio.1001545

    View details for Web of Science ID 000318687800023

    View details for PubMedID 23630457

  • Introns in UTRs: Why we should stop ignoring them BIOESSAYS Bicknell, A. A., Cenik, C., Chua, H. N., Roth, F. P., Moore, M. J. 2012; 34 (12): 1025-1034


    Although introns in 5'- and 3'-untranslated regions (UTRs) are found in many protein coding genes, rarely are they considered distinctive entities with specific functions. Indeed, mammalian transcripts with 3'-UTR introns are often assumed nonfunctional because they are subject to elimination by nonsense-mediated decay (NMD). Nonetheless, recent findings indicate that 5'- and 3'-UTR intron status is of significant functional consequence for the regulation of mammalian genes. Therefore these features should be ignored no longer.

    View details for DOI 10.1002/bies.201200073

    View details for Web of Science ID 000311113700010

    View details for PubMedID 23108796

  • Pacific Salmon and the Coalescent Effective Population Size PLOS ONE Cenik, C., Wakeley, J. 2010; 5 (9)


    Pacific salmon include several species that are both commercially important and endangered. Understanding the causes of loss in genetic variation is essential for designing better conservation strategies. Here we use a coalescent approach to analyze a model of the complex life history of salmon, and derive the coalescent effective population size (CES). With the aid of Kronecker products and a convergence theorem for Markov chains with two time scales, we derive a simple formula for the CES and thereby establish its existence. Our results may be used to address important questions regarding salmon biology, in particular about the loss of genetic variation. To illustrate the utility of our approach, we consider the effects of fluctuations in population size over time. Our analysis enables the application of several tools of coalescent theory to the case of salmon.

    View details for DOI 10.1371/journal.pone.0013019

    View details for Web of Science ID 000282167100020

    View details for PubMedID 20885947

  • Absence of Evidence for MHC-Dependent Mate Selection within HapMap Populations PLOS GENETICS Derti, A., Cenik, C., Kraft, P., Roth, F. P. 2010; 6 (4)


    The major histocompatibility complex (MHC) of immunity genes has been reported to influence mate choice in vertebrates, and a recent study presented genetic evidence for this effect in humans. Specifically, greater dissimilarity at the MHC locus was reported for European-American mates (parents in HapMap Phase 2 trios) than for non-mates. Here we show that the results depend on a few extreme data points, are not robust to conservative changes in the analysis procedure, and cannot be reproduced in an equivalent but independent set of European-American mates. Although some evidence suggests an avoidance of extreme MHC similarity between mates, rather than a preference for dissimilarity, limited sample sizes preclude a rigorous investigation. In summary, fine-scale molecular-genetic data do not conclusively support the hypothesis that mate selection in humans is influenced by the MHC locus.

    View details for DOI 10.1371/journal.pgen.1000925

    View details for Web of Science ID 000277354200041

    View details for PubMedID 20442868

  • Next generation software for functional trend analysis BIOINFORMATICS Berriz, G. F., Beaver, J. E., Cenik, C., Tasan, M., Roth, F. P. 2009; 25 (22): 3043-3044


    FuncAssociate is a web application that discovers properties enriched in lists of genes or proteins that emerge from large-scale experimentation. Here we describe an updated application with a new interface and several new features. For example, enrichment analysis can now be performed within multiple gene- and protein-naming systems. This feature avoids potentially serious translation artifacts to which other enrichment analysis strategies are subject. The FuncAssociate web application is freely available to all users at

    View details for DOI 10.1093/bioinformatics/btp498

    View details for Web of Science ID 000271564300026

    View details for PubMedID 19717575