Extensive transcriptional heterogeneity revealed by isoform profiling.
2013; 497 (7447): 127-131
Transcript function is determined by sequence elements arranged on an individual RNA molecule. Variation in transcripts can affect messenger RNA stability, localization and translation, or produce truncated proteins that differ in localization or function. Given the existence of overlapping, variable transcript isoforms, determining the functional impact of the transcriptome requires identification of full-length transcripts, rather than just the genomic regions that are transcribed. Here, by jointly determining both transcript ends for millions of RNA molecules, we reveal an extensive layer of isoform diversity previously hidden among overlapping RNA molecules. Variation in transcript boundaries seems to be the rule rather than the exception, even within a single population of yeast cells. Over 26 major transcript isoforms per protein-coding gene were expressed in yeast. Hundreds of short coding RNAs and truncated versions of proteins are concomitantly encoded by alternative transcript isoforms, increasing protein diversity. In addition, approximately 70% of genes express alternative isoforms that vary in post-transcriptional regulatory elements, and tandem genes frequently produce overlapping or even bicistronic transcripts. This extensive transcript diversity is generated by a relatively simple eukaryotic genome with limited splicing, and within a genetically homogeneous population of cells. Our findings have implications for genome compaction, evolution and phenotypic diversity between single cells. These data also indicate that isoform diversity as well as RNA abundance should be considered when assessing the functional repertoire of genomes.
View details for DOI 10.1038/nature12121
View details for PubMedID 23615609
- Antisense expression increases gene expression variability and locus interdependency MOLECULAR SYSTEMS BIOLOGY 2011; 7
The Baker's Yeast Diploid Genome Is Remarkably Stable in Vegetative Growth and Meiosis
2010; 6 (9)
Accurate estimates of mutation rates provide critical information to analyze genome evolution and organism fitness. We used whole-genome DNA sequencing, pulsed-field gel electrophoresis, and comparative genome hybridization to determine mutation rates in diploid vegetative and meiotic mutation accumulation lines of Saccharomyces cerevisiae. The vegetative lines underwent only mitotic divisions while the meiotic lines underwent a meiotic cycle every ~20 vegetative divisions. Similar base substitution rates were estimated for both lines. Given our experimental design, these measures indicated that the meiotic mutation rate is within the range of being equal to zero to being 55-fold higher than the vegetative rate. Mutations detected in vegetative lines were all heterozygous while those in meiotic lines were homozygous. A quantitative analysis of intra-tetrad mating events in the meiotic lines showed that inter-spore mating is primarily responsible for rapidly fixing mutations to homozygosity as well as for removing mutations. We did not observe 1-2 nt insertion/deletion (in-del) mutations in any of the sequenced lines and only one structural variant in a non-telomeric location was found. However, a large number of structural variations in subtelomeric sequences were seen in both vegetative and meiotic lines that did not affect viability. Our results indicate that the diploid yeast nuclear genome is remarkably stable during the vegetative and meiotic cell cycles and support the hypothesis that peripheral regions of chromosomes are more dynamic than gene-rich central sections where structural rearrangements could be deleterious. This work also provides an improved estimate for the mutational load carried by diploid organisms.
View details for DOI 10.1371/journal.pgen.1001109
View details for Web of Science ID 000282369200047
View details for PubMedID 20838597
Bidirectional promoters generate pervasive transcription in yeast
2009; 457 (7232): 1033-U7
Genome-wide pervasive transcription has been reported in many eukaryotic organisms, revealing a highly interleaved transcriptome organization that involves hundreds of previously unknown non-coding RNAs. These recently identified transcripts either exist stably in cells (stable unannotated transcripts, SUTs) or are rapidly degraded by the RNA surveillance pathway (cryptic unstable transcripts, CUTs). One characteristic of pervasive transcription is the extensive overlap of SUTs and CUTs with previously annotated features, which prompts questions regarding how these transcripts are generated, and whether they exert function. Single-gene studies have shown that transcription of SUTs and CUTs can be functional, through mechanisms involving the generated RNAs or their generation itself. So far, a complete transcriptome architecture including SUTs and CUTs has not been described in any organism. Knowledge about the position and genome-wide arrangement of these transcripts will be instrumental in understanding their function. Here we provide a comprehensive analysis of these transcripts in the context of multiple conditions, a mutant of the exosome machinery and different strain backgrounds of Saccharomyces cerevisiae. We show that both SUTs and CUTs display distinct patterns of distribution at specific locations. Most of the newly identified transcripts initiate from nucleosome-free regions (NFRs) associated with the promoters of other transcripts (mostly protein-coding genes), or from NFRs at the 3' ends of protein-coding genes. Likewise, about half of all coding transcripts initiate from NFRs associated with promoters of other transcripts. These data change our view of how a genome is transcribed, indicating that bidirectionality is an inherent feature of promoters. Such an arrangement of divergent and overlapping transcripts may provide a mechanism for local spreading of regulatory signals, that is, coupling the transcriptional regulation of neighbouring genes by means of transcriptional interference or histone modification.
View details for DOI 10.1038/nature07728
View details for Web of Science ID 000263425400047
View details for PubMedID 19169243
Genome sequencing and comparative analysis of Saccharomyces cerevisiae strain YJM789
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
2007; 104 (31): 12825-12830
We sequenced the genome of Saccharomyces cerevisiae strain YJM789, which was derived from a yeast isolated from the lung of an AIDS patient with pneumonia. The strain is used for studies of fungal infections and quantitative genetics because of its extensive phenotypic differences to the laboratory reference strain, including growth at high temperature and deadly virulence in mouse models. Here we show that the approximately 12-Mb genome of YJM789 contains approximately 60,000 SNPs and approximately 6,000 indels with respect to the reference S288c genome, leading to protein polymorphisms with a few known cases of phenotypic changes. Several ORFs are found to be unique to YJM789, some of which might have been acquired through horizontal transfer. Localized regions of high polymorphism density are scattered over the genome, in some cases spanning multiple ORFs and in others concentrated within single genes. The sequence of YJM789 contains clues to pathogenicity and spurs the development of more powerful approaches to dissecting the genetic basis of complex hereditary traits.
View details for DOI 10.1073/pnas.0701291104
View details for Web of Science ID 000248603900043
View details for PubMedID 17652520
Genotyping 1000 yeast strains by next-generation sequencing
The throughput of next-generation sequencing machines has increased dramatically over the last few years; yet the cost and time for library preparation have not changed proportionally, thus representing the main bottleneck for sequencing large numbers of samples. Here we present an economical, high-throughput library preparation method for the Illumina platform, comprising a 96-well based method for DNA isolation for yeast cells, a low-cost DNA shearing alternative, and adapter ligation using heat inactivation of enzymes instead of bead cleanups. Up to 384 whole-genome libraries can be prepared from yeast cells in one week using this method, for less than 15 euros per sample. We demonstrate the robustness of this protocol by sequencing over 1000 yeast genomes at ~30x coverage. The sequence information from 768 yeast segregants derived from two divergent S. cerevisiae strains was used to generate a meiotic recombination map at unprecedented resolution. Comparisons to other datasets indicate a high conservation of recombination at a chromosome-wide scale, but differences at the local scale. Additionally, we detected a high degree of aneuploidy (3.6%) by examining the sequencing coverage in these segregants. Differences in allele frequency allowed us to attribute instances of aneuploidy to gains of chromosomes during meiosis or mitosis, both of which showed a strong tendency to missegregate specific chromosomes. Here we present a high throughput workflow to sequence genomes of large number of yeast strains at a low price. We have used this workflow to obtain recombination and aneuploidy data from hundreds of segregants, which can serve as a foundation for future studies of linkage, recombination, and chromosomal aberrations in yeast and higher eukaryotes.
View details for DOI 10.1186/1471-2164-14-90
View details for Web of Science ID 000315034700001
View details for PubMedID 23394869
Natural sequence variants of yeast environmental sensors confer cell-to-cell expression variability.
Molecular systems biology
2013; 9: 695
Living systems may have evolved probabilistic bet hedging strategies that generate cell-to-cell phenotypic diversity in anticipation of environmental catastrophes, as opposed to adaptation via a deterministic response to environmental changes. Evolution of bet hedging assumes that genotypes segregating in natural populations modulate the level of intraclonal diversity, which so far has largely remained hypothetical. Using a fluorescent Pmet17-GFP reporter, we mapped four genetic loci conferring to a wild yeast strain an elevated cell-to-cell variability in the expression of MET17, a gene regulated by the methionine pathway. A frameshift mutation in the Erc1p transmembrane transporter, probably resulting from a release of laboratory strains from negative selection, reduced Pmet17-GFP expression variability. At a second locus, cis-regulatory polymorphisms increased mean expression of the Mup1p methionine permease, causing increased expression variability in trans. These results demonstrate that an expression quantitative trait locus (eQTL) can simultaneously have a deterministic effect in cis and a probabilistic effect in trans. Our observations indicate that the evolution of transmembrane transporter genes can tune intraclonal variation and may therefore be implicated in both reactive and anticipatory strategies of adaptation.
View details for DOI 10.1038/msb.2013.53
View details for PubMedID 24104478
RNA Polymerase II Collision Interrupts Convergent Transcription
2012; 48 (3): 365-374
Antisense noncoding transcripts, genes-within-genes, and convergent gene pairs are prevalent among eukaryotes. The existence of such transcription units raises the question of what happens when RNA polymerase II (RNAPII) molecules collide head-to-head. Here we use a combination of biochemical and genetic approaches in yeast to show that polymerases transcribing opposite DNA strands cannot bypass each other. RNAPII stops but does not dissociate upon head-to-head collision in vitro, suggesting that opposing polymerases represent insurmountable obstacles for each other. Head-to-head collision in vivo also results in RNAPII stopping, and removal of collided RNAPII from the DNA template can be achieved via ubiquitylation-directed proteolysis. Indeed, in cells lacking efficient RNAPII polyubiquitylation, the half-life of collided polymerases increases, so that they can be detected between convergent genes. These results provide insight into fundamental mechanisms of gene traffic control and point to an unexplored effect of antisense transcription on gene regulation via polymerase collision.
View details for DOI 10.1016/j.molcel.2012.08.027
View details for Web of Science ID 000311260900006
View details for PubMedID 23041286
Minimal regulatory spaces in yeast genomes
The regulatory information encoded in the DNA of promoter regions usually enforces a minimal, non-zero distance between the coding regions of neighboring genes. However, the size of this minimal regulatory space is not generally known. In particular, it is unclear if minimal promoter size differs between species and between uni- and bi-directionally acting regulatory regions. Analyzing the genomes of 11 yeasts, we show that the lower size limit on promoter-containing regions is species-specific within a relatively narrow range (80-255 bp). This size limit applies equally to regions that initiate transcription on one or both strands, indicating that bi-directional promoters and uni-directional promoters are constrained similarly. We further find that young, species-specific regions are on average much longer than older regions, suggesting either a bias towards deletions or selection for genome compactness in yeasts. While the length evolution of promoter-less intergenic regions is well described by a simplistic, purely neutral model, regions containing promoters typically show an excess of unusually long regions. Regions flanked by divergently transcribed genes have a bi-modal length distribution, with short lengths found preferentially among older regions. These old, short regions likely harbor evolutionarily conserved bi-directionally active promoters. Surprisingly, some of the evolutionarily youngest regions in two of the eleven species (S. cerevisiae and K. waltii) are shorter than the lower limit observed in older regions. The minimal chromosomal space required for transcriptional regulation appears to be relatively similar across yeast species, and is the same for uni-directional and bi-directional promoters. New intergenic regions created by genome rearrangements tend to evolve towards the narrower size distribution found among older regions.
View details for DOI 10.1186/1471-2164-12-320
View details for Web of Science ID 000292251700001
View details for PubMedID 21679449
Yeast Sen1 Helicase Protects the Genome from Transcription-Associated Instability
2011; 41 (1): 21-32
Sen1 of S. cerevisiae is a known component of the NRD complex implicated in transcription termination of nonpolyadenylated as well as some polyadenylated RNA polymerase II transcripts. We now show that Sen1 helicase possesses a wider function by restricting the occurrence of RNA:DNA hybrids that may naturally form during transcription, when nascent RNA hybridizes to DNA prior to its packaging into RNA protein complexes. These hybrids displace the nontranscribed strand and create R loop structures. Loss of Sen1 results in transient R loop accumulation and so elicits transcription-associated recombination. SEN1 genetically interacts with DNA repair genes, suggesting that R loop resolution requires proteins involved in homologous recombination. Based on these findings, we propose that R loop formation is a frequent event during transcription and a key function of Sen1 is to prevent their accumulation and associated genome instability.
View details for DOI 10.1016/j.molcel.2010.12.007
View details for Web of Science ID 000286692400006
View details for PubMedID 21211720
Ab-origin: An improved tool of heavy chain rearrangement analysis for human immunoglobulin
COMPUTATIONAL SCIENCE - ICCS 2007, PT 2, PROCEEDINGS
2007; 4488: 363-369
View details for Web of Science ID 000247062900052
Conserved genes in a path from commensalism to pathogenicity: comparative phylogenetic profiles of Staphylococcus epidermidis RP62A and ATCC12228
Staphylococcus epidermidis, long regarded as an innocuous commensal bacterium of the human skin, is the most frequent cause of nosocomial infections associated with implanted medical devices. This conditional pathogen provides a model of choice to study genome landmarks correlated with the transition between commensalism and pathogenicity. Traditional investigations stress differences in gene content. We focused on conserved genes that have accumulated small mutation differences during the transition. A comparison of strain ATCC12228, a non-biofilm forming, non-infection associated strain and strain RP62A, a methicillin-resistant biofilm clinical isolate, revealed consistent variation, mostly single-nucleotide polymorphisms (SNPs), in orthologous genes in addition to the previously investigated global changes in gene clusters. This polymorphism, scattered throughout the genome, may reveal genes that contribute to adaptation of the bacteria to different environmental stimuli, allowing them to shift from commensalism to pathogenicity. SNPs were detected in 931 pairs of orthologs with identical gene length, accounting for approximately 45% of the total pairs of orthologs. Assuming that non-synonymous mutations would mark recent evolution, and hence be associated to the onset of the pathogenic process, analysis of ratios of non-synonymous SNPs vs synonymous SNPs suggested hypotheses about possible pathogenicity determinants. The N/S ratios for virulence factors and surface proteins differed significantly from that of average SNPs. Of those gene pairs, 40 showed a disproportionate distribution of dN vs dS. Among those, the presence of the gene encoding methionine sulfoxide reductase suggested a possible involvement of reactive oxygen species. This led us to uncover that the infection associated strain was significantly more resistant to hydrogen peroxide and paraquat than the environmental strain. Some 16 genes of the list were of unknown function. We could suggest however that they were likely to belong to surface proteins or considered in priority as important for pathogenicity. Our study proposed a novel approach to identify genes involved in pathogenic processes and provided some insight about the molecular mechanisms leading a commensal inhabitant to become an invasive pathogen.
View details for DOI 10.1186/1471-2164-7-112
View details for Web of Science ID 000238544700001
View details for PubMedID 16684363
- Comparative analysis of whole-genome sequences of Streptococcus suis CHINESE SCIENCE BULLETIN 2006; 51 (10): 1199-1209