Research & Scholarship

Current Research and Scholarly Interests


We study the regulation and evolution of gene expression using a combination of experimental and computational approaches.

Our work brings together quantitative genetics, genomics, epigenetics, and evolutionary biology to achieve a deeper understanding of how genetic variation within and between species affects genome-wide gene expression and ultimately shapes the phenotypic diversity of life.

Teaching

Graduate and Fellowship Programs


  • Biology (School of Humanities and Sciences) (PhD Program)
  • Biomedical Informatics (PhD Program)

Publications

Journal Articles


  • Gene expression drives local adaptation in humans GENOME RESEARCH Fraser, H. B. 2013; 23 (7): 1089-1096

    Abstract

    The molecular basis of adaptation (and, in particular, the relative roles of protein-coding versus gene expression changes) has long been the subject of speculation and debate. Recently, the genotyping of diverse human populations has led to the identification of many putative "local adaptations" that differ between populations. Here I show that these local adaptations are over 10-fold more likely to affect gene expression than amino acid sequence. In addition, a novel framework for identifying polygenic local adaptations detects recent positive selection on the expression levels of genes involved in UV radiation response, immune cell proliferation, and diabetes-related pathways. These results provide the first examples of polygenic gene expression adaptation in humans, as well as the first genome-scale support for the hypothesis that changes in gene expression have driven human adaptation.

    View details for DOI 10.1101/gr.152710.112

    View details for Web of Science ID 000321119900006

    View details for PubMedID 23539138

  • Polygenic cis-regulatory adaptation in the evolution of yeast pathogenicity GENOME RESEARCH Fraser, H. B., Levy, S., Chavan, A., Shah, H. B., Perez, J. C., Zhou, Y., Siegal, M. L., Sinha, H. 2012; 22 (10): 1930-1939

    Abstract

    The acquisition of new genes, via horizontal transfer or gene duplication/diversification, has been the dominant mechanism thus far implicated in the evolution of microbial pathogenicity. In contrast, the role of many other modes of evolution, such as changes in gene expression regulation, remains unknown. A transition to a pathogenic lifestyle has recently taken place in some lineages of the budding yeast Saccharomyces cerevisiae. Here we identify a module of physically interacting proteins involved in endocytosis that has experienced selective sweeps for multiple cis-regulatory mutations that down-regulate gene expression levels in a pathogenic yeast. To test if these adaptations affect virulence, we created a panel of single-allele knockout strains whose hemizygous state mimics the genes' adaptive down-regulations, and measured their virulence in a mammalian host. Despite having no growth advantage in standard laboratory conditions, nearly all of the strains were more virulent than their wild-type progenitor, suggesting that these adaptations likely played a role in the evolution of pathogenicity. Furthermore, genetic variants at these loci were associated with clinical origin across 88 diverse yeast strains, suggesting the adaptations may have contributed to the virulence of a wide range of clinical isolates. We also detected pleiotropic effects of these adaptations on a wide range of morphological traits, which appear to have been mitigated by compensatory mutations at other loci. These results suggest that cis-regulatory adaptation can occur at the level of physically interacting modules and that one such polygenic adaptation led to increased virulence during the evolution of a pathogenic yeast.

    View details for DOI 10.1101/gr.134080.111

    View details for Web of Science ID 000309325900010

    View details for PubMedID 22645260

  • Evidence for widespread adaptive evolution of gene expression in budding yeast PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Fraser, H. B., Moses, A. M., Schadt, E. E. 2010; 107 (7): 2977-2982

    Abstract

    Changes in gene expression have been proposed to underlie many, or even most, adaptive differences between species. Despite the increasing acceptance of this view, only a handful of cases of adaptive gene expression evolution have been demonstrated. To address this discrepancy, we introduce a simple test for lineage-specific selection on gene expression. Applying the test to genome-wide gene expression data from the budding yeast Saccharomyces cerevisiae, we find that hundreds of gene expression levels have been subject to lineage-specific selection. Comparing these findings with independent population genetic evidence of selective sweeps suggests that this lineage-specific selection has resulted in recent sweeps at over a hundred genes, most of which led to increased transcript levels. Examination of the implicated genes revealed a specific biochemical pathway--ergosterol biosynthesis--where the expression of multiple genes has been subject to selection for reduced levels. In sum, these results suggest that adaptive evolution of gene expression is common in yeast, that regulatory adaptation can occur at the level of entire pathways, and that similar genome-wide scans may be possible in other species, including humans.

    View details for DOI 10.1073/pnas.0912245107

    View details for Web of Science ID 000274599500050

    View details for PubMedID 20133628

  • Ancient cis-regulatory constraints and the evolution of genome architecture TRENDS IN GENETICS Irimia, M., Maeso, I., Roy, S. W., Fraser, H. B. 2013; 29 (9): 521-528

    Abstract

    The order of genes along metazoan chromosomes has generally been thought to be largely random, with few implications for organismal function. However, two recent studies, reporting hundreds of pairs of genes that have remained linked in diverse metazoan species over hundreds of millions of years of evolution, suggest widespread functional implications for gene order. These associations appear to largely reflect cis-regulatory constraints, with either (i) multiple genes sharing transcriptional regulatory elements, or (ii) regulatory elements for a developmental gene being found within a neighboring 'bystander' gene (known as a genomic regulatory block). We discuss implications, questions raised, and new research directions arising from these studies, as well as evidence for similar phenomena in other eukaryotic groups.

    View details for DOI 10.1016/j.tig.2013.05.008

    View details for Web of Science ID 000324284000006

    View details for PubMedID 23791467

  • The molecular mechanism of a cis-regulatory adaptation in yeast PLOS GENETICS Chang, J., Zhou, Y., Hu, X., Lam, L., Henry, C., Green, E. M., Kita, R., Kobor, M. S., Fraser, H. B. 2013; 9 (9)

    Abstract

    Despite recent advances in our ability to detect adaptive evolution involving the cis-regulation of gene expression, our knowledge of the molecular mechanisms underlying these adaptations has lagged far behind. Across all model organisms, the causal mutations have been discovered for only a handful of gene expression adaptations, and even for these, mechanistic details (e.g. the trans-regulatory factors involved) have not been determined. We previously reported a polygenic gene expression adaptation involving down-regulation of the ergosterol biosynthesis pathway in the budding yeast Saccharomyces cerevisiae. Here we investigate the molecular mechanism of a cis-acting mutation affecting a member of this pathway, ERG28. We show that the causal mutation is a two-base deletion in the promoter of ERG28 that strongly reduces the binding of two transcription factors, Sok2 and Mot3, thus abolishing their regulation of ERG28. This down-regulation increases resistance to a widely used antifungal drug targeting ergosterol, similar to mutations disrupting this pathway in clinical yeast isolates. The identification of the causal genetic variant revealed that the selection likely occurred after the deletion was already present at high frequency in the population, rather than when it was a new mutation. These results provide a detailed view of the molecular mechanism of a cis-regulatory adaptation, and underscore the importance of this view to our understanding of evolution at the molecular level.

    View details for DOI 10.1371/journal.pgen.1003813

    View details for PubMedID 24068973

  • Differences in enhancer activity in mouse and zebrafish reporter assays are often associated with changes in gene expression BMC GENOMICS Ariza-Cosano, A., Visel, A., Pennacchio, L. A., Fraser, H. B., Luis Gomez-Skarmeta, J., Irimia, M., Bessa, J. 2012; 13

    Abstract

    Phenotypic evolution in animals is thought to be driven in large part by differences in gene expression patterns, which can result from sequence changes in cis-regulatory elements (cis-changes) or from changes in the expression pattern or function of transcription factors (trans-changes). While isolated examples of trans-changes have been identified, the scale of their overall contribution to regulatory and phenotypic evolution remains unclear. Here, we attempt to examine the prevalence of trans-effects and their potential impact on gene expression patterns in vertebrate evolution by comparing the function of identical human tissue-specific enhancer sequences in two highly divergent vertebrate model systems, mouse and zebrafish. Among 47 human conserved non-coding elements (CNEs) tested in transgenic mouse embryos and in stable zebrafish lines, at least one species-specific expression domain was observed in the majority (83%) of cases, and 36% presented dramatically different expression patterns between the two species. Although some of these discrepancies may be due to the use of different transgenesis systems in mouse and zebrafish, in some instances we found an association between differences in enhancer activity and changes in the endogenous gene expression patterns between mouse and zebrafish, suggesting a potential role for trans-changes in the evolution of gene expression. In total, our results: (i) serve as a cautionary tale for studies investigating the role of human enhancers in different model organisms, and (ii) suggest that changes in the trans environment may play a significant role in the evolution of gene expression in vertebrates.

    View details for DOI 10.1186/1471-2164-13-713

    View details for Web of Science ID 000313248200001

    View details for PubMedID 23253453

  • Extensive conservation of ancient microsynteny across metazoans due to cis-regulatory constraints GENOME RESEARCH Irimia, M., Tena, J. J., Alexis, M. S., Fernandez-Minan, A., Maeso, I., Bogdanovic, O., de la Calle-Mustienes, E., Roy, S. W., Gomez-Skarmeta, J. L., Fraser, H. B. 2012; 22 (12): 2356-2367

    Abstract

    The order of genes in eukaryotic genomes has generally been assumed to be neutral, since gene order is largely scrambled over evolutionary time. Only a handful of exceptional examples are known, typically involving deeply conserved clusters of tandemly duplicated genes (e.g., Hox genes and histones). Here we report the first systematic survey of microsynteny conservation across metazoans, utilizing 17 genome sequences. We identified nearly 600 pairs of unrelated genes that have remained tightly physically linked in diverse lineages across over 600 million years of evolution. Integrating sequence conservation, gene expression data, gene function, epigenetic marks, and other genomic features, we provide extensive evidence that many conserved ancient linkages involve (1) the coordinated transcription of neighboring genes, or (2) genomic regulatory blocks (GRBs) in which transcriptional enhancers controlling developmental genes are contained within nearby bystander genes. In addition, we generated ChIP-seq data for key histone modifications in zebrafish embryos, which provided further evidence of putative GRBs in embryonic development. Finally, using chromosome conformation capture (3C) assays and stable transgenic experiments, we demonstrate that enhancers within bystander genes drive the expression of genes such as Otx and Islet, critical regulators of central nervous system development across bilaterians. These results suggest that ancient genomic functional associations are far more common than previously thought (involving ~12% of the ancestral bilaterian genome) and that cis-regulatory constraints are crucial in determining metazoan genome architecture.

    View details for DOI 10.1101/gr.139725.112

    View details for Web of Science ID 000311895500004

    View details for PubMedID 22722344

  • Factors underlying variable DNA methylation in a human community cohort PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Lam, L. L., Emberly, E., Fraser, H. B., Neumann, S. M., Chen, E., Miller, G. E., Kobor, M. S. 2012; 109: 17253-17260

    Abstract

    Epigenetics is emerging as an attractive mechanism to explain the persistent genomic embedding of early-life experiences. Tightly linked to chromatin, which packages DNA into chromosomes, epigenetic marks primarily serve to regulate the activity of genes. DNA methylation is the most accessible and characterized component of the many chromatin marks that constitute the epigenome, making it an ideal target for epigenetic studies in human populations. Here, using peripheral blood mononuclear cells collected from a community-based cohort stratified for early-life socioeconomic status, we measured DNA methylation in the promoter regions of more than 14,000 human genes. Using this approach, we broadly assessed and characterized epigenetic variation, identified some of the factors that sculpt the epigenome, and determined its functional relation to gene expression. We found that the leukocyte composition of peripheral blood covaried with patterns of DNA methylation at many sites, as did demographic factors, such as sex, age, and ethnicity. Furthermore, psychosocial factors, such as perceived stress, and cortisol output were associated with DNA methylation, as was early-life socioeconomic status. Interestingly, we determined that DNA methylation was strongly correlated to the ex vivo inflammatory response of peripheral blood mononuclear cells to stimulation with microbial products that engage Toll-like receptors. In contrast, our work found limited effects of DNA methylation marks on the expression of associated genes across individuals, suggesting a more complex relationship than anticipated.

    View details for DOI 10.1073/pnas.1121249109

    View details for Web of Science ID 000310510500018

    View details for PubMedID 23045638

  • Population-specificity of human DNA methylation GENOME BIOLOGY Fraser, H. B., Lam, L. L., Neumann, S. M., Kobor, M. S. 2012; 13 (2)

    Abstract

    Ethnic differences in human DNA methylation have been shown for a number of CpG sites, but the genome-wide patterns and extent of these differences are largely unknown. In addition, whether the genetic control of polymorphic DNA methylation is population-specific has not been investigated. Here we measure DNA methylation near the transcription start sites of over 14,000 genes in 180 cell lines derived from one African and one European population. We find population-specific patterns of DNA methylation at over a third of all genes. Furthermore, although the methylation at over a thousand CpG sites is heritable, these heritabilities also differ between populations, suggesting extensive divergence in the genetic control of DNA methylation. In support of this, genetic mapping of DNA methylation reveals that most of the population specificity can be explained by divergence in allele frequencies between populations, and that there is little overlap in genetic associations between populations. These population-specific genetic associations are supported by the patterns of DNA methylation in several hundred brain samples, suggesting that they hold in vivo and across tissues. These results suggest that DNA methylation is highly divergent between populations, and that this divergence may be due in large part to a combination of differences in allele frequencies and complex epistasis or gene × environment interactions.

    View details for DOI 10.1186/gb-2012-13-2-r8

    View details for Web of Science ID 000305391700001

    View details for PubMedID 22322129

  • Genome-wide approaches to the study of adaptive gene expression evolution: Systematic studies of evolutionary adaptations involving gene expression will allow many fundamental questions in evolutionary biology to be addressed BIOESSAYS Fraser, H. B. 2011; 33 (6): 469-477

    Abstract

    The role of gene expression in evolutionary adaptation has been a subject of debate for over 40 years. cis-regulation of transcription has been proposed to be the primary source of morphological novelty in evolution, though this is based on only a handful of examples. Recently the first genome-wide studies of gene expression adaptation have been published, giving us an initial global view of this process. Systematic studies such as these will allow a number of key questions currently facing the field of gene expression evolution to be addressed.

    View details for DOI 10.1002/bies.201000094

    View details for Web of Science ID 000291548300012

    View details for PubMedID 21538412

  • Systematic Detection of Polygenic cis-Regulatory Evolution PLOS GENETICS Fraser, H. B., Babak, T., Tsang, J., Zhou, Y., Zhang, B., Mehrabian, M., Schadt, E. E. 2011; 7 (3)

    Abstract

    The idea that most morphological adaptations can be attributed to changes in the cis-regulation of gene expression levels has been gaining increasing acceptance, despite the fact that only a handful of such cases have so far been demonstrated. Moreover, because each of these cases involves only one gene, we lack any understanding of how natural selection may act on cis-regulation across entire pathways or networks. Here we apply a genome-wide test for selection on cis-regulation to two subspecies of the mouse Mus musculus. We find evidence for lineage-specific selection at over 100 genes involved in diverse processes such as growth, locomotion, and memory. These gene sets implicate candidate genes that are supported by both quantitative trait loci and a validated causality-testing framework, and they predict a number of phenotypic differences, which we confirm in all four cases tested. Our results suggest that gene expression adaptation is widespread and that these adaptations can be highly polygenic, involving cis-regulatory changes at numerous functionally related genes. These coordinated adaptations may contribute to divergence in a wide range of morphological, physiological, and behavioral phenotypes.

    View details for DOI 10.1371/journal.pgen.1002023

    View details for Web of Science ID 000288996600053

    View details for PubMedID 21483757

  • Genetic validation of whole-transcriptome sequencing for mapping expression affected by cis-regulatory variation BMC GENOMICS Babak, T., Garrett-Engele, P., Armour, C. D., Raymond, C. K., Keller, M. P., Chen, R., Rohl, C. A., Johnson, J. M., Attie, A. D., Fraser, H. B., Schadt, E. E. 2010; 11

    Abstract

    Identifying associations between genotypes and gene expression levels using microarrays has enabled systematic interrogation of regulatory variation underlying complex phenotypes. This approach has vast potential for functional characterization of disease states, but its prohibitive cost, given hundreds to thousands of individual samples from populations have to be genotyped and expression profiled, has limited its widespread application. Here we demonstrate that genomic regions with allele-specific expression (ASE) detected by sequencing cDNA are highly enriched for cis-acting expression quantitative trait loci (cis-eQTL) identified by profiling of 500 animals in parallel, with up to 90% agreement on the allele that is preferentially expressed. We also observed widespread noncoding and antisense ASE and identified several allele-specific alternative splicing variants. Monitoring ASE by sequencing cDNA from as little as one sample is a practical alternative to expression genetics for mapping cis-acting variation that regulates RNA transcription and processing.

    View details for DOI 10.1186/1471-2164-11-473

    View details for Web of Science ID 000282789200002

    View details for PubMedID 20707912

  • The Quantitative Genetics of Phenotypic Robustness PLOS ONE Fraser, H. B., Schadt, E. E. 2010; 5 (1)

    Abstract

    Phenotypic robustness, or canalization, has been extensively investigated both experimentally and theoretically. However, it remains unknown to what extent robustness varies between individuals, and whether factors buffering environmental variation also buffer genetic variation. Here we introduce a quantitative genetic approach to these issues, and apply this approach to data from three species. In mice, we find suggestive evidence that for hundreds of gene expression traits, robustness is polymorphic and can be genetically mapped to discrete genomic loci. Moreover, we find that the polymorphisms buffering genetic variation are distinct from those buffering environmental variation. In fact, these two classes have quite distinct mechanistic bases: environmental buffers of gene expression are predominantly sex-specific and trans-acting, whereas genetic buffers are not sex-specific and often cis-acting. Data from studies of morphological and life-history traits in plants and yeast support the distinction between polymorphisms buffering genetic and environmental variation, and further suggest that loci buffering different types of environmental variation do overlap with one another. These preliminary results suggest that naturally occurring polymorphisms affecting phenotypic robustness could be abundant, and that these polymorphisms may generally buffer either genetic or environmental variation, but not both.

    View details for DOI 10.1371/journal.pone.0008635

    View details for Web of Science ID 000273414200013

    View details for PubMedID 20072615

  • Common polymorphic transcript variation in human disease GENOME RESEARCH Fraser, H. B., Xie, X. 2009; 19 (4): 567-575

    Abstract

    Most human genes are thought to express different transcript isoforms in different cell types; however, the full extent and functional consequences of polymorphic transcript variation (PTV), which differ between individuals within the same cell type, are unknown. Here we show that PTV is widespread in B-cells from two human populations. Tens of thousands of exons were found to be polymorphically expressed in a heritable fashion, and over 1000 of these showed strong correlations with single nucleotide polymorphism (SNP) genotypes in cis. The SNPs associated with PTV display signs of having been subject to recent positive selection in humans, and they are also highly enriched for SNPs implicated by recent genome-wide association studies of four autoimmune diseases. From this disease-association overlap, we infer that PTV is the likely mechanism by which eight common polymorphisms contribute to disease risk. A catalog of PTV will be a valuable resource for interpreting results from future disease-association studies and understanding the spectrum of phenotypic differences among humans.

    View details for DOI 10.1101/gr.083477.108

    View details for Web of Science ID 000264781900005

    View details for PubMedID 19189928

  • Ab initio construction of a eukaryotic transcriptome by massively parallel mRNA sequencing PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Yassour, M., Kaplan, T., Fraser, H. B., Levin, J. Z., Pfiffner, J., Adiconis, X., Schroth, G., Luo, S., Khrebtukova, I., Gnirke, A., Nusbaum, C., Thompson, D., Friedman, N., Regev, A. 2009; 106 (9): 3264-3269

    Abstract

    Defining the transcriptome, the repertoire of transcribed regions encoded in the genome, is a challenging experimental task. Current approaches, relying on sequencing of ESTs or cDNA libraries, are expensive and labor-intensive. Here, we present a general approach for ab initio discovery of the complete transcriptome of the budding yeast, based only on the unannotated genome sequence and millions of short reads from a single massively parallel sequencing run. Using novel algorithms, we automatically construct a highly accurate transcript catalog. Our approach automatically and fully defines 86% of the genes expressed under the given conditions, and discovers 160 previously undescribed transcription units of 250 bp or longer. It correctly demarcates the 5' and 3' UTR boundaries of 86 and 77% of expressed genes, respectively. The method further identifies 83% of known splice junctions in expressed genes, and discovers 25 previously uncharacterized introns, including 2 cases of condition-dependent intron retention. Our framework is applicable to poorly understood organisms, and can lead to greater understanding of the transcribed elements in an explored genome.

    View details for DOI 10.1073/pnas.0812841106

    View details for Web of Science ID 000263844100053

    View details for PubMedID 19208812

  • Confirmation of organized modularity in the yeast interactome PLOS BIOLOGY Bertin, N., Simonis, N., Dupuy, D., Cusick, M. E., Han, J. J., Fraser, H. B., Roth, F. P., Vidal, M. 2007; 5 (6): 1206-1210

    View details for DOI 10.1371/journal.pbio.0050153

    View details for Web of Science ID 000247173200005

    View details for PubMedID 17564493

  • Assessing the determinants of evolutionary rates in the presence of noise MOLECULAR BIOLOGY AND EVOLUTION Plotkin, J. B., Fraser, H. B. 2007; 24 (5): 1113-1121

    Abstract

    Although protein sequences are known to evolve at vastly different rates, little is known about what determines their rate of evolution. However, a recent study using principal component regression (PCR) has concluded that evolutionary rates in yeast are primarily governed by a single determinant related to translation frequency. Here, we demonstrate that noise in biological data can confound PCRs, leading to spurious conclusions. When equalizing noise levels across 7 predictor variables used in previous studies, we find no evidence that protein evolution is dominated by a single determinant. Our results indicate that a variety of factors--including expression level, gene dispensability, and protein-protein interactions--may independently affect evolutionary rates in yeast. More accurate measurements or more sophisticated statistical techniques will be required to determine which one, if any, of these factors dominates protein evolution.

    View details for DOI 10.1093/molbev/msm044

    View details for Web of Science ID 000246802400004

    View details for PubMedID 17347158

  • Using protein complexes to predict phenotypic effects of gene mutation GENOME BIOLOGY Fraser, H. B., Plotkin, J. B. 2007; 8 (11)

    Abstract

    Predicting the phenotypic effects of mutations is a central goal of genetics research; it has important applications in elucidating how genotype determines phenotype and in identifying human disease genes. Using a wide range of functional genomic data from the yeast Saccharomyces cerevisiae, we show that the best predictor of a protein's knockout phenotype is the knockout phenotype of other proteins that are present in a protein complex with it. Even the addition of multiple datasets does not improve upon the predictions made from protein complex membership. Similarly, we find that a proxy for protein complexes is a powerful predictor of disease phenotypes in humans. We propose that identifying human protein complexes containing known disease genes will be an efficient method for large-scale disease gene discovery, and that yeast may prove to be an informative model system for investigating, and even predicting, the genetic basis of both Mendelian and complex disease phenotypes.

    View details for DOI 10.1186/gb-2007-8-11-r252

    View details for Web of Science ID 000252101100026

    View details for PubMedID 18042286

  • Coevolution, modularity and human disease CURRENT OPINION IN GENETICS & DEVELOPMENT Fraser, H. B. 2006; 16 (6): 637-644

    Abstract

    The concepts of coevolution and modularity have been studied separately for decades. Recent advances in genomics have led to the first systematic studies in each of these fields at the molecular level, resulting in several important discoveries. Both coevolution and modularity appear to be pervasive features of genomic data from all species studied to date, and their presence can be detected in many types of datasets, including genome sequences, gene expression data, and protein-protein interaction data. Moreover, the combination of these two ideas might have implications for our understanding of many aspects of biology, ranging from the general architecture of living systems to the causes of various human diseases.

    View details for DOI 10.1016/j.gde.2006.09.001

    View details for Web of Science ID 000242647400016

    View details for PubMedID 17005391

  • Codon usage and selection on proteins JOURNAL OF MOLECULAR EVOLUTION Plotkin, J. B., Dushoff, J., Desai, M. M., Fraser, H. B. 2006; 63 (5): 635-653

    Abstract

    Selection pressures on proteins are usually measured by comparing homologous nucleotide sequences (Zuckerkandl and Pauling 1965). Recently we introduced a novel method, termed volatility, to estimate selection pressures on proteins on the basis of their synonymous codon usage (Plotkin and Dushoff 2003; Plotkin et al. 2004). Here we provide a theoretical foundation for this approach. Under the Fisher-Wright model, we derive the expected frequencies of synonymous codons as a function of the strength of selection on amino acids, the mutation rate, and the effective population size. We analyze the conditions under which we can expect to draw inferences from biased codon usage, and we estimate the time scales required to establish and maintain such a signal. We find that synonymous codon usage can reliably distinguish between negative selection and neutrality only for organisms, such as some microbes, that experience large effective population sizes or periods of elevated mutation rates. The power of volatility to detect positive selection is also modest--requiring approximately 100 selected sites--but it depends less strongly on population size. We show that phenomena such as transient hyper-mutators can improve the power of volatility to detect selection, even when the neutral site heterozygosity is low. We also discuss several confounding factors, neglected by the Fisher-Wright model, that may limit the applicability of volatility in practice.

    View details for DOI 10.1007/s00239-005-0233-x

    View details for Web of Science ID 000242014800006

    View details for PubMedID 17043750

  • Estimating selection pressures from limited comparative data MOLECULAR BIOLOGY AND EVOLUTION Plotkin, J. B., Dushoff, J., Desai, M. M., Fraser, H. B. 2006; 23 (8): 1457-1459

    Abstract

    We recently introduced a novel method for estimating selection pressures on proteins, termed "volatility," which requires only a single genome sequence. Some criticisms that have been levied against this approach are valid, but many others are based on misconceptions of volatility, or they apply equally to comparative methods of estimating selection. Here, we introduce a simple regression technique for estimating selection pressures on all proteins in a genome, on the basis of limited comparative data. The regression technique does not depend on an underlying population-genetic mechanism. This new approach to estimating selection across a genome should be more powerful and more widely applicable than volatility itself.

    View details for DOI 10.1093/molbev/msl021

    View details for Web of Science ID 000239281200001

    View details for PubMedID 16754640

  • Aging and gene expression in the primate brain PLOS BIOLOGY Fraser, H. B., Khaitovich, P., Plotkin, J. B., Paabo, S., Eisen, M. B. 2005; 3 (9): 1653-1661

    Abstract

    It is well established that gene expression levels in many organisms change during the aging process, and the advent of DNA microarrays has allowed genome-wide patterns of transcriptional changes associated with aging to be studied in both model organisms and various human tissues. Understanding the effects of aging on gene expression in the human brain is of particular interest, because of its relation to both normal and pathological neurodegeneration. Here we show that human cerebral cortex, human cerebellum, and chimpanzee cortex each undergo different patterns of age-related gene expression alterations. In humans, many more genes undergo consistent expression changes in the cortex than in the cerebellum; in chimpanzees, many genes change expression with age in cortex, but the pattern of changes in expression bears almost no resemblance to that of human cortex. These results demonstrate the diversity of aging patterns present within the human brain, as well as how rapidly genome-wide patterns of aging can evolve between species; they may also have implications for the oxidative free radical theory of aging, and help to improve our understanding of human neurodegenerative diseases.

    View details for DOI 10.1371/journal.pbio.0030274

    View details for Web of Science ID 000231820900016

    View details for PubMedID 16048372

  • Sum1p, the origin recognition complex, and the spreading of a promoter-specific repressor in Saccharomyces cerevisiae MOLECULAR AND CELLULAR BIOLOGY Lynch, P. J., Fraser, H. B., Sevastopoulos, E., Rine, J., Rusche, L. N. 2005; 25 (14): 5920-5932

    Abstract

    In Saccharomyces cerevisiae, Sum1p is a promoter-specific repressor. A single amino acid change generates the mutant Sum1-1p, which causes regional silencing at new loci where wild-type Sum1p does not act. Thus, Sum1-1p is a model for understanding how the spreading of repressive chromatin is regulated. When wild-type Sum1p was targeted to a locus where mutant Sum1-1p spreads, wild-type Sum1p did not spread as efficiently as mutant Sum1-1p did, despite being in the same genomic context. Thus, the SUM1-1 mutation altered the ability of the protein to spread. The spreading of Sum1-1p required both an enzymatically active deacetylase, Hst1p, and the N-terminal tail of histone H4, consistent with the spreading of Sum1-1p involving sequential modification of and binding to histone tails, as observed for other silencing proteins. Furthermore, deletion of the N-terminal tail of H4 caused Sum1-1p to return to loci where wild-type Sum1p acts, consistent with the SUM1-1 mutation increasing the affinity of the protein for H4 tails. These results imply that the spreading of repressive chromatin proteins is regulated by their affinities for histone tails. Finally, this study uncovered a functional connection between wild-type Sum1p and the origin recognition complex, and this relationship also contributes to mutant Sum1-1p localization.

    View details for DOI 10.1128/MCB.25.14.5920-5932.2005

    View details for Web of Science ID 000230267000012

    View details for PubMedID 15988008

  • Functional genomic analysis of the rates of protein evolution PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Wall, D. P., Hirsh, A. E., Fraser, H. B., Kumm, J., Giaever, G., Eisen, M. B., Feldman, M. W. 2005; 102 (15): 5483-5488

    Abstract

    The evolutionary rates of proteins vary over several orders of magnitude. Recent work suggests that analysis of large data sets of evolutionary rates in conjunction with the results from high-throughput functional genomic experiments can identify the factors that cause proteins to evolve at such dramatically different rates. To this end, we estimated the evolutionary rates of >3,000 proteins in four species of the yeast genus Saccharomyces and investigated their relationship with levels of expression and protein dispensability. Each protein's dispensability was estimated by the growth rate of mutants deficient for the protein. Our analyses of these improved evolutionary and functional genomic data sets yield three main results. First, dispensability and expression have independent, significant effects on the rate of protein evolution. Second, measurements of expression levels in the laboratory can be used to filter data sets of dispensability estimates, removing variates that are unlikely to reflect real biological effects. Third, structural equation models show that although we may reasonably infer that dispensability and expression have significant effects on protein evolutionary rate, we cannot yet accurately estimate the relative strengths of these effects.

    View details for DOI 10.1073/pnas.0501761102

    View details for Web of Science ID 000228376600036

    View details for PubMedID 15800036

  • Modularity and evolutionary constraint on proteins NATURE GENETICS Fraser, H. B. 2005; 37 (4): 351-352

    Abstract

    Modularity, which has been found in the functional and physical protein interaction networks of many organisms, has been postulated to affect both the mode and tempo of evolution. Here I show that in the yeast Saccharomyces cerevisiae, protein interaction hubs situated in single modules are highly constrained, whereas those connecting different modules are more plastic. This pattern of change could reflect a tendency for evolutionary innovations to occur by altering the proteins and interactions between rather than within modules, in a manner somewhat similar to the evolution of new proteins through the shuffling of conserved protein domains.

    View details for DOI 10.1038/ng1530

    View details for Web of Science ID 000228040000016

    View details for PubMedID 15750592

  • Adjusting for selection on synonymous sites in estimates of evolutionary distance MOLECULAR BIOLOGY AND EVOLUTION Hirsh, A. E., Fraser, H. B., Wall, D. P. 2005; 22 (1): 174-177

    Abstract

    Evolution at silent sites is often used to estimate the pace of selectively neutral processes or to infer differences in divergence times of genes. However, silent sites are subject to selection in favor of preferred codons, and the strength of such selection varies dramatically across genes. Here, we use the relationship between codon bias and synonymous divergence observed in four species of the genus Saccharomyces to provide a simple correction for selection on silent sites.

    View details for DOI 10.1093/molbev/msh265

    View details for Web of Science ID 000225730100018

    View details for PubMedID 15371530

  • Conservation and evolution of cis-regulatory systems in ascomycete fungi PLOS BIOLOGY Gasch, A. P., Moses, A. M., Chiang, D. Y., Fraser, H. B., Berardini, M., Eisen, M. B. 2004; 2 (12): 2202-2219

    Abstract

    Relatively little is known about the mechanisms through which gene expression regulation evolves. To investigate this, we systematically explored the conservation of regulatory networks in fungi by examining the cis-regulatory elements that govern the expression of coregulated genes. We first identified groups of coregulated Saccharomyces cerevisiae genes enriched for genes with known upstream or downstream cis-regulatory sequences. Reasoning that many of these gene groups are coregulated in related species as well, we performed similar analyses on orthologs of coregulated S. cerevisiae genes in 13 other ascomycete species. We find that many species-specific gene groups are enriched for the same flanking regulatory sequences as those found in the orthologous gene groups from S. cerevisiae, indicating that those regulatory systems have been conserved in multiple ascomycete species. In addition to these clear cases of regulatory conservation, we find examples of cis-element evolution that suggest multiple modes of regulatory diversification, including alterations in transcription factor-binding specificity, incorporation of new gene targets into an existing regulatory system, and cooption of regulatory systems to control a different set of genes. We investigated one example in greater detail by measuring the in vitro activity of the S. cerevisiae transcription factor Rpn4p and its orthologs from Candida albicans and Neurospora crassa. Our results suggest that the DNA binding specificity of these proteins has coevolved with the sequences found upstream of the Rpn4p target genes and suggest that Rpn4p has a different function in N. crassa.

    View details for DOI 10.1371/journal.pbio.0020398

    View details for Web of Science ID 000226099600021

    View details for PubMedID 15534694

  • Coevolution of gene expression among interacting proteins PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Fraser, H. B., Hirsh, A. E., Wall, D. P., Eisen, M. B. 2004; 101 (24): 9033-9038

    Abstract

    Physically interacting proteins or parts of proteins are expected to evolve in a coordinated manner that preserves proper interactions. Such coevolution at the amino acid-sequence level is well documented and has been used to predict interacting proteins, domains, and amino acids. Interacting proteins are also often precisely coexpressed with one another, presumably to maintain proper stoichiometry among interacting components. Here, we show that the expression levels of physically interacting proteins coevolve. We estimate average expression levels of genes from four closely related fungi of the genus Saccharomyces using the codon adaptation index and show that expression levels of interacting proteins exhibit coordinated changes in these different species. We find that this coevolution of expression is a more powerful predictor of physical interaction than is coevolution of amino acid sequence. These results demonstrate that gene expression levels can coevolve, adding another dimension to the study of the coevolution of interacting proteins and underscoring the importance of maintaining coexpression of interacting proteins over evolutionary time. Our results also suggest that expression coevolution can be used for computational prediction of protein-protein interactions.

    View details for DOI 10.1073/pnas.0402591101

    View details for Web of Science ID 000222104900038

    View details for PubMedID 15175431

  • Noise minimization in eukaryotic gene expression PLOS BIOLOGY Fraser, H. B., Hirsh, A. E., Giaever, G., Kumm, J., Eisen, M. B. 2004; 2 (6): 834-838

    Abstract

    All organisms have elaborate mechanisms to control rates of protein production. However, protein production is also subject to stochastic fluctuations, or "noise." Several recent studies in Saccharomyces cerevisiae and Escherichia coli have investigated the relationship between transcription and translation rates and stochastic fluctuations in protein levels, or more generally, how such randomness is a function of intrinsic and extrinsic factors. However, the fundamental question of whether stochasticity in protein expression is generally biologically relevant has not been addressed, and it remains unknown whether random noise in the protein production rate of most genes significantly affects the fitness of any organism. We propose that organisms should be particularly sensitive to variation in the protein levels of two classes of genes: genes whose deletion is lethal to the organism and genes that encode subunits of multiprotein complexes. Using an experimentally verified model of stochastic gene expression in S. cerevisiae, we estimate the noise in protein production for nearly every yeast gene, and confirm our prediction that the production of essential and complex-forming proteins involves lower levels of noise than does the production of most other genes. Our results support the hypothesis that noise in gene expression is a biologically important variable, is generally detrimental to organismal fitness, and is subject to natural selection.

    View details for DOI 10.1371/journal.pbio.0020137

    View details for Web of Science ID 000222380400022

    View details for PubMedID 15124029

  • Evolutionary rate depends on number of protein-protein interactions independently of gene expression level BMC EVOLUTIONARY BIOLOGY Fraser, H. B., Hirsh, A. E. 2004; 4

    Abstract

    Whether or not a protein's number of physical interactions with other proteins plays a role in determining its rate of evolution has been a contentious issue. A recent analysis suggested that the observed correlation between number of interactions and evolutionary rate may be due to experimental biases in high-throughput protein interaction data sets. The number of interactions per protein, as measured by some protein interaction data sets, shows no correlation with evolutionary rate. Other data sets, however, do reveal a relationship. Furthermore, even when experimental biases of these data sets are taken into account, a real correlation between number of interactions and evolutionary rate appears to exist. A strong and significant correlation between a protein's number of interactions and evolutionary rate is apparent for interaction data from some studies. The extremely low agreement between different protein interaction data sets indicates that interaction data are still of low coverage and/or quality. These limitations may explain why some data sets reveal no correlation with evolutionary rates.

    View details for Web of Science ID 000222014000001

    View details for PubMedID 15165289

  • Detecting selection using a single genome sequence of M-tuberculosis and P-falciparum NATURE Plotkin, J. B., Dushoff, J., Fraser, H. B. 2004; 428 (6986): 942-945

    Abstract

    Selective pressures on proteins are usually measured by comparing nucleotide sequences. Here we introduce a method to detect selection on the basis of a single genome sequence. We catalogue the relative strength of selection on each gene in the entire genomes of Mycobacterium tuberculosis and Plasmodium falciparum. Our analysis confirms that most antigens are under strong selection for amino-acid substitutions, particularly the PE/PPE family of putative surface proteins in M. tuberculosis and the EMP1 family of cytoadhering surface proteins in P. falciparum. We also identify many uncharacterized proteins that are under strong selection in each pathogen. We provide a genome-wide analysis of natural selection acting on different stages of an organism's life cycle: genes expressed in the ring stage of P. falciparum are under stronger positive selection than those expressed in other stages of the parasite's life cycle. Our method of estimating selective pressures requires far fewer data than comparative sequence analysis, and it measures selection across an entire genome; the method can readily be applied to a large range of sequenced organisms.

    View details for DOI 10.1038/nature02458

    View details for Web of Science ID 000221083000041

    View details for PubMedID 15118727

  • Detecting putative orthologs BIOINFORMATICS Wall, D. P., Fraser, H. B., Hirsh, A. E. 2003; 19 (13): 1710-1711

    Abstract

    We developed an algorithm that improves upon the common procedure of taking reciprocal best blast hits (rbh) in the identification of orthologs. The method, the reciprocal smallest distance algorithm (rsd), relies on global sequence alignment and maximum likelihood estimation of evolutionary distances to detect orthologs between two genomes. rsd finds many putative orthologs missed by rbh because it is less likely than rbh to be misled by the presence of a close paralog.

    View details for DOI 10.1093/bioinformatics/btg213

    View details for Web of Science ID 000185310600016

    View details for PubMedID 15593400

  • A simple dependence between protein evolution rate and the number of protein-protein interactions BMC EVOLUTIONARY BIOLOGY Fraser, H. B., Wall, D. P., Hirsh, A. E. 2003; 3

    Abstract

    It has been shown for an evolutionarily distant genomic comparison that the number of protein-protein interactions a protein has correlates negatively with their rates of evolution. However, the generality of this observation has recently been challenged. Here we examine the problem using protein-protein interaction data from the yeast Saccharomyces cerevisiae and genome sequences from two other yeast species. In contrast to a previous study that used an incomplete set of protein-protein interactions, we observed a highly significant correlation between number of interactions and evolutionary distance to either Candida albicans or Schizosaccharomyces pombe. This study differs from the previous one in that it includes all known protein interactions from S. cerevisiae, and a larger set of protein evolutionary rates. In both evolutionary comparisons, a simple monotonic relationship was found across the entire range of the number of protein-protein interactions. In agreement with our earlier findings, this relationship cannot be explained by the fact that proteins with many interactions tend to be important to yeast. The generality of these correlations in other kingdoms of life unfortunately cannot be addressed at this time, due to the incompleteness of protein-protein interaction data from organisms other than S. cerevisiae. Protein-protein interactions tend to slow the rate at which proteins evolve. This may be due to structural constraints that must be met to maintain interactions, but more work is needed to definitively establish the mechanism(s) behind the correlations we have observed.

    View details for Web of Science ID 000188122100011

    View details for PubMedID 12769820

  • Evolutionary rate in the protein interaction network SCIENCE Fraser, H. B., Hirsh, A. E., Steinmetz, L. M., Scharfe, C., Feldman, M. W. 2002; 296 (5568): 750-752

    Abstract

    High-throughput screens have begun to reveal the protein interaction network that underpins most cellular functions in the yeast Saccharomyces cerevisiae. How the organization of this network affects the evolution of the proteins that compose it is a fundamental question in molecular evolution. We show that the connectivity of well-conserved proteins in the network is negatively correlated with their rate of evolution. Proteins with more interactors evolve more slowly not because they are more important to the organism, but because a greater proportion of the protein is directly involved in its function. At sites important for interaction between proteins, evolutionary changes may occur largely by coevolution, in which substitutions in one protein result in selection pressure for reciprocal changes in interacting partners. We confirm one predicted outcome of this process, namely that interacting proteins evolve at similar rates.

    View details for Web of Science ID 000175281700060

    View details for PubMedID 11976460

  • Explaining mortality rate plateaus PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Weitz, J. S., Fraser, H. B. 2001; 98 (26): 15383-15386

    Abstract

    We propose a stochastic model of aging to explain deviations from exponential growth in mortality rates commonly observed in empirical studies. Mortality rate plateaus are explained as a generic consequence of considering death in terms of first passage times for processes undergoing a random walk with drift. Simulations of populations with age-dependent distributions of viabilities agree with a wide array of experimental results. The influence of cohort size is well accounted for by the stochastic nature of the model.

    View details for Web of Science ID 000172848800114

    View details for PubMedID 11752476

  • Protein dispensability and rate of evolution NATURE Hirsh, A. E., Fraser, H. B. 2001; 411 (6841): 1046-1049

    Abstract

    If protein evolution is due in large part to slightly deleterious amino acid substitutions, then the rate of evolution should be greater in proteins that contribute less to individual fitness. The rationale for this prediction is that relatively dispensable proteins should be subject to weaker purifying selection, and should therefore accumulate mildly deleterious substitutions more rapidly. Although this argument was presented over twenty years ago, and is fundamental to many applications of evolutionary theory, the prediction has proved difficult to confirm. In fact, a recent study showed that essential mouse genes do not evolve more slowly than non-essential ones. Thus, although a variety of factors influencing the rate of protein evolution have been supported by extensive sequence analysis, the relationship between protein dispensability and evolutionary rate has remained unconfirmed. Here we use the results from a highly parallel growth assay of single gene deletions in yeast to assess protein dispensability, which we relate to evolutionary rate estimates that are based on comparisons of sequences drawn from twenty-one fully annotated genomes. Our analysis reveals a highly significant relationship between protein dispensability and evolutionary rate, and explains why this relationship is not detectable by categorical comparison of essential versus non-essential proteins. The relationship is highly conserved, so that protein dispensability in yeast is also predictive of evolutionary rate in a nematode worm.

    View details for Web of Science ID 000169528500047

    View details for PubMedID 11429604
