The architecture of an empirical genotype-phenotype map.
Evolution; international journal of organic evolution
Metabolic Determinants of Enzyme Evolution in a Genome-Scale Bacterial Metabolic Network.
Genome biology and evolution
2018; 10 (11): 3076-88
Recent advances in high-throughput technologies are bringing the study of empirical genotype-phenotype (GP) maps to the fore. Here, we use data from protein binding microarrays to study an empirical GP map of transcription factor (TF) binding preferences. In this map, each genotype is a DNA sequence. The phenotype of this DNA sequence is its ability to bind one or more TFs. We study this GP map using genotype networks, in which nodes represent genotypes with the same phenotype, and edges connect nodes if their genotypes differ by a single small mutation. We describe the structure and arrangement of genotype networks within the space of all possible binding sites for 525 TFs from three eukaryotic species encompassing three kingdoms of life (animal, plant, and fungi). We thus provide a high-resolution depiction of the architecture of an empirical GP map. Among a number of findings, we show that these genotype networks are "small-world" and assortative, and that they ubiquitously overlap and interface with one another. We also use polymorphism data from Arabidopsis thaliana to show how genotype network structure influences the evolution of TF binding sites in vivo. We discuss our findings in the context of regulatory evolution.
View details for DOI 10.1111/evo.13487
View details for PubMedID 29676774
A thousand empirical adaptive landscapes and their navigability.
Nature ecology & evolution
2017; 1 (2): 45
Different genes and proteins evolve at very different rates. To identify the factors that explain these differences is an important aspect of research in molecular evolution. One such factor is the role a protein plays in a large molecular network. Here, we analyze the evolutionary rates of enzyme-coding genes in the genome-scale metabolic network of Escherichia coli to find the evolutionary constraints imposed by the structure and function of this complex metabolic system. Central and highly connected enzymes appear to evolve more slowly than less connected enzymes, but we find that they do so as a by-product of their high abundance, and not because of their position in the metabolic network. In contrast, enzymes catalyzing reactions with high metabolic flux (high substrate-to-product conversion rates) evolve slowly even after we account for their abundance. Moreover, enzymes catalyzing reactions that are difficult to bypass through alternative pathways, such that they are essential in many different genetic backgrounds, also evolve more slowly. Our analyses show that an enzyme's role in the function of a metabolic network affects its evolution more than its place in the network's structure. They highlight the value of a system-level perspective for studies of molecular evolution.
View details for DOI 10.1093/gbe/evy234
View details for PubMedID 30351420
High mutation rates limit evolutionary adaptation in Escherichia coli.
2018; 14 (4): e1007324
The adaptive landscape is an iconic metaphor that pervades evolutionary biology. It was mostly applied in theoretical models until recent years, when empirical data began to allow partial landscape reconstructions. Here, we exhaustively analyse 1,137 complete landscapes from 129 eukaryotic species, each describing the binding affinity of a transcription factor to all possible short DNA sequences. We find that the navigability of these landscapes through single mutations is intermediate to that of additive and shuffled null models, suggesting that binding affinity, and thereby gene expression, is readily fine-tuned via mutations in transcription factor binding sites. The landscapes have few peaks that vary in their accessibility and in the number of sequences they contain. Binding sites in the mouse genome are enriched in sequences found in the peaks of especially navigable landscapes and the genetic diversity of binding sites in yeast increases with the number of sequences in a peak. Our findings suggest that landscape navigability may have contributed to the enormous success of transcriptional regulation as a source of evolutionary adaptations and innovations.
View details for DOI 10.1038/s41559-016-0045
View details for PubMedID 28812623
Drosophila Nnf1 paralogs are partially redundant for somatic and germ line kinetochore function.
2017; 126 (1): 145-163
Mutation is fundamental to evolution, because it generates the genetic variation on which selection can act. In nature, genetic changes often increase the mutation rate in systems that range from viruses and bacteria to human tumors. Such an increase promotes the accumulation of frequent deleterious or neutral alleles, but it can also increase the chances that a population acquires rare beneficial alleles. Here, we study how up to 100-fold increases in Escherichia coli's genomic mutation rate affect adaptive evolution. To do so, we evolved multiple replicate populations of asexual E. coli strains engineered to have four different mutation rates for 3000 generations in the laboratory. We measured the ability of evolved populations to grow in their original environment and in more than 90 novel chemical environments. In addition, we subjected the populations to whole genome population sequencing. Although populations with higher mutation rates accumulated greater genetic diversity, this diversity conveyed benefits only for modestly increased mutation rates, where populations adapted faster and also thrived better than their ancestors in some novel environments. In contrast, some populations at the highest mutation rates showed reduced adaptation during evolution, and failed to thrive in all of the 90 alternative environments. In addition, they experienced a dramatic decrease in mutation rate. Our work demonstrates that the mutation rate changes the global balance between deleterious and beneficial mutational effects on fitness. In contrast to most theoretical models, our experiments suggest that this tipping point already occurs at the modest mutation rates that are found in the wild.
View details for DOI 10.1371/journal.pgen.1007324
View details for PubMedID 29702649
The Molecular Chaperone DnaK Is a Source of Mutational Robustness
GENOME BIOLOGY AND EVOLUTION
2016; 8 (9): 2979-2991
Kinetochores allow attachment of chromosomes to spindle microtubules. Moreover, they host proteins that permit correction of erroneous attachments and prevent premature anaphase onset before bi-orientation of all chromosomes in metaphase has been achieved. Kinetochores are assembled from subcomplexes. Kinetochore proteins as well as the underlying centromere proteins and the centromeric DNA sequences evolve rapidly despite their fundamental importance for faithful chromosome segregation during mitotic and meiotic divisions. During evolution of Drosophila melanogaster, several centromere proteins were lost and a recent gene duplication has resulted in two Nnf1 paralogs, Nnf1a and Nnf1b, which code for alternative forms of a Mis12 kinetochore complex component. The rapid evolutionary divergence of centromere/kinetochore constituents in animals and plants has been proposed to be driven by an intragenome conflict resulting from centromere drive during female meiosis. Thus, a female meiosis-specific paralog might be expected to evolve rapidly under positive selection. While our characterization of the D. melanogaster Nnf1 paralogs hints at some partial functional specialization of Nnf1b for meiosis, we have failed to detect evidence for positive selection in our analysis of Nnf1 sequence evolution in the Drosophilid lineage. Neither paralog is essential, even though we find some clear differences in subcellular localization and expression during development. Loss of both paralogs results in developmental lethality. We therefore conclude that the two paralogs are still in early stages of differentiation.
View details for DOI 10.1007/s00412-016-0579-4
View details for PubMedID 26892014
Genonets server: a web server for the construction, analysis and visualization of genotype networks
NUCLEIC ACIDS RESEARCH
2016; 44 (W1): W70-W76
Molecular chaperones, also known as heat-shock proteins, refold misfolded proteins and help other proteins reach their native conformation. Thanks to these abilities, some chaperones, such as the Hsp90 protein or the chaperonin GroEL, can buffer the deleterious phenotypic effects of mutations that alter protein structure and function. Hsp70 chaperones use a chaperoning mechanism different from that of Hsp90 and GroEL, and it is not known whether they can also buffer mutations. Here, we show that they can. To this end, we performed a mutation accumulation experiment in Escherichia coli, followed by whole-genome resequencing. Overexpression of the Hsp70 chaperone DnaK helps cells cope with mutational load and completely avoid the extinctions we observe in lineages evolving without chaperone overproduction. Additionally, our sequence data show that DnaK overexpression increases mutational robustness, the tolerance of its clients to nonsynonymous nucleotide substitutions. We also show that this elevated mutational buffering translates into differences in evolutionary rates on intermediate and long evolutionary time scales. Specifically, we studied the evolutionary rates of DnaK clients using the genomes of E. coli, Salmonella enterica, and 83 other gamma-proteobacteria. We find that clients that interact strongly with DnaK evolve faster than weakly interacting clients. Our results imply that all three major chaperone classes can buffer mutations and affect protein evolution. They illustrate how an individual protein like a chaperone can have a disproportionate effect on the evolution of a proteome.
View details for DOI 10.1093/gbe/evw176
View details for Web of Science ID 000386122800005
View details for PubMedID 27497316
View details for PubMedCentralID PMC5630943
How Archiving by Freezing Affects the Genome-Scale Diversity of Escherichia coli Populations
GENOME BIOLOGY AND EVOLUTION
2016; 8 (5): 1290-1298
A genotype network is a graph in which vertices represent genotypes that have the same phenotype. Edges connect vertices if their corresponding genotypes differ in a single small mutation. Genotype networks are used to study the organization of genotype spaces. They have shed light on the relationship between robustness and evolvability in biological systems as different as RNA macromolecules and transcriptional regulatory circuits. Despite the importance of genotype networks, no tool exists for their automatic construction, analysis and visualization. Here we fill this gap by presenting the Genonets Server, a tool that provides the following features: (i) the construction of genotype networks for categorical and univariate phenotypes from DNA, RNA, amino acid or binary sequences; (ii) analyses of genotype network topology and how it relates to robustness and evolvability, as well as analyses of genotype network topography and how it relates to the navigability of a genotype network via mutation and natural selection; (iii) multiple interactive visualizations that facilitate exploratory research and education. The Genonets Server is freely available at http://ieu-genonets.uzh.ch.
View details for DOI 10.1093/nar/gkw313
View details for Web of Science ID 000379786800013
View details for PubMedID 27106055
View details for PubMedCentralID PMC4987894
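The genotype network defined in the Genonets abstract above (vertices are genotypes with the same phenotype, edges join genotypes that differ by a single small mutation) can be illustrated with a minimal sketch. This is not the Genonets Server's implementation: the function names, the toy sequences, and the restriction of "single small mutation" to one point substitution between equal-length sequences are all assumptions made for illustration.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def genotype_network(genotypes):
    """Build an adjacency map for genotypes that share one phenotype.

    Assumption: an edge joins two genotypes only if they have the same
    length and differ by exactly one point substitution.
    """
    nodes = sorted(set(genotypes))
    adjacency = {g: set() for g in nodes}
    for a, b in combinations(nodes, 2):
        if len(a) == len(b) and hamming(a, b) == 1:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


# Hypothetical toy data: four DNA sequences assumed to share a phenotype.
bound_sites = ["ACGT", "ACGA", "ACCA", "TTTT"]
net = genotype_network(bound_sites)
```

In this toy set, "ACGA" links "ACGT" and "ACCA", while "TTTT" stays isolated because no other sequence is one substitution away. The pairwise comparison is quadratic in the number of genotypes, which is fine for a sketch but would need a neighbor-enumeration approach for large genotype sets.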
The SIB Swiss Institute of Bioinformatics' resources: focus on curated databases
NUCLEIC ACIDS RESEARCH
2016; 44 (D1): D27-D37
In the experimental evolution of microbes such as Escherichia coli, many replicate populations are evolved from a common ancestor. Freezing a population sample supplemented with the cryoprotectant glycerol permits later analysis or restarting of an evolution experiment. Typically, each evolving population, and thus each sample archived in this way, consists of many unique genotypes and phenotypes. The effect of archiving on such a heterogeneous population is unknown. Here, we identified optimal archiving conditions for E. coli. We also used genome sequencing of archived samples to study the effects that archiving has on genomic population diversity. We observed no allele substitutions and mostly small changes in allele frequency. Nevertheless, principal component analysis of genome-scale allelic diversity shows that archiving affects diversity across many loci. We showed that this change in diversity is due to selection rather than drift. In addition, ~1% of rare alleles that occurred at low frequencies were lost after treatment. Our observations imply that archived populations may be used to conduct fitness or other phenotypic assays of populations, in which the loss of a rare allele may have negligible effects. However, caution is appropriate when sequencing populations restarted from glycerol stocks, as well as when using glycerol stocks to restart or replay evolution. This is because the loss of rare alleles can alter the future evolutionary trajectory of a population if the lost alleles were strongly beneficial.
View details for DOI 10.1093/gbe/evw054
View details for Web of Science ID 000378633000001
View details for PubMedID 26988250
View details for PubMedCentralID PMC4898790
Fitness Trade-Offs Determine the Role of the Molecular Chaperonin GroEL in Buffering Mutations
MOLECULAR BIOLOGY AND EVOLUTION
2015; 32 (10): 2681-2693
The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) provides world-class bioinformatics databases, software tools, services and training to the international life science community in academia and industry. These solutions allow life scientists to turn the exponentially growing amount of data into knowledge. Here, we provide an overview of SIB's resources and competence areas, with a strong focus on curated databases and SIB's most popular and widely used resources. In particular, SIB's Bioinformatics resource portal ExPASy features over 150 resources, including UniProtKB/Swiss-Prot, ENZYME, PROSITE, neXtProt, STRING, UniCarbKB, SugarBindDB, SwissRegulon, EPD, arrayMap, Bgee, SWISS-MODEL Repository, OMA, OrthoDB and other databases, which are briefly described in this article.
View details for DOI 10.1093/nar/gkv1310
View details for Web of Science ID 000371261700004
View details for PubMedID 26615188
View details for PubMedCentralID PMC4702916
Modelling the heart as a communication system.
Journal of the Royal Society, Interface
2015; 12 (105)
Molecular chaperones fold many proteins and their mutated versions in a cell and can sometimes buffer the phenotypic effect of mutations that affect protein folding. Unanswered questions about this buffering include the nature of its mechanism, its influence on the genetic variation of a population, the fitness trade-offs constraining this mechanism, and its role in expediting evolution. Answering these questions is fundamental to understanding how buffering contributes to increased genetic variation and ecological diversification. Here, we performed experimental evolution, genome resequencing, and computational analyses to determine the trade-offs and evolutionary trajectories of Escherichia coli expressing high levels of the essential chaperonin GroEL. GroEL is abundantly present in bacteria, particularly in bacteria with large loads of deleterious mutations, suggesting its role in mutational buffering. We show that groEL overexpression is costly to large populations evolving in the laboratory, leading to groE expression decline within 66 generations. In contrast, populations evolving under the strong genetic drift characteristic of endosymbiotic bacteria avoid extinction or can be rescued in the presence of abundant GroEL. Genomes resequenced from cells evolved under strong genetic drift exhibited significantly higher tolerance to deleterious mutations at high GroEL levels than at native levels, revealing that GroEL is buffering mutations in these cells. GroEL buffered mutations in a highly diverse set of proteins that interact with the environment, including substrate and ion membrane transporters, hinting at its role in ecological diversification. Our results reveal the fitness trade-offs of mutational buffering and how genetic variation is maintained in populations.
View details for DOI 10.1093/molbev/msv144
View details for Web of Science ID 000361987100016
View details for PubMedID 26116858
View details for PubMedCentralID PMC4576708
The Gypsy Database (GyDB) of mobile genetic elements: release 2.0
NUCLEIC ACIDS RESEARCH
2011; 39: D70-D74
Electrical communication between cardiomyocytes can be perturbed during arrhythmia, but these perturbations are not captured by conventional electrocardiographic metrics. We developed a theoretical framework to quantify electrical communication using information theory metrics in two-dimensional cell lattice models of cardiac excitation propagation. The time series generated by each cell was coarse-grained to 1 when excited or 0 when resting. The Shannon entropy for each cell was calculated from the time series during four clinically important heart rhythms: normal heartbeat, anatomical reentry, spiral reentry and multiple reentry. We also used mutual information to perform spatial profiling of communication during these cardiac arrhythmias. We found that information sharing between cells was spatially heterogeneous. In addition, cardiac arrhythmia significantly impacted information sharing within the heart. Entropy localized the path of the drifting core of spiral reentry, which could be an optimal target of therapeutic ablation. We conclude that information theory metrics can quantitatively assess electrical communication among cardiomyocytes. The traditional concept of the heart as a functional syncytium sharing electrical information cannot predict altered entropy and information sharing during complex arrhythmia. Information theory metrics may find clinical application in the identification of rhythm-specific treatments which are currently unmet by traditional electrocardiographic techniques.
View details for DOI 10.1098/rsif.2014.1201
View details for PubMedID 25740854
View details for PubMedCentralID PMC4387519
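The information-theory metrics named in the cardiac communication abstract above (Shannon entropy of each cell's binarized time series, and mutual information between cells) can be sketched as follows. This is an illustration only: the abstract does not state how the probabilities were estimated, so the plug-in estimate from state frequencies, the function names, and the toy series are assumptions.

```python
import math
from collections import Counter


def shannon_entropy(series):
    """Plug-in Shannon entropy, in bits, of a discrete time series.

    Assumption: state probabilities are estimated as observed frequencies.
    """
    n = len(series)
    counts = Counter(series)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), in bits, for two aligned series."""
    joint = list(zip(x, y))
    return shannon_entropy(x) + shannon_entropy(y) - shannon_entropy(joint)


# Hypothetical coarse-grained series: 1 = excited, 0 = resting.
cell_a = [0, 0, 1, 1]
cell_b = [0, 1, 0, 1]

h_a = shannon_entropy(cell_a)
mi_self = mutual_information(cell_a, cell_a)
mi_ab = mutual_information(cell_a, cell_b)
```

A cell shares all of its entropy with itself, so `mi_self` equals `h_a`, whereas the two toy series are statistically independent and share no information. Real recordings are much longer than these four-sample series, and plug-in estimates are biased for short series, so a bias correction may be needed in practice.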
Genome Sequence of the Pea Aphid Acyrthosiphon pisum
2010; 8 (2)
This article introduces the second release of the Gypsy Database of Mobile Genetic Elements (GyDB 2.0): a research project devoted to the evolutionary dynamics of viruses and transposable elements based on their phylogenetic classification (per lineage and protein domain). The Gypsy Database (GyDB) is a long-term project that is continuously progressing and that, owing to the high molecular diversity of mobile elements, must be completed in several stages. GyDB 2.0 has been powered with a wiki to allow other researchers to participate in the project. The current database stage and scope are long terminal repeat (LTR) retroelements and relatives. GyDB 2.0 is an update based on the analysis of Ty3/Gypsy, Retroviridae, Ty1/Copia and Bel/Pao LTR retroelements and the Caulimoviridae pararetroviruses of plants. Among other features, in terms of the aforementioned topics, this update adds: (i) a variety of descriptions and reviews distributed in multiple web pages; (ii) protein-based phylogenies, where phylogenetic levels are assigned to distinct classified elements; (iii) a collection of multiple alignments, lineage-specific hidden Markov models and consensus sequences, called GyDB collection; (iv) updated RefSeq databases and BLAST and HMM servers to facilitate sequence characterization of new LTR retroelement and caulimovirus queries; and (v) a bibliographic server. GyDB 2.0 is available at http://gydb.org.
View details for DOI 10.1093/nar/gkq1061
View details for Web of Science ID 000285831700013
View details for PubMedID 21036865
View details for PubMedCentralID PMC3013669
Aphids are important agricultural pests and also biological models for studies of insect-plant interactions, symbiosis, virus vectoring, and the developmental causes of extreme phenotypic plasticity. Here we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple published genomes of holometabolous insects. Pea aphids are host-plant specialists, they can reproduce both sexually and asexually, and they have coevolved with an obligate bacterial symbiont. Here we highlight findings from whole genome analysis that may be related to these unusual biological features. These findings include discovery of extensive gene duplication in more than 2000 gene families as well as loss of evolutionarily conserved genes. Gene family expansions relative to other published genomes include genes involved in chromatin modification, miRNA synthesis, and sugar transport. Gene losses include genes central to the IMD immune pathway, selenoprotein utilization, purine salvage, and the entire urea cycle. The pea aphid genome reveals that only a limited number of genes have been acquired from bacteria; thus the reduced gene count of Buchnera does not reflect gene transfer to the host genome. The inventory of metabolic genes in the pea aphid genome suggests that there is extensive metabolite exchange between the aphid and Buchnera, including sharing of amino acid biosynthesis between the aphid and Buchnera. The pea aphid genome provides a foundation for post-genomic studies of fundamental biological questions and applied agricultural problems.
View details for DOI 10.1371/journal.pbio.1000313
View details for Web of Science ID 000275257300009
View details for PubMedID 20186266