Bio

Administrative Appointments


  • Advisory Board, WormBase, Caenorhabditis Genome Database (2002 - Present)
  • Advisory Board, FlyBase, Drosophila Genome Database (2007 - Present)

Professional Education


  • Ph.D., University of California, Molecular Biology (1985)
  • B.S., Purdue University, Biochemistry (1979)
  • B.S., Purdue University, Biological Sciences (1979)

Community and International Work


  • Stanford at The Tech, San Jose

    Topic: Public Understanding of Genetics

    Partnering Organization(s): The Tech Museum of San Jose

    Location: Bay Area

    Ongoing Project: Yes

    Opportunities for Student Involvement: Yes

Research & Scholarship

Current Research and Scholarly Interests


The Cherry lab identifies, validates and integrates scientific information into encyclopedic databases essential for investigation as well as scientific education. Published results of scientific experimentation are a foundation of our understanding of the natural world and provide motivation for new experiments. The combination of in-depth understanding reported in the literature with computational analyses is an essential ingredient of modern biological research. Mastery of the volumes of published literature requires comprehensive databases that provide the facts and underlying experimental data in publicly accessible ways. Curation, the extraction and sorting of factual experimental data from peer-reviewed journal articles, is necessary to acquire these data from their source. Large quantitative datasets from global studies extend our knowledge of genes, their products and their interactions. Integrating quantitative datasets with curated, focused experimental results creates unique comprehensive databases. My group creates such essential databases and makes them available to scientists and educators seeking to understand experimental results and to teach scientific knowledge.

The exploration of the genes and other important elements of a genome involves the use of previous results to aid the design of experiments that explore, for example, gene regulation, protein function, and the interaction of these processes. New technologies are being applied to the determination of many molecular interactions of the components of chromosomes and the specific controls for the generation of the many cell types that create an organism from a single set of chromosomes. These methods create very large datasets that cannot be appreciated without computational methods and access to databases of scientific results.

The Cherry lab specializes in designing and managing a public database of information for the budding yeast Saccharomyces cerevisiae and has recently begun applying its expertise to human genomic information. Our current projects address three areas of research: engineering for the design of databases and software for the effective integration of complex experimental results; defining standards for eukaryotic genomic data that measure reliability and quality; and developing vocabularies that enhance communication between researchers and between computational resources. This research involves the collection and standardization of experimental results, the integration of detailed descriptions of these data into complex biological models, the application of flexible search and retrieval tools, and the distribution of the integrated information for the acceleration of discovery.

Three major bioinformatics resources funded by the National Institutes of Health are provided by the lab. The Saccharomyces Genome Database project is the foremost database on a single organism. It is the archetype of all such databases because of its high quality, rich design, completeness, ease of use, and facilitation of scientific discovery. The Gene Ontology Consortium invented a structured vocabulary for the specification and description of the function of gene products, their involvement in biological processes and their location within subcellular complexes and components. This innovative knowledgebase has unified biological nomenclature and is crucial for the analysis of biological results. The ENCODE Data Coordination Center provides an essential component for the analysis and use of large-scale studies of the human genome. Our work specifies the accurate and complete submission of human genomic experimental results, verifies the data quality, specifies and compiles the experimental details of each dataset, integrates data with existing human genome databases, and distributes these results with their analyses via a portal that serves the diverse biomedical research community of skilled bioinformaticists, biologists, and educators.

Publications

Journal Articles


  • AGAPE (Automated Genome Analysis PipelinE) for Pan-Genome Analysis of Saccharomyces cerevisiae PLOS ONE Song, G., Dickins, B. J., Demeter, J., Engel, S., Dunn, B., Cherry, J. M. 2015; 10 (3)

    Abstract

    The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community.

    View details for DOI 10.1371/journal.pone.0120671

    View details for Web of Science ID 000351284600180

    View details for PubMedID 25781462

  • Gene Ontology Consortium: going forward NUCLEIC ACIDS RESEARCH Blake, J. A., Christie, K. R., Dolan, M. E., Drabkin, H. J., Hill, D. P., Ni, L., Sitnikov, D., Burgess, S., Buza, T., Gresham, C., McCarthy, F., Pillai, L., Wang, H., CARBON, S., Dietze, H., Lewis, S. E., Mungall, C. J., Munoz-Torres, M. C., Feuermann, M., Gaudet, P., Basu, S., Chisholm, R. L., Dodson, R. J., Fey, P., Mi, H., Thomas, P. D., Muruganujan, A., Poudel, S., Hu, J. C., Aleksander, S. A., McIntosh, B. K., Renfro, D. P., Siegele, D. A., Attrill, H., Brown, N. H., Tweedie, S., Lomax, J., Osumi-Sutherland, D., Parkinson, H., Roncaglia, P., Lovering, R. C., Talmud, P. J., Humphries, S. E., Denny, P., Campbell, N. H., Foulger, R. E., Chibucos, M. C., Giglio, M. G., Chang, H. Y., Finn, R., Fraser, M., Mitchell, A., Nuka, G., Pesseat, S., Sangrador, A., Scheremetjew, M., Young, S. Y., Stephan, R., Harris, M. A., Oliver, S. G., Rutherford, K., Wood, V., Bahler, J., Lock, A., Kersey, P. J., McDowall, M. D., Staines, D. M., Dwinell, M., Shimoyama, M., Laulederkind, S., Hayman, G. T., Wang, S. J., Petri, V., D'Eustachio, P., Matthews, L., Balakrishnan, R., Binkley, G., Cherry, J. M., Costanzo, M. C., Demeter, J., Dwight, S. S., Engel, S. R., Hitz, B. C., Inglis, D. O., Lloyd, P., Miyasato, S. R., Paskov, K., Roe, G., Simison, M., Nash, R. S., Skrzypek, M. S., Weng, S., Wong, E. D., Berardini, T. Z., Li, D., Huala, E., Argasinska, J., Arighi, C., Auchincloss, A., Axelsen, K., Argoud-Puy, G., Bateman, A., Bely, B., Blatter, M. C., Bonilla, C., Bougueleret, L., Boutet, E., Breuza, L., Bridge, A., Britto, R., Casals, C., Cibrian-Uhalte, E., Coudert, E., Cusin, I., Duek-Roggli, P., Estreicher, A., Famiglietti, L., Gane, P., Garmiri, P., Gos, A., Gruaz-Gumowski, N., Hatton-Ellis, E., Hinz, U., Hulo, C., Huntley, R., Jungo, F., Keller, G., Laiho, K., Lemercier, P., Lieberherr, D., MacDougall, A., Magrane, M., Martin, M., Masson, P., Mutowo, P., O'Donovan, C., Pedruzzi, I., Pichler, K., POGGIOLI, D., Poux, S., Rivoire, C., Roechert, B., Sawford, T., Schneider, M., Shypitsyna, A., Stutz, A., Sundaram, S., Tognolli, M., Wu, C., Xenarios, I., Chan, J., Kishore, R., Sternberg, P. W., Van Auken, K., Muller, H. M., Done, J., Li, Y., Howe, D., Westerfield, M. 2015; 43 (D1): D1049-D1056
  • RNAcentral: an international database of ncRNA sequences NUCLEIC ACIDS RESEARCH Petrov, A. I., Kay, S. J., Gibson, R., Kulesha, E., Staines, D., Bruford, E. A., Wright, M. W., Burge, S., Finn, R. D., Kersey, P. J., Cochrane, G., Bateman, A., Griffiths-Jones, S., Harrow, J., Chan, P. P., Lowe, T. M., Zwieb, C. W., Wower, J., Williams, K. P., Hudson, C. M., Gutell, R., Clark, M. B., Dinger, M., Quek, X. C., Bujnicki, J. M., Chua, N., Liu, J., Wang, H., Skogerbo, G., Zhao, Y., Chen, R., Zhu, W., Cole, J. R., Chai, B., Huang, H., Huang, H., Cherry, J. M., Hatzigeorgiou, A., Pruitt, K. D. 2015; 43 (D1): D123-D129

    View details for DOI 10.1093/nar/gku991

    View details for Web of Science ID 000350210400020

  • Ontology application and use at the ENCODE DCC. Database : the journal of biological databases and curation Malladi, V. S., Erickson, D. T., Podduturi, N. R., Rowe, L. D., Chan, E. T., Davidson, J. M., Hitz, B. C., Ho, M., Lee, B. T., Miyasato, S., Roe, G. R., Simison, M., Sloan, C. A., Strattan, J. S., Tanaka, F., Kent, W. J., Cherry, J. M., Hong, E. L. 2015; 2015

    Abstract

    The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a catalog of genomic annotations. To date, the project has generated over 4000 experiments across more than 350 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory network and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All ENCODE experimental data, metadata and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage and distribution to community resources and the scientific community. As the volume of data increases, the organization of experimental details becomes increasingly complicated and demands careful curation to identify related experiments. Here, we describe the ENCODE DCC's use of ontologies to standardize experimental metadata. We discuss how ontologies, when used to annotate metadata, provide improved searching capabilities and facilitate the ability to find connections within a set of experiments. Additionally, we provide examples of how ontologies are used to annotate ENCODE metadata and how the annotations can be identified via ontology-driven searches at the ENCODE portal. As genomic datasets grow larger and more interconnected, standardization of metadata becomes increasingly vital to allow for exploration and comparison of data between different scientific projects.

    View details for DOI 10.1093/database/bav010

    View details for PubMedID 25776021

  • The Reference Genome Sequence of Saccharomyces cerevisiae: Then and Now G3-GENES GENOMES GENETICS Engel, S. R., Dietrich, F. S., Fisk, D. G., Binkley, G., Balakrishnan, R., Costanzo, M. C., Dwight, S. S., Hitz, B. C., Karra, K., Nash, R. S., Weng, S., Wong, E. D., Lloyd, P., Skrzypek, M. S., Miyasato, S. R., Simison, M., Cherry, J. M. 2014; 4 (3): 389-398

    Abstract

    The genome of the budding yeast Saccharomyces cerevisiae was the first completely sequenced from a eukaryote. It was released in 1996 as the work of a worldwide effort of hundreds of researchers. In the time since, the yeast genome has been intensively studied by geneticists, molecular biologists, and computational scientists all over the world. Maintenance and annotation of the genome sequence have long been provided by the Saccharomyces Genome Database, one of the original model organism databases. To deepen our understanding of the eukaryotic genome, the S. cerevisiae strain S288C reference genome sequence was updated recently in its first major update since 1996. The new version, called "S288C 2010," was determined from a single yeast colony using modern sequencing technologies and serves as the anchor for further innovations in yeast genomic science.

    View details for DOI 10.1534/g3.113.008995

    View details for Web of Science ID 000335751700002

    View details for PubMedID 24374639

  • Saccharomyces genome database provides new regulation data. Nucleic acids research Costanzo, M. C., Engel, S. R., Wong, E. D., Lloyd, P., Karra, K., Chan, E. T., Weng, S., Paskov, K. M., Roe, G. R., Binkley, G., Hitz, B. C., Cherry, J. M. 2014; 42 (Database issue): D717-25

    Abstract

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the community resource for genomic, gene and protein information about the budding yeast Saccharomyces cerevisiae, containing a variety of functional information about each yeast gene and gene product. We have recently added regulatory information to SGD and present it on a new tabbed section of the Locus Summary entitled 'Regulation'. We are compiling transcriptional regulator-target gene relationships, which are curated from the literature at SGD or imported, with permission, from the YEASTRACT database. For nearly every S. cerevisiae gene, the Regulation page displays a table of annotations showing the regulators of that gene, and a graphical visualization of its regulatory network. For genes whose products act as transcription factors, the Regulation page also shows a table of their target genes, accompanied by a Gene Ontology enrichment analysis of the biological processes in which those genes participate. We additionally synthesize information from the literature for each transcription factor in a free-text Regulation Summary, and provide other information relevant to its regulatory function, such as DNA binding site motifs and protein domains. All of the regulation data are available for querying, analysis and download via YeastMine, the InterMine-based data warehouse system in use at SGD.

    View details for DOI 10.1093/nar/gkt1158

    View details for PubMedID 24265222

  • A guide to best practices for Gene Ontology (GO) manual annotation DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION Balakrishnan, R., Harris, M. A., Huntley, R., Van Auken, K., Cherry, J. M. 2013

    Abstract

    The Gene Ontology Consortium (GOC) is a community-based bioinformatics project that classifies gene product function through the use of structured controlled vocabularies. A fundamental application of the Gene Ontology (GO) is in the creation of gene product annotations, evidence-based associations between GO definitions and experimental or sequence-based analysis. Currently, the GOC disseminates 126 million annotations covering >374 000 species including all the kingdoms of life. This number includes two classes of GO annotations: those created manually by experienced biocurators reviewing the literature or by examination of biological data (1.1 million annotations covering 2226 species) and those generated computationally via automated methods. As manual annotations are often used to propagate functional predictions between related proteins within and between genomes, it is critical to provide accurate consistent manual annotations. Toward this goal, we present here the conventions defined by the GOC for the creation of manual annotation. This guide represents the best practices for manual annotation as established by the GOC project over the past 12 years. We hope this guide will encourage research communities to annotate gene products of their interest to enhance the corpus of GO annotations available to all. Database URL: http://www.geneontology.org.

    View details for DOI 10.1093/database/bat054

    View details for Web of Science ID 000322067500001

    View details for PubMedID 23842463

  • InterMOD: integrated data and tools for the unification of model organism research SCIENTIFIC REPORTS Sullivan, J., Karra, K., Moxon, S. A., Vallejos, A., Motenko, H., Wong, J. D., Aleksic, J., Balakrishnan, R., Binkley, G., Harris, T., Hitz, B., Jayaraman, P., Lyne, R., Neuhauser, S., Pich, C., Smith, R. N., Quang Trinh, Q., Cherry, J. M., Richardson, J., Stein, L., Twigger, S., Westerfield, M., Worthey, E., Micklem, G. 2013; 3

    Abstract

    Model organisms are widely used for understanding basic biology, and have significantly contributed to the study of human disease. In recent years, genomic analysis has provided extensive evidence of widespread conservation of gene sequence and function amongst eukaryotes, allowing insights from model organisms to help decipher gene function in a wider range of species. The InterMOD consortium is developing an infrastructure based around the InterMine data warehouse system to integrate genomic and functional data from a number of key model organisms, leading the way to improved cross-species research. So far including budding yeast, nematode worm, fruit fly, zebrafish, rat and mouse, the project has set up data warehouses, synchronized data models, and created analysis tools and links between data from different species. The project unites a number of major model organism databases, improving both the consistency and accessibility of comparative research, to the benefit of the wider scientific community.

    View details for DOI 10.1038/srep01802

    View details for Web of Science ID 000318538700002

    View details for PubMedID 23652793

  • The new modern era of yeast genomics: community sequencing and the resulting annotation of multiple Saccharomyces cerevisiae strains at the Saccharomyces Genome Database DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION Engel, S. R., Cherry, J. M. 2013

    Abstract

    The first completed eukaryotic genome sequence was that of the yeast Saccharomyces cerevisiae, and the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the original model organism database. SGD remains the authoritative community resource for the S. cerevisiae reference genome sequence and its annotation, and continues to provide comprehensive biological information correlated with S. cerevisiae genes and their products. A diverse set of yeast strains have been sequenced to explore commercial and laboratory applications, and a brief history of those strains is provided. The publication of these new genomes has motivated the creation of new tools, and SGD will annotate and provide comparative analyses of these sequences, correlating changes with variations in strain phenotypes and protein function. We are entering a new era at SGD, as we incorporate these new sequences and make them accessible to the scientific community, all in an effort to continue in our mission of educating researchers and facilitating discovery.

    View details for DOI 10.1093/database/bat012

    View details for Web of Science ID 000316172400001

    View details for PubMedID 23487186

  • The YeastGenome app: the Saccharomyces Genome Database at your fingertips DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION Wong, E. D., Karra, K., Hitz, B. C., Hong, E. L., Cherry, J. M. 2013

    Abstract

    The Saccharomyces Genome Database (SGD) is a scientific database that provides researchers with high-quality curated data about the genes and gene products of Saccharomyces cerevisiae. To provide instant and easy access to this information on mobile devices, we have developed YeastGenome, a native application for the Apple iPhone and iPad. YeastGenome can be used to quickly find basic information about S. cerevisiae genes and chromosomal features regardless of internet connectivity. With or without network access, you can view basic information and Gene Ontology annotations about a gene of interest by searching gene names and gene descriptions or by browsing the database within the app to find the gene of interest. With internet access, the app provides more detailed information about the gene, including mutant phenotypes, references and protein and genetic interactions, as well as provides hyperlinks to retrieve detailed information by showing SGD pages and views of the genome browser. SGD provides online help describing basic ways to navigate the mobile version of SGD, highlights key features and answers frequently asked questions related to the app. The app is available from iTunes (http://itunes.com/apps/yeastgenome). The YeastGenome app is provided freely as a service to our community, as part of SGD's mission to provide free and open access to all its data and annotations.

    View details for DOI 10.1093/database/bat004

    View details for Web of Science ID 000316179800001

    View details for PubMedID 23396302

  • A gene ontology inferred from molecular networks NATURE BIOTECHNOLOGY Dutkowski, J., Kramer, M., Surma, M. A., Balakrishnan, R., Cherry, J. M., Krogan, N. J., Ideker, T. 2013; 31 (1): 38-?

    Abstract

    Ontologies have proven very useful for capturing knowledge as a hierarchy of terms and their interrelationships. In biology a major challenge has been to construct ontologies of gene function given incomplete biological knowledge and inconsistencies in how this knowledge is manually curated. Here we show that large networks of gene and protein interactions in Saccharomyces cerevisiae can be used to infer an ontology whose coverage and power are equivalent to those of the manually curated Gene Ontology (GO). The network-extracted ontology (NeXO) contains 4,123 biological terms and 5,766 term-term relations, capturing 58% of known cellular components. We also explore robust NeXO terms and term relations that were initially not cataloged in GO, a number of which have now been added based on our analysis. Using quantitative genetic interaction profiling and chemogenomics, we find further support for many of the uncharacterized terms identified by NeXO, including multisubunit structures related to protein trafficking or mitochondrial function. This work enables a shift from using ontologies to evaluate data to using data to construct and evaluate ontologies.

    View details for DOI 10.1038/nbt.2463

    View details for Web of Science ID 000313563600020

    View details for PubMedID 23242164

  • Gene Ontology Annotations and Resources NUCLEIC ACIDS RESEARCH Blake, J. A., Dolan, M., Drabkin, H., Hill, D. P., Ni, L., Sitnikov, D., Bridges, S., Burgess, S., Buza, T., McCarthy, F., Peddinti, D., Pillai, L., CARBON, S., Dietze, H., Ireland, A., Lewis, S. E., Mungall, C. J., Gaudet, P., Chisholm, R. L., Fey, P., Kibbe, W. A., Basu, S., Siegele, D. A., McIntosh, B. K., Renfro, D. P., Zweifel, A. E., Hu, J. C., Brown, N. H., Tweedie, S., Alam-Faruque, Y., Apweiler, R., Auchinchloss, A., Axelsen, K., Bely, B., Blatter, M., Bonilla, C., Bougueleret, L., Boutet, E., Breuza, L., Bridge, A., Chan, W. M., Chavali, G., Coudert, E., Dimmer, E., Estreicher, A., Famiglietti, L., Feuermann, M., Gos, A., Gruaz-Gumowski, N., Hieta, R., Hinz, U., Hulo, C., Huntley, R., James, J., Jungo, F., Keller, G., Laiho, K., Legge, D., Lemercier, P., Lieberherr, D., Magrane, M., Martin, M. J., Masson, P., Mutowo-Muellenet, P., O'Donovan, C., Pedruzzi, I., Pichler, K., POGGIOLI, D., Millan, P. P., Poux, S., Rivoire, C., Roechert, B., Sawford, T., Schneider, M., Stutz, A., Sundaram, S., Tognolli, M., Xenarios, I., Foulger, R., Lomax, J., Roncaglia, P., Khodiyar, V. K., Lovering, R. C., Talmud, P. J., Chibucos, M., Giglio, M. G., Chang, H., Hunter, S., McAnulla, C., Mitchell, A., Sangrador, A., Stephan, R., Harris, M. A., Oliver, S. G., Rutherford, K., Wood, V., Bahler, J., Lock, A., Kersey, P. J., McDowall, M. D., Staines, D. M., Dwinell, M., Shimoyama, M., Laulederkind, S., Hayman, T., Wang, S., Petri, V., Lowry, T., D'Eustachio, P., Matthews, L., Balakrishnan, R., Binkley, G., Cherry, J. M., Costanzo, M. C., Dwight, S. S., Engel, S. R., Fisk, D. G., Hitz, B. C., Hong, E. L., Karra, K., Miyasato, S. R., Nash, R. S., Park, J., Skrzypek, M. S., Weng, S., Wong, E. D., Berardini, T. Z., Li, D., Huala, E., Mi, H., Thomas, P. D., Chan, J., Kishore, R., Sternberg, P., Van Auken, K., Howe, D., Westerfield, M. 2013; 41 (D1): D530-D535
  • Annotation of functional variation in personal genomes using RegulomeDB GENOME RESEARCH Boyle, A. P., Hong, E. L., Hariharan, M., Cheng, Y., Schaub, M. A., Kasowski, M., Karczewski, K. J., Park, J., Hitz, B. C., Weng, S., Cherry, J. M., Snyder, M. 2012; 22 (9): 1790-1797

    Abstract

    As the sequencing of healthy and disease genomes becomes more commonplace, detailed annotation provides interpretation for individual variation responsible for normal and disease phenotypes. Current approaches focus on direct changes in protein coding genes, particularly nonsynonymous mutations that directly affect the gene product. However, most individual variation occurs outside of genes and, indeed, most markers generated from genome-wide association studies (GWAS) identify variants outside of coding segments. Identification of potential regulatory changes that perturb these sites will lead to a better localization of truly functional variants and interpretation of their effects. We have developed a novel approach and database, RegulomeDB, which guides interpretation of regulatory variants in the human genome. RegulomeDB includes high-throughput, experimental data sets from ENCODE and other sources, as well as computational predictions and manual annotations to identify putative regulatory potential and identify functional variants. These data sources are combined into a powerful tool that scores variants to help separate functional variants from a large pool and provides a small set of putative sites with testable hypotheses as to their function. We demonstrate the applicability of this tool to the annotation of noncoding variants from 69 full sequenced genomes as well as that of a personal genome, where thousands of functionally associated variants were identified. Moreover, we demonstrate a GWAS where the database is able to quickly identify the known associated functional variant and provide a hypothesis as to its function. Overall, we expect this approach and resource to be valuable for the annotation of human genome sequences.

    View details for DOI 10.1101/gr.137323.112

    View details for Web of Science ID 000308272800019

    View details for PubMedID 22955989

  • In the beginning there was babble ... AUTOPHAGY Klionsky, D. J., Bruford, E. A., Cherry, J. M., Hodgkin, J., Laulederkind, S. J., Singer, A. G. 2012; 8 (8): 1165-1167

    View details for DOI 10.4161/auto.20665

    View details for Web of Science ID 000308505200001

    View details for PubMedID 22836666

  • YeastMine-an integrated data warehouse for Saccharomyces cerevisiae data as a multipurpose tool-kit DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION Balakrishnan, R., Park, J., Karra, K., Hitz, B. C., Binkley, G., Hong, E. L., Sullivan, J., Micklem, G., Cherry, J. M. 2012

    Abstract

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) provides high-quality curated genomic, genetic, and molecular information on the genes and their products of the budding yeast Saccharomyces cerevisiae. To accommodate the increasingly complex, diverse needs of researchers for searching and comparing data, SGD has implemented InterMine (http://www.InterMine.org), an open source data warehouse system with a sophisticated querying interface, to create YeastMine (http://yeastmine.yeastgenome.org). YeastMine is a multifaceted search and retrieval environment that provides access to diverse data types. Searches can be initiated with a list of genes, a list of Gene Ontology terms, or lists of many other data types. The results from queries can be combined for further analysis and saved or downloaded in customizable file formats. Queries themselves can be customized by modifying predefined templates or by creating a new template to access a combination of specific data types. YeastMine offers multiple scenarios in which it can be used such as a powerful search interface, a discovery tool, a curation aid and also a complex database presentation format. DATABASE URL: http://yeastmine.yeastgenome.org.

    View details for DOI 10.1093/database/bar062

    View details for Web of Science ID 000304923700001

    View details for PubMedID 22434830

  • Considerations for creating and annotating the budding yeast Genome Map at SGD: a progress report DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION Chan, E. T., Cherry, J. M. 2012

    Abstract

    The Saccharomyces Genome Database (SGD) is compiling and annotating a comprehensive catalogue of functional sequence elements identified in the budding yeast genome. Recent advances in deep sequencing technologies have enabled for example, global analyses of transcription profiling and assembly of maps of transcription factor occupancy and higher order chromatin organization, at nucleotide level resolution. With this growing influx of published genome-scale data, come new challenges for their storage, display, analysis and integration. Here, we describe SGD's progress in the creation of a consolidated resource for genome sequence elements in the budding yeast, the considerations taken in its design and the lessons learned thus far. The data within this collection can be accessed at http://browse.yeastgenome.org and downloaded from http://downloads.yeastgenome.org. DATABASE URL: http://www.yeastgenome.org.

    View details for DOI 10.1093/database/bar057

    View details for Web of Science ID 000304922200001

    View details for PubMedID 22434826

  • CvManGO, a method for leveraging computational predictions to improve literature-based Gene Ontology annotations DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION Park, J., Costanzo, M. C., Balakrishnan, R., Cherry, J. M., Hong, E. L. 2012

    Abstract

    The set of annotations at the Saccharomyces Genome Database (SGD) that classifies the cellular function of S. cerevisiae gene products using Gene Ontology (GO) terms has become an important resource for facilitating experimental analysis. In addition to capturing and summarizing experimental results, the structured nature of GO annotations allows for functional comparison across organisms as well as propagation of functional predictions between related gene products. Due to their relevance to many areas of research, ensuring the accuracy and quality of these annotations is a priority at SGD. GO annotations are assigned either manually, by biocurators extracting experimental evidence from the scientific literature, or through automated methods that leverage computational algorithms to predict functional information. Here, we discuss the relationship between literature-based and computationally predicted GO annotations in SGD and extend a strategy whereby comparison of these two types of annotation identifies genes whose annotations need review. Our method, CvManGO (Computational versus Manual GO annotations), pairs literature-based GO annotations with computational GO predictions and evaluates the relationship of the two terms within GO, looking for instances of discrepancy. We found that this method will identify genes that require annotation updates, taking an important step towards finding ways to prioritize literature review. Additionally, we explored factors that may influence the effectiveness of CvManGO in identifying relevant gene targets to find in particular those genes that are missing literature-supported annotations, but our survey found that there are no immediately identifiable criteria by which one could enrich for these under-annotated genes. Finally, we discuss possible ways to improve this strategy, and the applicability of this method to other projects that use the GO for curation. DATABASE URL: http://www.yeastgenome.org.

    View details for DOI 10.1093/database/bas001

    View details for Web of Science ID 000304919800001

    View details for PubMedID 22434836
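
    The comparison described in the abstract above, pairing a literature-based GO term with a computational prediction and checking how the two terms relate within the ontology, can be sketched roughly as follows. This is only an illustration under stated assumptions: the toy ontology, the term identifiers, the relation names and the functions `ancestors`, `relation` and `flag_for_review` are all hypothetical and are not taken from the CvManGO implementation.

```python
# Toy ontology (hypothetical IDs): child term -> set of direct parent terms.
PARENTS = {
    "GO:kinase_activity": {"GO:catalytic_activity"},
    "GO:catalytic_activity": {"GO:molecular_function"},
    "GO:dna_binding": {"GO:binding"},
    "GO:binding": {"GO:molecular_function"},
    "GO:molecular_function": set(),
}


def ancestors(term):
    """Return every ancestor of `term`, excluding the term itself."""
    seen = set()
    stack = list(PARENTS.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, ()))
    return seen


def relation(manual, predicted):
    """Classify how a predicted term relates to a manual term (illustrative labels)."""
    if manual == predicted:
        return "same"
    if manual in ancestors(predicted):
        return "prediction_more_specific"
    if predicted in ancestors(manual):
        return "manual_more_specific"
    return "different_lineage"


def flag_for_review(manual_terms, predicted_terms):
    """Return (manual, predicted, relation) pairs that hint at a discrepancy."""
    flagged = []
    for manual in manual_terms:
        for predicted in predicted_terms:
            rel = relation(manual, predicted)
            if rel in ("prediction_more_specific", "different_lineage"):
                flagged.append((manual, predicted, rel))
    return flagged
```

    In this sketch a gene would be queued for literature review when a prediction is more specific than the manual term or sits on a different branch; which discrepancies are worth flagging in practice is a curation decision that the sketch does not settle.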

  • The Gene Ontology: enhancements for 2011 NUCLEIC ACIDS RESEARCH Blake, J. A., Dolan, M., Drabkin, H., Hill, D. P., Ni, L., Sitnikov, D., Burgess, S., Buza, T., Gresham, C., McCarthy, F., Pillai, L., Wang, H., Carbon, S., Lewis, S. E., Mungall, C. J., Gaudet, P., Chisholm, R. L., Fey, P., Kibbe, W. A., Basu, S., Siegele, D. A., McIntosh, B. K., Renfro, D. P., Zweifel, A. E., Hu, J. C., Brown, N. H., Tweedie, S., Alam-Faruque, Y., Apweiler, R., Auchinchloss, A., Axelsen, K., Argoud-Puy, G., Bely, B., Blatter, M., Bougueleret, L., Boutet, E., Branconi-Quintaje, S., Breuza, L., Bridge, A., Browne, P., Chan, W. M., Coudert, E., Cusin, I., Dimmer, E., Duek-Roggli, P., Eberhardt, R., Estreicher, A., Famiglietti, L., Ferro-Rojas, S., Feuermann, M., Gardner, M., Gos, A., Gruaz-Gumowski, N., Hinz, U., Hulo, C., Huntley, R., James, J., Jimenez, S., Jungo, F., Keller, G., Laiho, K., Legge, D., Lemercier, P., Lieberherr, D., Magrane, M., Martin, M. J., Masson, P., Moinat, M., O'Donovan, C., Pedruzzi, I., Pichler, K., Poggioli, D., Millan, P. P., Poux, S., Rivoire, C., Roechert, B., Sawford, T., Schneider, M., Sehra, H., Stanley, E., Stutz, A., Sundaram, S., Tognolli, M., Xenarios, I., Foulger, R., Lomax, J., Roncaglia, P., Camon, E., Khodiyar, V. K., Lovering, R. C., Talmud, P. J., Chibucos, M., Giglio, M. G., Dolinski, K., Heinicke, S., Livstone, M. S., Stephan, R., Harris, M. A., Oliver, S. G., Rutherford, K., Wood, V., Bahler, J., Lock, A., Kersey, P. J., McDowall, M. D., Staines, D. M., Dwinell, M., Shimoyama, M., Laulederkind, S., Hayman, T., Wang, S., Petri, V., Lowry, T., D'Eustachio, P., Matthews, L., Amundsen, C. D., Balakrishnan, R., Binkley, G., Cherry, J. M., Christie, K. R., Costanzo, M. C., Dwight, S. S., Engel, S. R., Fisk, D. G., Hirschman, J. E., Hitz, B. C., Hong, E. L., Karra, K., Krieger, C. J., Miyasato, S. R., Nash, R. S., Park, J., Skrzypek, M. S., Weng, S., Wong, E. D., Berardini, T. Z., Li, D., Huala, E., Slonim, D., Wick, H., Thomas, P., Chan, J., Kishore, R., Sternberg, P., Van Auken, K., Howe, D., Westerfield, M. 2012; 40 (D1): D559-D564
  • Saccharomyces Genome Database: the genomics resource of budding yeast. Nucleic acids research Cherry, J. M., Hong, E. L., Amundsen, C., Balakrishnan, R., Binkley, G., Chan, E. T., Christie, K. R., Costanzo, M. C., Dwight, S. S., Engel, S. R., Fisk, D. G., Hirschman, J. E., Hitz, B. C., Karra, K., Krieger, C. J., Miyasato, S. R., Nash, R. S., Park, J., Skrzypek, M. S., Simison, M., Weng, S., Wong, E. D. 2012; 40 (Database issue): D700-5

    Abstract

    The Saccharomyces Genome Database (SGD, http://www.yeastgenome.org) is the community resource for the budding yeast Saccharomyces cerevisiae. The SGD project provides the highest-quality manually curated information from peer-reviewed literature. The experimental results reported in the literature are extracted and integrated within a well-developed database. These data are combined with quality high-throughput results and provided through Locus Summary pages, a powerful query engine and rich genome browser. The acquisition, integration and retrieval of these data allow SGD to facilitate experimental design and analysis by providing an encyclopedia of the yeast genome, its chromosomal features, their functions and interactions. Public access to these data is provided to researchers and educators via web pages designed for optimal ease of use.

    View details for DOI 10.1093/nar/gkr1029

    View details for PubMedID 22110037

  • Toward an interactive article: integrating journals and biological databases BMC BIOINFORMATICS Rangarajan, A., Schedl, T., Yook, K., Chan, J., Haenel, S., Otis, L., Faelten, S., DePellegrin-Connelly, T., Isaacson, R., Skrzypek, M. S., Marygold, S. J., Stefancsik, R., Cherry, J. M., Sternberg, P. W., Mueller, H. 2011; 12

    Abstract

    Journal articles and databases are two major modes of communication in the biological sciences, and thus integrating these critical resources is of urgent importance to increase the pace of discovery. Projects focused on bridging the gap between journals and databases have been on the rise over the last five years and have resulted in the development of automated tools that can recognize entities within a document and link those entities to a relevant database. Unfortunately, automated tools cannot resolve ambiguities that arise from one term being used to signify entities that are quite distinct from one another. Instead, resolving these ambiguities requires some manual oversight. Finding the right balance between the speed and portability of automation and the accuracy and flexibility of manual effort is a crucial goal to making text markup a successful venture. We have established a journal article mark-up pipeline that links GENETICS journal articles and the model organism database (MOD) WormBase. This pipeline uses a lexicon built with entities from the database as a first step. The entity markup pipeline results in links from over nine classes of objects including genes, proteins, alleles, phenotypes and anatomical terms. New entities and ambiguities are discovered and resolved by a database curator through a manual quality control (QC) step, along with help from authors via a web form that is provided to them by the journal. New entities discovered through this pipeline are immediately sent to an appropriate curator at the database. Ambiguous entities that do not automatically resolve to one link are resolved by hand ensuring an accurate link. This pipeline has been extended to other databases, namely Saccharomyces Genome Database (SGD) and FlyBase, and has been implemented in marking up a paper with links to multiple databases. Our semi-automated pipeline hyperlinks articles published in GENETICS to model organism databases such as WormBase. Our pipeline results in interactive articles that are data rich with high accuracy. The use of a manual quality control step sets this pipeline apart from other hyperlinking tools and results in benefits to authors, journals, readers and databases.

    View details for DOI 10.1186/1471-2105-12-175

    View details for Web of Science ID 000293000700001

    View details for PubMedID 21595960
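
    The first, lexicon-driven step of the pipeline described above, together with the hand-off of ambiguous terms to manual QC, might look something like the sketch below. Everything in it is an assumption made for illustration: the lexicon entries and identifiers are invented placeholders rather than real database records, and the tokenizer and the `mark_up` function are hypothetical, not the pipeline's actual code.

```python
import re

# Hypothetical lexicon: surface term -> candidate (database, identifier) links.
# The entries are placeholders for illustration, not real database records.
LEXICON = {
    "unc-22": [("WormBase", "gene:unc-22")],
    "daf-2": [("WormBase", "gene:daf-2")],
    "cdc2": [("SGD", "gene:example-1"), ("FlyBase", "gene:example-2")],  # ambiguous on purpose
}

# Words made of letters and digits, optionally joined by hyphens (e.g. "unc-22").
TOKEN = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def mark_up(text):
    """Split lexicon hits into automatic links and terms needing manual review."""
    links, needs_review = [], []
    for match in TOKEN.finditer(text):
        term = match.group(0)
        candidates = LEXICON.get(term)
        if not candidates:
            continue  # not a known entity; a real pipeline might queue it as new
        if len(candidates) == 1:
            links.append((term, match.start(), candidates[0]))
        else:
            needs_review.append((term, match.start(), candidates))
    return links, needs_review
```

    A term with exactly one candidate is linked automatically, while a term with several candidates is held back for a curator, which mirrors the division of labor between automation and manual effort that the abstract describes.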

  • Using computational predictions to improve literature-based Gene Ontology annotations: a feasibility study DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION Costanzo, M. C., Park, J., Balakrishnan, R., Cherry, J. M., Hong, E. L. 2011

    Abstract

    Annotation using Gene Ontology (GO) terms is one of the most important ways in which biological information about specific gene products can be expressed in a searchable, computable form that may be compared across genomes and organisms. Because literature-based GO annotations are often used to propagate functional predictions between related proteins, their accuracy is critically important. We present a strategy that employs a comparison of literature-based annotations with computational predictions to identify and prioritize genes whose annotations need review. Using this method, we show that comparison of manually assigned 'unknown' annotations in the Saccharomyces Genome Database (SGD) with InterPro-based predictions can identify annotations that need to be updated. A survey of literature-based annotations and computational predictions made by the Gene Ontology Annotation (GOA) project at the European Bioinformatics Institute (EBI) across several other databases shows that this comparison strategy could be used to maintain and improve the quality of GO annotations for other organisms besides yeast. The survey also shows that although GOA-assigned predictions are the most comprehensive source of functional information for many genomes, a large proportion of genes in a variety of different organisms entirely lack these predictions but do have manual annotations. This underscores the critical need for manually performed, literature-based curation to provide functional information about genes that are outside the scope of widely used computational methods. Thus, the combination of manual and computational methods is essential to provide the most accurate and complete functional annotation of a genome. Database URL: http://www.yeastgenome.org.

    View details for DOI 10.1093/database/bar004

    View details for Web of Science ID 000299630600010

    View details for PubMedID 21411447

  • Towards BioDBcore: a community-defined information specification for biological databases NUCLEIC ACIDS RESEARCH Gaudet, P., Bairoch, A., Field, D., Sansone, S., Taylor, C., Attwood, T. K., Bateman, A., Blake, J. A., Bult, C. J., Cherry, J. M., Chisholm, R. L., Cochrane, G., Cook, C. E., Eppig, J. T., Galperin, M. Y., Gentleman, R., Goble, C. A., Gojobori, T., Hancock, J. M., Howe, D. G., Imanishi, T., Kelso, J., Landsman, D., Lewis, S. E., Karsch-Mizrachi, I., Orchard, S., Ouellette, B. F., Ranganathan, S., Richardson, L., Rocca-Serra, P., Schofield, P. N., Smedley, D., Southan, C., Tan, T. W., Tatusova, T., Whetzel, P. L., White, O., Yamasaki, C. 2011; 39: D7-D10

    Abstract

    The present article proposes the adoption of a community-defined, uniform, generic description of the core attributes of biological databases, BioDBCore. The goals of these attributes are to provide a general overview of the database landscape, to encourage consistency and interoperability between resources and to promote the use of semantic and syntactic standards. BioDBCore will make it easier for users to evaluate the scope and relevance of available resources. This new resource will increase the collective impact of the information present in biological databases.

    View details for DOI 10.1093/nar/gkq1173

    View details for Web of Science ID 000285831700002

    View details for PubMedID 21097465

  • Saccharomyces Genome Database provides mutant phenotype data NUCLEIC ACIDS RESEARCH Engel, S. R., Balakrishnan, R., Binkley, G., Christie, K. R., Costanzo, M. C., Dwight, S. S., Fisk, D. G., Hirschman, J. E., Hitz, B. C., Hong, E. L., Krieger, C. J., Livstone, M. S., Miyasato, S. R., Nash, R., Oughtred, R., Park, J., Skrzypek, M. S., Weng, S., Wong, E. D., Dolinski, K., Botstein, D., Cherry, J. M. 2010; 38: D433-D436

    Abstract

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is a scientific database for the molecular biology and genetics of the yeast Saccharomyces cerevisiae, which is commonly known as baker's or budding yeast. The information in SGD includes functional annotations, mapping and sequence information, protein domains and structure, expression data, mutant phenotypes, physical and genetic interactions and the primary literature from which these data are derived. Here we describe how published phenotypes and genetic interaction data are annotated and displayed in SGD.

    View details for DOI 10.1093/nar/gkp917

    View details for Web of Science ID 000276399100068

    View details for PubMedID 19906697

  • The Gene Ontology in 2010: extensions and refinements The Gene Ontology Consortium NUCLEIC ACIDS RESEARCH Berardini, T. Z., Li, D., Huala, E., Bridges, S., Burgess, S., McCarthy, F., Carbon, S., Lewis, S. E., Mungall, C. J., Abdulla, A., Wood, V., Feltrin, E., Valle, G., Chisholm, R. L., Fey, P., Gaudet, P., Kibbe, W., Basu, S., Bushmanova, Y., Eilbeck, K., Siegele, D. A., McIntosh, B., Renfro, D., Zweifel, A., Hu, J. C., Ashburner, M., Tweedie, S., Alam-Faruque, Y., Apweiler, R., Auchinchloss, A., Bairoch, A., Barrell, D., Binns, D., Blatter, M., Bougueleret, L., Boutet, E., Breuza, L., Bridge, A., Browne, P., Chan, W. M., Coudert, E., Daugherty, L., Dimmer, E., Eberhardt, R., Estreicher, A., Famiglietti, L., Ferro-Rojas, S., Feuermann, M., Foulger, R., Gruaz-Gumowski, N., Hinz, U., Huntley, R., Jimenez, S., Jungo, F., Keller, G., Laiho, K., Legge, D., Lemercier, P., Lieberherr, D., Magrane, M., O'Donovan, C., Pedruzzi, I., Poux, S., Rivoire, C., Roechert, B., Sawford, T., Schneider, M., Stanley, E., Stutz, A., Sundaram, S., Tognolli, M., Xenarios, I., Harris, M. A., Deegan (nee Clark), J. I., Ireland, A., Lomax, J., Jaiswal, P., Chibucos, M., Giglio, M. G., Wortman, J., Hannick, L., Madupu, R., Botstein, D., Dolinski, K., Livstone, M. S., Oughtred, R., Blake, J. A., Bult, C., Diehl, A. D., Dolan, M., Drabkin, H., Eppig, J. T., Hill, D. P., Ni, L., Ringwald, M., Sitnikov, D., Collmer, C., Torto-Alalibo, T., Laulederkind, S., Shimoyama, M., Twigger, S., D'Eustachio, P., Matthews, L., Balakrishnan, R., Binkley, G., Cherry, J. M., Christie, K. R., Costanzo, M. C., Engel, S. R., Fisk, D. G., Hirschman, J. E., Hitz, B. C., Hong, E. L., Krieger, C. J., Miyasato, S. R., Nash, R. S., Park, J., Skrzypek, M. S., Weng, S., Wong, E. D., Aslett, M., Chan, J., Kishore, R., Sternberg, P., Van Auken, K., Khodiyar, V. K., Lovering, R. C., Talmud, P. J., Howe, D., Westerfield, M. 2010; 38: D331-D335
  • The Gene Ontology's Reference Genome Project: A Unified Framework for Functional Annotation across Species PLOS COMPUTATIONAL BIOLOGY Gaudet, P., Chisholm, R., Berardini, T., Dimmer, E., Engel, S. R., Fey, P., Hill, D. P., Howe, D., Hu, J. C., Huntley, R., Khodiyar, V. K., Kishore, R., Li, D., Lovering, R. C., McCarthy, F., Ni, L., Petri, V., Siegele, D. A., Tweedie, S., Van Auken, K., Wood, V., Basu, S., Carbon, S., Dolan, M., Mungall, C. J., Dolinski, K., Thomas, P., Ashburner, M., Blake, J. A., Cherry, J. M., Lewis, S. E. 2009; 5 (7)

    Abstract

    The Gene Ontology (GO) is a collaborative effort that provides structured vocabularies for annotating the molecular function, biological role, and cellular location of gene products in a highly systematic way and in a species-neutral manner with the aim of unifying the representation of gene function across different organisms. Each contributing member of the GO Consortium independently associates GO terms to gene products from the organism(s) they are annotating. Here we introduce the Reference Genome project, which brings together those independent efforts into a unified framework based on the evolutionary relationships between genes in these different organisms. The Reference Genome project has two primary goals: to increase the depth and breadth of annotations for genes in each of the organisms in the project, and to create data sets and tools that enable other genome annotation efforts to infer GO annotations for homologous genes in their organisms. In addition, the project has several important incidental benefits, such as increasing annotation consistency across genome databases, and providing important improvements to the GO's logical structure and biological content.

    View details for DOI 10.1371/journal.pcbi.1000431

    View details for Web of Science ID 000269220100031

    View details for PubMedID 19578431

  • Functional annotations for the Saccharomyces cerevisiae genome: the knowns and the known unknowns TRENDS IN MICROBIOLOGY Christie, K. R., Hong, E. L., Cherry, J. M. 2009; 17 (7): 286-294

    Abstract

    The quest to characterize each of the genes of the yeast Saccharomyces cerevisiae has propelled the development and application of novel high-throughput (HTP) experimental techniques. To handle the enormous amount of information generated by these techniques, new bioinformatics tools and resources are needed. Gene Ontology (GO) annotations curated by the Saccharomyces Genome Database (SGD) have facilitated the development of algorithms that analyze HTP data and help predict functions for poorly characterized genes in S. cerevisiae and other organisms. Here, we describe how published results are incorporated into GO annotations at SGD and why researchers can benefit from using these resources wisely to analyze their HTP data and predict gene functions.

    View details for DOI 10.1016/j.tim.2009.04.005

    View details for Web of Science ID 000268616600005

    View details for PubMedID 19577472

  • New mutant phenotype data curation system in the Saccharomyces Genome Database. Database: the journal of biological databases and curation Costanzo, M. C., Skrzypek, M. S., Nash, R., Wong, E., Binkley, G., Engel, S. R., Hitz, B., Hong, E. L., Cherry, J. M. 2009; 2009: bap001

    Abstract

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) organizes and displays molecular and genetic information about the genes and proteins of baker's yeast, Saccharomyces cerevisiae. Mutant phenotype screens have been the starting point for a large proportion of yeast molecular biological studies, and are still used today to elucidate the functions of uncharacterized genes and discover new roles for previously studied genes. To greatly facilitate searching and comparison of mutant phenotypes across genes, we have devised a new controlled-vocabulary system for capturing phenotype information. Each phenotype annotation is represented as an 'observable', which is the entity or process that is observed, and a 'qualifier' that describes the change in that entity or process in the mutant (e.g. decreased, increased, or abnormal). Additional information about the mutant, such as strain background, allele name, conditions under which the phenotype is observed, or the identity of relevant chemicals, is captured in separate fields. For each gene, a summary of the mutant phenotype information is displayed on the Locus Summary page, and the complete information is displayed in tabular format on the Phenotype Details Page. All of the information is searchable and may also be downloaded in bulk using SGD's Batch Download Tool or Download Data Files Page. In the future, phenotypes will be integrated with other curated data to allow searching across different types of functional information, such as genetic and physical interaction data and Gene Ontology annotations. Database URL: http://www.yeastgenome.org/

    View details for PubMedID 20157474
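
    The 'observable' plus 'qualifier' representation described above, with the extra mutant details held in separate fields, maps naturally onto a small record type. The sketch below is a hypothetical model only: the class and field names, the `search` helper and the three-value qualifier set (taken from the examples in the abstract; the actual vocabulary may be larger) are assumptions, not SGD's schema.

```python
from dataclasses import dataclass
from typing import Optional

# Qualifier examples quoted in the abstract; the real vocabulary may be larger.
ALLOWED_QUALIFIERS = {"decreased", "increased", "abnormal"}


@dataclass(frozen=True)
class PhenotypeAnnotation:
    """One mutant phenotype: what was observed and how it changed."""
    gene: str
    observable: str                  # entity or process that is observed
    qualifier: str                   # change in that entity or process
    strain_background: Optional[str] = None
    allele: Optional[str] = None
    condition: Optional[str] = None
    chemical: Optional[str] = None

    def __post_init__(self):
        # Reject free-text qualifiers so that annotations stay comparable.
        if self.qualifier not in ALLOWED_QUALIFIERS:
            raise ValueError(f"unknown qualifier: {self.qualifier!r}")

    def label(self):
        """Short display form combining observable and qualifier."""
        return f"{self.observable}: {self.qualifier}"


def search(annotations, observable=None, qualifier=None):
    """Filter annotations by observable and/or qualifier."""
    return [
        a for a in annotations
        if (observable is None or a.observable == observable)
        and (qualifier is None or a.qualifier == qualifier)
    ]
```

    Keeping the qualifier restricted to a fixed set is what makes phenotypes searchable and comparable across genes, which is the stated purpose of the controlled-vocabulary system.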

  • Gene Ontology annotations at SGD: new data sources and annotation methods NUCLEIC ACIDS RESEARCH Hong, E. L., Balakrishnan, R., Dong, Q., Christie, K. R., Park, J., Binkley, G., Costanzo, M. C., Dwight, S. S., Engel, S. R., Fisk, D. G., Hirschman, J. E., Hitz, B. C., Krieger, C. J., Livstone, M. S., Miyasato, S. R., Nash, R. S., Oughtred, R., Skrzypek, M. S., Weng, S., Wong, E. D., Zhu, K. K., Dolinski, K., Botstein, D., Cherry, J. M. 2008; 36: D577-D581

    Abstract

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) collects and organizes biological information about the chromosomal features and gene products of the budding yeast Saccharomyces cerevisiae. Although published data from traditional experimental methods are the primary sources of evidence supporting Gene Ontology (GO) annotations for a gene product, high-throughput experiments and computational predictions can also provide valuable insights in the absence of an extensive body of literature. Therefore, GO annotations available at SGD now include high-throughput data as well as computational predictions provided by the GO Annotation Project (GOA UniProt; http://www.ebi.ac.uk/GOA/). Because the annotation method used to assign GO annotations varies by data source, GO resources at SGD have been modified to distinguish data sources and annotation methods. In addition to providing information for genes that have not been experimentally characterized, GO annotations from independent sources can be compared to those made by SGD to help keep the literature-based GO annotations current.

    View details for DOI 10.1093/nar/gkm909

    View details for Web of Science ID 000252545400104

    View details for PubMedID 17982175

  • Combining guilt-by-association and guilt-by-profiling to predict Saccharomyces cerevisiae gene function GENOME BIOLOGY Tian, W., Zhang, L. V., Tasan, M., Gibbons, F. D., King, O. D., Park, J., Wunderlich, Z., Cherry, J. M., Roth, F. P. 2008; 9

    Abstract

    Learning the function of genes is a major goal of computational genomics. Methods for inferring gene function have typically fallen into two categories: 'guilt-by-profiling', which exploits correlation between function and other gene characteristics; and 'guilt-by-association', which transfers function from one gene to another via biological relationships. We have developed a strategy ('Funckenstein') that performs guilt-by-profiling and guilt-by-association and combines the results. Using a benchmark set of functional categories and input data for protein-coding genes in Saccharomyces cerevisiae, Funckenstein was compared with a previous combined strategy. Subsequently, we applied Funckenstein to 2,455 Gene Ontology terms. In the process, we developed 2,455 guilt-by-profiling classifiers based on 8,848 gene characteristics and 12 functional linkage graphs based on 23 biological relationships. Funckenstein outperforms a previous combined strategy using a common benchmark dataset. The combination of 'guilt-by-profiling' and 'guilt-by-association' gave significant improvement over the component classifiers, showing the greatest synergy for the most specific functions. Performance was evaluated by cross-validation and by literature examination of the top-scoring novel predictions. These quantitative predictions should help prioritize experimental study of yeast gene functions.

    View details for Web of Science ID 000278173500007

    View details for PubMedID 18613951

  • The Gene Ontology project in 2008 NUCLEIC ACIDS RESEARCH Harris, M. A., Deegan, J. I., Lomax, J., Ashburner, M., Tweedie, S., Carbon, S., Lewis, S., Mungall, C., Day-Richter, J., Eilbeck, K., Blake, J. A., Bult, C., Diehl, A. D., Dolan, M., Drabkin, H., Eppig, J. T., Hill, D. P., Ni, L., Ringwald, M., Balakrishnan, R., Binkley, G., Cherry, J. M., Christie, K. R., Costanzo, M. C., Dong, Q., Engel, S. R., Fisk, D. G., Hirschman, J. E., Hitz, B. C., Hong, E. L., Krieger, C. J., Miyasato, S. R., Nash, R. S., Park, J., Skrzypek, M. S., Weng, S., Wong, E. D., Zhu, K. K., Botstein, D., Dolinski, K., Livstone, M. S., Oughtred, R., Berardini, T., Li, D., Rhee, S. Y., Apweiler, R., Barrell, D., Camon, E., Dimmer, E., Huntley, R., Mulder, N., Khodiyar, V. K., Lovering, R. C., Povey, S., Chisholm, R., Fey, P., Gaudet, P., Kibbe, W., Kishore, R., Schwarz, E. M., Sternberg, P., Van Auken, K., Giglio, M. G., Hannick, L., Wortman, J., Aslett, M., Berriman, M., Wood, V., Jacob, H., Laulederkind, S., Petri, V., Shimoyama, M., Smith, J., Twigger, S., Jaiswal, P., Seigfried, T., Howe, D., Westerfield, M., Collmer, C., Torto-Alalibo, T., Feltrin, E., Valle, G., Bromberg, S., Burgess, S., McCarthy, F. 2008; 36: D440-D444

    Abstract

    The Gene Ontology (GO) project (http://www.geneontology.org/) provides a set of structured, controlled vocabularies for community use in annotating genes, gene products and sequences (also see http://www.sequenceontology.org/). The ontologies have been extended and refined for several biological areas, and improvements to the structure of the ontologies have been implemented. To improve the quantity and quality of gene product annotations available from its public repository, the GO Consortium has launched a focused effort to provide comprehensive and detailed annotation of orthologous genes across a number of 'reference' genomes, including human and several key model organisms. Software developments include two releases of the ontology-editing tool OBO-Edit, and improvements to the AmiGO browser interface.

    View details for DOI 10.1093/nar/gkm883

    View details for Web of Science ID 000252545400079

    View details for PubMedID 17984083

  • Mining experimental evidence of molecular function claims from the literature BIOINFORMATICS Crangle, C. E., Cherry, J. M., Hong, E. L., Zbyslaw, A. 2007; 23 (23): 3232-3240

    Abstract

    The rate at which gene-related findings appear in the scientific literature makes it difficult if not impossible for biomedical scientists to keep fully informed and up to date. The importance of these findings argues for the development of automated methods that can find, extract and summarize this information. This article reports on methods for determining the molecular function claims that are being made in a scientific article, specifically those that are backed by experimental evidence. The most significant result is that for molecular function claims based on direct assays, our methods achieved recall of 70.7% and precision of 65.7%. Furthermore, our methods correctly identified in the text 44.6% of the specific molecular function claims backed up by direct assays, but with a precision of only 0.92%, a disappointing outcome that led to an examination of the different kinds of errors. These results were based on an analysis of 1823 articles from the literature of Saccharomyces cerevisiae (budding yeast). The annotation files for S. cerevisiae are available from ftp://genome-ftp.stanford.edu/pub/yeast/data_download/literature_curation/gene_association.sgd.gz. The draft protocol vocabulary is available by request from the first author.

    View details for DOI 10.1093/bioinformatics/btm495

    View details for Web of Science ID 000251334800017

    View details for PubMedID 17942445
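
    For readers less familiar with the recall and precision figures quoted in the abstract above, the standard definitions can be written out as a short sketch. The function names are arbitrary, and the F1 score is a commonly used summary measure added here for illustration; the abstract itself does not report it.

```python
def precision(true_pos, false_pos):
    """Fraction of returned claims that are correct."""
    returned = true_pos + false_pos
    return true_pos / returned if returned else 0.0


def recall(true_pos, false_neg):
    """Fraction of the correct claims that were actually returned."""
    relevant = true_pos + false_neg
    return true_pos / relevant if relevant else 0.0


def f1(p, r):
    """Harmonic mean of precision and recall (not reported in the abstract)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0
```

    Read this way, a precision of 0.92% means that roughly one returned claim in every 109 is correct (1 / 0.0092 is about 108.7), even though 44.6% of the true claims are found; this arithmetic is derived here from the quoted percentages and is not a figure from the paper.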

  • Expanded protein information at SGD: new pages and proteome browser NUCLEIC ACIDS RESEARCH Nash, R., Weng, S., Hitz, B., Balakrishnan, R., Christie, K. R., Costanzo, M. C., Dwight, S. S., Engel, S. R., Fisk, D. G., Hirschman, J. E., Hong, E. L., Livstone, M. S., Oughtred, R., Park, J., Skrzypek, M., Theesfeld, C. L., Binkley, G., Dong, Q., Lane, C., Miyasato, S., Sethuraman, A., Schroeder, M., Dolinski, K., Botstein, D., Cherry, J. M. 2007; 35: D468-D471

    Abstract

    The recent explosion in protein data generated from both directed small-scale studies and large-scale proteomics efforts has greatly expanded the quantity of available protein information and has prompted the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) to enhance the depth and accessibility of protein annotations. In particular, we have expanded ongoing efforts to improve the integration of experimental information and sequence-based predictions and have redesigned the protein information web pages. A key feature of this redesign is the development of a GBrowse-derived interactive Proteome Browser customized to improve the visualization of sequence-based protein information. This Proteome Browser has enabled SGD to unify the display of hidden Markov model (HMM) domains, protein family HMMs, motifs, transmembrane regions, signal peptides, hydropathy plots and profile hits using several popular prediction algorithms. In addition, a physico-chemical properties page has been introduced to provide easy access to basic protein information. Improvements to the layout of the Protein Information page and integration of the Proteome Browser will facilitate the ongoing expansion of sequence-specific experimental information captured in SGD, including post-translational modifications and other user-defined annotations. Finally, SGD continues to improve upon the availability of genetic and physical interaction data in an ongoing collaboration with BioGRID by providing direct access to more than 82,000 manually-curated interactions.

    View details for DOI 10.1093/nar/gkl931

    View details for Web of Science ID 000243494600095

    View details for PubMedID 17142221

  • Saccharomyces cerevisiae S288C genome annotation: a working hypothesis YEAST Fisk, D. G., Ball, C. A., Dolinski, K., Engel, S. R., Hong, E. L., Issel-Tarver, L., Schwartz, K., Sethuraman, A., Botstein, D., Cherry, J. M. 2006; 23 (12): 857-865

    Abstract

    The S. cerevisiae genome is the most well-characterized eukaryotic genome and one of the simplest in terms of identifying open reading frames (ORFs), yet its primary annotation has been updated continually in the decade since its initial release in 1996 (Goffeau et al., 1996). The Saccharomyces Genome Database (SGD; www.yeastgenome.org) (Hirschman et al., 2006), the community-designated repository for this reference genome, strives to ensure that the S. cerevisiae annotation is as accurate and useful as possible. At SGD, the S. cerevisiae genome sequence and annotation are treated as a working hypothesis, which must be repeatedly tested and refined. In this paper, in celebration of the tenth anniversary of the completion of the S. cerevisiae genome sequence, we discuss the ways in which the S. cerevisiae sequence and annotation have changed, consider the multiple sources of experimental and comparative data on which these changes are based, and describe our methods for evaluating, incorporating and documenting these new data.

    View details for DOI 10.1002/yea.1400

    View details for Web of Science ID 000242009800002

    View details for PubMedID 17001629

  • Macronuclear genome sequence of the ciliate Tetrahymena thermophila, a model eukaryote PLOS BIOLOGY Eisen, J. A., Coyne, R. S., Wu, M., Wu, D., Thiagarajan, M., Wortman, J. R., Badger, J. H., Ren, Q., Amedeo, P., Jones, K. M., Tallon, L. J., Delcher, A. L., Salzberg, S. L., Silva, J. C., Haas, B. J., Majoros, W. H., Farzad, M., Carlton, J. M., Smith, R. K., Garg, J., Pearlman, R. E., Karrer, K. M., Sun, L., Manning, G., Elde, N. C., Turkewitz, A. P., Asai, D. J., Wilkes, D. E., Wang, Y., Cai, H., Collins, K., Stewart, A., Lee, S. R., Wilamowska, K., Weinberg, Z., Ruzzo, W. L., Wloga, D., Gaertig, J., Frankel, J., Tsao, C., Gorovsky, M. A., Keeling, P. J., Waller, R. F., Patron, N. J., Cherry, J. M., Stover, N. A., Krieger, C. J., del Toro, C., Ryder, H. F., Williamson, S. C., Barbeau, R. A., Hamilton, E. P., Orias, E. 2006; 4 (9): 1620-1642

    Abstract

    The ciliate Tetrahymena thermophila is a model organism for molecular and cellular biology. Like other ciliates, this species has separate germline and soma functions that are embodied by distinct nuclei within a single cell. The germline-like micronucleus (MIC) has its genome held in reserve for sexual reproduction. The soma-like macronucleus (MAC), which possesses a genome processed from that of the MIC, is the center of gene expression and does not directly contribute DNA to sexual progeny. We report here the shotgun sequencing, assembly, and analysis of the MAC genome of T. thermophila, which is approximately 104 Mb in length and composed of approximately 225 chromosomes. Overall, the gene set is robust, with more than 27,000 predicted protein-coding genes, 15,000 of which have strong matches to genes in other organisms. The functional diversity encoded by these genes is substantial and reflects the complexity of processes required for a free-living, predatory, single-celled organism. This is highlighted by the abundance of lineage-specific duplications of genes with predicted roles in sensing and responding to environmental conditions (e.g., kinases), using diverse resources (e.g., proteases and transporters), and generating structural complexity (e.g., kinesins and dyneins). In contrast to the other lineages of alveolates (apicomplexans and dinoflagellates), no compelling evidence could be found for plastid-derived genes in the genome. UGA, the only T. thermophila stop codon, is used in some genes to encode selenocysteine, thus making this organism the first known with the potential to translate all 64 codons in nuclear genes into amino acids. We present genomic evidence supporting the hypothesis that the excision of DNA from the MIC to generate the MAC specifically targets foreign DNA as a form of genome self-defense. 
    The combination of the genome sequence, the functional diversity encoded therein, and the presence of some pathways missing from other model organisms makes T. thermophila an ideal model for functional genomic studies to address biological, biomedical, and biotechnological questions of fundamental importance.

    View details for DOI 10.1371/journal.pbio.0040286

    View details for Web of Science ID 000240740900012

    View details for PubMedID 16933976

  • The Gene Ontology (GO) project in 2006 NUCLEIC ACIDS RESEARCH Harris, M. A., Clark, J. I., Ireland, A., Lomax, J., Ashburner, M., Collins, R., Eilbeck, K., Lewis, S., Mungall, C., Richter, J., Rubin, G. M., Shu, S., Blake, J. A., Bult, C. J., Diehl, A. D., Dolan, M. E., Drabkin, H. J., Eppig, J. T., Hill, D. P., Ni, L., Ringwald, M., Balakrishnan, R., Binkley, G., Cherry, J. M., Christie, K. R., Costanzo, M. C., Dong, Q., Engel, S. R., Fisk, D. G., Hirschman, J. E., Hitz, B. C., Hong, E. L., Lane, C., Miyasato, S., Nash, R., Sethuraman, A., Skrzypek, M., Theesfeld, C. L., Weng, S., Botstein, D., Dolinski, K., Oughtred, R., Berardini, T., Mundodi, S., Rhee, S. Y., Apweiler, R., Barrell, D., Camon, E., Dimmer, E., Mulder, N., Chisholm, R., Fey, P., Gaudet, P., Kibbe, W., Pilcher, K., Bastiani, C. A., Kishore, R., Schwarz, E. M., Sternberg, P., Van Auken, K., Gwinn, M., Hannick, L., Wortman, J., Aslett, M., Berriman, M., Wood, V., Bromberg, S., Foote, C., Jacob, H., Pasko, D., Petri, V., Reilly, D., Seiler, K., Shimoyama, M., Smith, J., Twigger, S., Jaiswal, P., Seigfried, T., Collmer, C., Howe, D., Westerfield, M. 2006; 34: D322-D326

    Abstract

    The Gene Ontology (GO) project (http://www.geneontology.org) develops and uses a set of structured, controlled vocabularies for community use in annotating genes, gene products and sequences (also see http://song.sourceforge.net/). The GO Consortium continues to improve the vocabulary content, reflecting the impact of several novel mechanisms of incorporating community input. A growing number of model organism databases and genome annotation groups contribute annotation sets using GO terms to GO's public repository. Updates to the AmiGO browser have improved access to contributed genome annotations. As the GO project continues to grow, the use of the GO vocabularies is becoming more varied as well as more widespread. The GO project provides an ontological annotation system that enables biologists to infer knowledge from large amounts of data.

    View details for DOI 10.1093/nar/gkj021

    View details for Web of Science ID 000239307700070

    View details for PubMedID 16381878

  • Tetrahymena Genome Database (TGD): a new genomic resource for Tetrahymena thermophila research NUCLEIC ACIDS RESEARCH Stover, N. A., Krieger, C. J., Binkley, G., Dong, Q., Fisk, D. G., Nash, R., Sethuraman, A., Weng, S., Cherry, J. M. 2006; 34: D500-D503

    Abstract

    We have developed a web-based resource (available at www.ciliate.org) for researchers studying the model ciliate organism Tetrahymena thermophila. Employing the underlying database structure and programming of the Saccharomyces Genome Database, the Tetrahymena Genome Database (TGD) integrates the wealth of knowledge generated by the Tetrahymena research community about genome structure, genes and gene products with the newly sequenced macronuclear genome determined by The Institute for Genomic Research (TIGR). TGD provides information curated from the literature about each published gene, including a standardized gene name, a link to the genomic locus in our graphical genome browser, gene product annotations utilizing the Gene Ontology, links to published literature about the gene and more. TGD also displays automatic annotations generated for the gene models predicted by TIGR. A variety of tools are available at TGD for searching the Tetrahymena genome, its literature and information about members of the research community.

    View details for DOI 10.1093/nar/gkj054

    View details for Web of Science ID 000239307700109

    View details for PubMedID 16381920

  • Genome Snapshot: a new resource at the Saccharomyces Genome Database (SGD) presenting an overview of the Saccharomyces cerevisiae genome NUCLEIC ACIDS RESEARCH Hirschman, J. E., Balakrishnan, R., Christie, K. R., Costanzo, M. C., Dwight, S. S., Engel, S. R., Fisk, D. G., Hong, E. L., Livstone, M. S., Nash, R., Park, J., Oughtred, R., Skrzypek, M., Starr, B., Theesfeld, C. L., Williams, J., Andrada, R., Binkley, G., Dong, Q., Lane, C., Miyasato, S., Sethuraman, A., Schroeder, M., Thanawala, M. K., Weng, S., Dolinski, K., Botstein, D., Cherry, J. M. 2006; 34: D442-D445

    Abstract

    Sequencing and annotation of the entire Saccharomyces cerevisiae genome has made it possible to gain a genome-wide perspective on yeast genes and gene products. To make this information available on an ongoing basis, the Saccharomyces Genome Database (SGD) (http://www.yeastgenome.org/) has created the Genome Snapshot (http://db.yeastgenome.org/cgi-bin/genomeSnapShot.pl). The Genome Snapshot summarizes the current state of knowledge about the genes and chromosomal features of S.cerevisiae. The information is organized into two categories: (i) number of each type of chromosomal feature annotated in the genome and (ii) number and distribution of genes annotated to Gene Ontology terms. Detailed lists are accessible through SGD's Advanced Search tool (http://db.yeastgenome.org/cgi-bin/search/featureSearch), and all the data presented on this page are available from the SGD ftp site (ftp://ftp.yeastgenome.org/yeast/).

    View details for DOI 10.1093/nar/gkj117

    View details for Web of Science ID 000239307700097

    View details for PubMedID 16381907

  • PatMatch: a program for finding patterns in peptide and nucleotide sequences NUCLEIC ACIDS RESEARCH Yan, T., Yoo, D., Berardini, T. Z., Mueller, L. A., Weems, D. C., Weng, S., Cherry, J. M., Rhee, S. Y. 2005; 33: W262-W266

    Abstract

    Here, we present PatMatch, an efficient, web-based pattern-matching program that enables searches for short nucleotide or peptide sequences such as cis-elements in nucleotide sequences or small domains and motifs in protein sequences. The program can be used to find matches to a user-specified sequence pattern that can be described using ambiguous sequence codes and a powerful and flexible pattern syntax based on regular expressions. A recent upgrade has improved performance and now supports both mismatches and wildcards in a single pattern. This enhancement has been achieved by replacing the previous searching algorithm, scan_for_matches [D'Souza et al. (1997), Trends in Genetics, 13, 497-498], with nondeterministic-reverse grep (NR-grep), a general pattern matching tool that allows for approximate string matching [Navarro (2001), Software Practice and Experience, 31, 1265-1312]. We have tailored NR-grep to be used for DNA and protein searches with PatMatch. The stand-alone version of the software can be adapted for use with any sequence dataset and is available for download at The Arabidopsis Information Resource (TAIR) at ftp://ftp.arabidopsis.org/home/tair/Software/Patmatch/. The PatMatch server is available on the web at http://www.arabidopsis.org/cgi-bin/patmatch/nph-patmatch.pl for searching Arabidopsis thaliana sequences.

    View details for DOI 10.1093/nar/gki368

    View details for Web of Science ID 000230271400050

    View details for PubMedID 15980466
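
    The abstract above describes patterns written with ambiguous sequence codes and a regular-expression-based syntax. As a rough illustration only, the sketch below expands IUPAC nucleotide ambiguity codes into a regular expression and scans a sequence with Python's standard `re` module. It is not PatMatch and does not use NR-grep; the function names `iupac_to_regex` and `find_matches` are hypothetical, and mismatch-tolerant searching is not covered.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def iupac_to_regex(pattern: str) -> str:
    """Translate an IUPAC nucleotide pattern into a regular expression (hypothetical helper)."""
    return "".join(IUPAC_DNA[base] for base in pattern.upper())


def find_matches(pattern: str, sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for every exact match, including overlapping ones."""
    # A lookahead wrapping a capture group lets finditer report overlapping hits.
    regex = re.compile(f"(?=({iupac_to_regex(pattern)}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]


# K stands for G or T, so both CACGTG and CACGTT are reported.
hits = find_matches("CACGTK", "ttCACGTGaaCACGTTcc")
print(hits)
```

    Translating the ambiguity codes up front keeps the search itself a plain regular-expression scan; approximate matching, as described in the abstract, would need a different engine.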

  • Inference of combinatorial regulation in yeast transcriptional networks: A case study of sporulation PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Wang, W., Cherry, J. M., Nochomovitz, Y., Jolly, E., Botstein, D., Li, H. 2005; 102 (6): 1998-2003

    Abstract

    Decomposing transcriptional regulatory networks into functional modules and determining logical relations between them is the first step toward understanding transcriptional regulation at the system level. Modules based on analysis of genome-scale data can serve as the basis for inferring combinatorial regulation and for building mathematical models to quantitatively describe the behavior of the networks. We present here an algorithm called modem to identify target genes of a transcription factor (TF) from a single expression experiment, based on a joint probabilistic model for promoter sequence and gene expression data. We show how this method can facilitate the discovery of specific instances of combinatorial regulation and illustrate this for a specific case of transcriptional networks that regulate sporulation in the yeast Saccharomyces cerevisiae. Applying this method to analyze two crucial TFs in sporulation, Ndt80p and Sum1p, we were able to delineate their overlapping binding sites. We proposed a mechanistic model for the competitive regulation by the two TFs on a defined subset of sporulation genes. We show that this model accounts for the temporal control of the "middle" sporulation genes and suggest a similar regulatory arrangement can be found in developmental programs in higher organisms.

    View details for Web of Science ID 000227072900037

    View details for PubMedID 15684073

  • Fungal BLAST and Model Organism BLASTP Best Hits: new comparison resources at the Saccharomyces Genome Database (SGD) NUCLEIC ACIDS RESEARCH Balakrishnan, R., Christie, K. R., Costanzo, M. C., Dolinski, K., Dwight, S. S., Engel, S. R., Fisk, D. G., Hirschman, J. E., Hong, E. L., Nash, R., Oughtred, R., Skrzypek, M., Theesfeld, C. L., Binkley, G., Dong, Q., Lane, C., Sethuraman, A., Weng, S., Botstein, D., Cherry, J. M. 2005; 33: D374-D377

    Abstract

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is a scientific database of gene, protein and genomic information for the yeast Saccharomyces cerevisiae. SGD has recently developed two new resources that facilitate nucleotide and protein sequence comparisons between S.cerevisiae and other organisms. The Fungal BLAST tool provides directed searches against all fungal nucleotide and protein sequences available from GenBank, divided into categories according to organism, status of completeness and annotation, and source. The Model Organism BLASTP Best Hits resource displays, for each S.cerevisiae protein, the single most similar protein from several model organisms and presents links to the database pages of those proteins, facilitating access to curated information about potential orthologs of yeast proteins.

    View details for DOI 10.1093/nar/gki023

    View details for Web of Science ID 000226524300077

    View details for PubMedID 15608219

  • GO::TermFinder - open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes BIOINFORMATICS Boyle, E. I., Weng, S. A., Gollub, J., Jin, H., Botstein, D., Cherry, J. M., Sherlock, G. 2004; 20 (18): 3710-3715

    Abstract

    GO::TermFinder comprises a set of object-oriented Perl modules for accessing Gene Ontology (GO) information and evaluating and visualizing the collective annotation of a list of genes to GO terms. It can be used to draw conclusions from microarray and other biological data, calculating the statistical significance of each annotation. GO::TermFinder can be used on any system on which Perl can be run, either as a command line application, in single or batch mode, or as a web-based CGI script. The full source code and documentation for GO::TermFinder are freely available from http://search.cpan.org/dist/GO-TermFinder/.

    View details for DOI 10.1093/bioinformatics/bth456

    View details for Web of Science ID 000225786600064

    View details for PubMedID 15297299
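
    The abstract above mentions calculating the statistical significance of each annotation without giving the formula. One common way to score enrichment of a term in a gene list is a one-sided hypergeometric test, sketched below in Python under that assumption. This is not the GO::TermFinder Perl code; the function name `enrichment_p_value` and its parameters are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, sample: int, hits: int) -> float:
    """Probability of seeing at least `hits` annotated genes in a random sample.

    population: total number of genes considered
    annotated:  genes in the population annotated to the term
    sample:     size of the gene list being tested
    hits:       genes in the list annotated to the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers: 10 genes, 4 annotated to the term, a list of 3 genes with 2 annotated.
p = enrichment_p_value(population=10, annotated=4, sample=3, hits=2)
print(p)
```

    Because many terms are usually tested against the same gene list, raw p-values from a test like this would normally be adjusted before interpretation.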

  • Saccharomyces genome database: Underlying principles and organisation BRIEFINGS IN BIOINFORMATICS Dwight, S. S., Balakrishnan, R., Christie, K. R., Costanzo, M. C., Dolinski, K., Engel, S. R., Feierbach, B., Fisk, D. G., Hirschman, J., Hong, E. L., Issel-Tarver, L., Nash, R. S., Sethuraman, A., Starr, B., Theesfeld, C. L., Andrada, R., Binkley, G., Dong, Q., Lane, C., Schroeder, M., Weng, S., Botstein, D., Cherry, J. M. 2004; 5 (1): 9-22

    Abstract

    A scientific database can be a powerful tool for biologists in an era where large-scale genomic analysis, combined with smaller-scale scientific results, provides new insights into the roles of genes and their products in the cell. However, the collection and assimilation of data is, in itself, not enough to make a database useful. The data must be incorporated into the database and presented to the user in an intuitive and biologically significant manner. Most importantly, this presentation must be driven by the user's point of view; that is, from a biological perspective. The success of a scientific database can therefore be measured by the response of its users - statistically, by usage numbers and, in a less quantifiable way, by its relationship with the community it serves and its ability to serve as a model for similar projects. Since its inception ten years ago, the Saccharomyces Genome Database (SGD) has seen a dramatic increase in its usage, has developed and maintained a positive working relationship with the yeast research community, and has served as a template for at least one other database. The success of SGD, as measured by these criteria, is due in large part to philosophies that have guided its mission and organisation since it was established in 1993. This paper aims to detail these philosophies and how they shape the organisation and presentation of the database.

    View details for Web of Science ID 000222244300002

    View details for PubMedID 15153302

  • The Gene Ontology (GO) database and informatics resource NUCLEIC ACIDS RESEARCH Harris, M. A., Clark, J., Ireland, A., Lomax, J., Ashburner, M., Foulger, R., Eilbeck, K., Lewis, S., Marshall, B., Mungall, C., Richter, J., Rubin, G. M., Blake, J. A., Bult, C., Dolan, M., Drabkin, H., Eppig, J. T., Hill, D. P., Ni, L., Ringwald, M., Balakrishnan, R., Cherry, J. M., Christie, K. R., Costanzo, M. C., Dwight, S. S., Engel, S., Fisk, D. G., Hirschman, J. E., Hong, E. L., Nash, R. S., Sethuraman, A., Theesfeld, C. L., Botstein, D., Dolinski, K., Feierbach, B., Berardini, T., Mundodi, S., Rhee, S. Y., Apweiler, R., Barrell, D., Camon, E., Dimmer, E., Lee, V., Chisholm, R., Gaudet, P., Kibbe, W., Kishore, R., Schwarz, E. M., Sternberg, P., Gwinn, M., Hannick, L., Wortman, J., Berriman, M., Wood, V., de la Cruz, N., Tonellato, P., Jaiswal, P., Seigfried, T., White, R. 2004; 32: D258-D261

    Abstract

    The Gene Ontology (GO) project (http://www.geneontology.org/) provides structured, controlled vocabularies and classifications that cover several domains of molecular and cellular biology and are freely available for community use in the annotation of genes, gene products and sequences. Many model organism databases and genome annotation groups use the GO and contribute their annotation sets to the GO resource. The GO database integrates the vocabularies and contributed annotations and provides full access to this information in several formats. Members of the GO Consortium continually work collectively, involving outside experts as needed, to expand and update the GO vocabularies. The GO Web resource also provides access to extensive documentation about the GO project and links to applications that use GO data for functional analyses.

    View details for DOI 10.1093/nar/gkh036

    View details for Web of Science ID 000188079000059

    View details for PubMedID 14681407

  • Saccharomyces Genome Database (SGD) provides tools to identify and analyze sequences from Saccharomyces cerevisiae and related sequences from other organisms NUCLEIC ACIDS RESEARCH Christie, K. R., Weng, S., Balakrishnan, R., Costanzo, M. C., Dolinski, K., Dwight, S. S., Engel, S. R., Feierbach, B., Fisk, D. G., Hirschman, J. E., Hong, E. L., Issel-Tarver, L., Nash, R., Sethuraman, A., Starr, B., Theesfeld, C. L., Andrada, R., Binkley, G., Dong, Q., Lane, C., Schroeder, M., Botstein, D., Cherry, J. M. 2004; 32: D311-D314

    Abstract

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/), a scientific database of the molecular biology and genetics of the yeast Saccharomyces cerevisiae, has recently developed several new resources that allow the comparison and integration of information on a genome-wide scale, enabling the user not only to find detailed information about individual genes, but also to make connections across groups of genes with common features and across different species. The Fungal Alignment Viewer displays alignments of sequences from multiple fungal genomes, while the Sequence Similarity Query tool displays PSI-BLAST alignments of each S.cerevisiae protein with similar proteins from any species whose sequences are contained in the non-redundant (nr) protein data set at NCBI. The Yeast Biochemical Pathways tool integrates groups of genes by their common roles in metabolism and displays the metabolic pathways in a graphical form. Finally, the Find Chromosomal Features search interface provides a versatile tool for querying multiple types of information in SGD.

    View details for DOI 10.1093/nar/gkh033

    View details for Web of Science ID 000188079000073

    View details for PubMedID 14681421

  • Saccharomyces Genome Database (SGD) provides biochemical and structural information for budding yeast proteins NUCLEIC ACIDS RESEARCH Weng, S., Dong, Q., Balakrishnan, R., Christie, K., Costanzo, M., Dolinski, K., Dwight, S. S., Engel, S., Fisk, D. G., Hong, E., Issel-Tarver, L., Sethuraman, A., Theesfeld, C., Andrada, R., Binkley, G., Lane, C., Schroeder, M., Botstein, D., Cherry, J. M. 2003; 31 (1): 216-218

    Abstract

    The Saccharomyces Genome Database (SGD: http://genome-www.stanford.edu/Saccharomyces/) has recently developed new resources to provide more complete information about proteins from the budding yeast Saccharomyces cerevisiae. The PDB Homologs page provides structural information from the Protein Data Bank (PDB) about yeast proteins and/or their homologs. SGD has also created a resource that utilizes the eMOTIF database for motif information about a given protein. A third new resource is the Protein Information page, which contains protein physical and chemical properties, such as molecular weight and hydropathicity scores, predicted from the translated ORF sequence.

    View details for DOI 10.1093/nar/gkg054

    View details for Web of Science ID 000181079700049

    View details for PubMedID 12519985

  • Gene function, metabolic pathways and comparative genomics in yeast PROCEEDINGS OF THE 2003 IEEE BIOINFORMATICS CONFERENCE Dong, Q., Balakrishnan, R., Binkley, G., Christie, K. R., Costanzo, M., Dolinski, K., Dwight, S. S., Engel, S., Fisk, D. G., Hirschman, J., Hong, E. L., Nash, R., Issel-Tarver, L., Sethuraman, A., Theesfeld, C. L., Weng, S., Botstein, D., Cherry, J. M. 2003: 437-438
  • SOURCE: a unified genomic resource of functional annotations, ontologies, and gene expression data NUCLEIC ACIDS RESEARCH Diehn, M., Sherlock, G., Binkley, G., Jin, H., Matese, J. C., Hernandez-Boussard, T., Rees, C. A., Cherry, J. M., Botstein, D., Brown, P. O., Alizadeh, A. A. 2003; 31 (1): 219-223

    Abstract

    The explosion in the number of functional genomic datasets generated with tools such as DNA microarrays has created a critical need for resources that facilitate the interpretation of large-scale biological data. SOURCE is a web-based database that brings together information from a broad range of resources, and provides it in a manner particularly useful for genome-scale analyses. SOURCE's GeneReports include aliases, chromosomal location, functional descriptions, GeneOntology annotations, gene expression data, and links to external databases. We curate published microarray gene expression datasets and allow users to rapidly identify sets of co-regulated genes across a variety of tissues and a large number of conditions using a simple and intuitive interface. SOURCE provides content both in gene and cDNA clone-centric pages, and thus simplifies analysis of datasets generated using cDNA microarrays. SOURCE is continuously updated and contains the most recent and accurate information available for human, mouse, and rat genes. By allowing dynamic linking to individual gene or clone reports, SOURCE facilitates browsing of large genomic datasets. Finally, SOURCE's batch interface allows rapid extraction of data for thousands of genes or clones at once and thus facilitates statistical analyses such as assessing the enrichment of functional attributes within clusters of genes. SOURCE is available at http://source.stanford.edu.

    View details for DOI 10.1093/nar/gkg014

    View details for Web of Science ID 000181079700050

    View details for PubMedID 12519986

  • A systematic approach to reconstructing transcription networks in Saccharomyces cerevisiae PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Wang, W., Cherry, J. M., Botstein, D., Li, H. 2002; 99 (26): 16893-16898

    Abstract

    Decomposing regulatory networks into functional modules is a first step toward deciphering the logical structure of complex networks. We propose a systematic approach to reconstructing transcription modules (defined by a transcription factor and its target genes) and identifying conditions/perturbations under which a particular transcription module is activated/deactivated. Our approach integrates information from regulatory sequences, genome-wide mRNA expression data, and functional annotation. We systematically analyzed gene expression profiling experiments in which the yeast cell was subjected to various environmental or genetic perturbations. We were able to construct transcription modules with high specificity and sensitivity for many transcription factors, and predict the activation of these modules under anticipated as well as unexpected conditions. These findings generate testable hypotheses when combined with existing knowledge on signaling pathways and protein-protein interactions. Correlating the activation of a module to a specific perturbation predicts links in the cell's regulatory networks, and examining coactivated modules suggests specific instances of crosstalk between regulatory pathways.

    View details for DOI 10.1073/pnas.252638199

    View details for Web of Science ID 000180101600070

    View details for PubMedID 12482955

  • Identification of unstable transcripts in Arabidopsis by cDNA microarray analysis: Rapid decay is associated with a group of touch- and specific clock-controlled genes PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Gutierrez, R. A., Ewing, R. M., Cherry, J. M., Green, P. J. 2002; 99 (17): 11513-11518

    Abstract

    mRNA degradation provides a powerful means for controlling gene expression during growth, development, and many physiological transitions in plants and other systems. Rates of decay help define the steady state levels to which transcripts accumulate in the cytoplasm and determine the speed with which these levels change in response to the appropriate signals. When fast responses are to be achieved, rapid decay of mRNAs is necessary. Accordingly, genes with unstable transcripts often encode proteins that play important regulatory roles. Although detailed studies have been carried out on individual genes with unstable transcripts, there is limited knowledge regarding their nature and associations from a genomic perspective, or the physiological significance of rapid mRNA turnover in intact organisms. To address these problems, we have applied cDNA microarray analysis to identify and characterize genes with unstable transcripts in Arabidopsis thaliana (AtGUTs). Our studies showed that at least 1% of the 11,521 clones represented on Arabidopsis Functional Genomics Consortium microarrays correspond to transcripts that are rapidly degraded, with estimated half-lives of less than 60 min. AtGUTs encode proteins that are predicted to participate in a broad range of cellular processes, with transcriptional functions being over-represented relative to the whole Arabidopsis genome annotation. Analysis of public microarray expression data for these genes argues that mRNA instability is of high significance during plant responses to mechanical stimulation and is associated with specific genes controlled by the circadian clock.

    View details for DOI 10.1073/pnas.152204099

    View details for Web of Science ID 000177606900100

    View details for PubMedID 12167669

  • Saccharomyces Genome Database (SGD) provides secondary gene annotation using the Gene Ontology (GO) NUCLEIC ACIDS RESEARCH Dwight, S. S., Harris, M. A., Dolinski, K., Ball, C. A., Binkley, G., Christie, K. R., Fisk, D. G., Issel-Tarver, L., Schroeder, M., Sherlock, G., Sethuraman, A., Weng, S., Botstein, D., Cherry, J. M. 2002; 30 (1): 69-72

    Abstract

    The Saccharomyces Genome Database (SGD) resources, ranging from genetic and physical maps to genome-wide analysis tools, reflect the scientific progress in identifying genes and their functions over the last decade. As emphasis shifts from identification of the genes to identification of the role of their gene products in the cell, SGD seeks to provide its users with annotations that will allow relationships to be made between gene products, both within Saccharomyces cerevisiae and across species. To this end, SGD is annotating genes to the Gene Ontology (GO), a structured representation of biological knowledge that can be shared across species. The GO consists of three separate ontologies describing molecular function, biological process and cellular component. The goal is to use published information to associate each characterized S.cerevisiae gene product with one or more GO terms from each of the three ontologies. To be useful, this must be done in a manner that allows accurate associations based on experimental evidence, modifications to GO when necessary, and careful documentation of the annotations through evidence codes for given citations. Reaching this goal is an ongoing process at SGD. For information on the current progress of GO annotations at SGD and other participating databases, as well as a description of each of the three ontologies, please visit the GO Consortium page at http://www.geneontology.org. SGD gene associations to GO can be found by visiting our site at http://genome-www.stanford.edu/Saccharomyces/.

    View details for Web of Science ID 000173077100017

    View details for PubMedID 11752257

  • Saccharomyces genome database GUIDE TO YEAST GENETICS AND MOLECULAR AND CELL BIOLOGY, PT B Issel-Tarver, L., Christie, K. R., Dolinski, K., Andrada, R., Balakrishnan, R., Ball, C. A., Binkley, G., Dong, S., Dwight, S. S., Fisk, D. G., Harris, M., Schroeder, M., Sethuraman, A., Tse, K., Weng, S., Botstein, D., Cherry, J. M. 2002; 350: 329-346

    View details for Web of Science ID 000176466300019

    View details for PubMedID 12073322

  • Microarray data quality analysis: lessons from the AFGC project PLANT MOLECULAR BIOLOGY Finkelstein, D., Ewing, R., Gollub, J., Sterky, F., Cherry, J. M., Somerville, S. 2002; 48 (1-2): 119-131

    Abstract

    Genome-wide expression profiling with DNA microarrays has provided and will continue to provide a great deal of data to the plant scientific community. However, reliability concerns have required the development of data quality tests for common systematic biases. Fortunately, most large-scale systematic biases are detectable and some are correctable by normalization. Technical replication experiments and statistical surveys indicate that these biases vary widely in severity and appearance. As a result, no single normalization or correction method currently available is able to address all the issues. However, careful sequence selection, array design, experimental design and experimental annotation can substantially improve the quality and biological of microarray data. In this review, we discuss these issues with reference to examples from the Arabidopsis Functional Genomics Consortium (AFGC) microarray project.

    View details for Web of Science ID 000173211000008

  • Creating the gene ontology resource: Design and implementation GENOME RESEARCH Ashburner, M., Ball, C. A., Blake, J. A., Butler, H., Cherry, J. M., Corradi, J., Dolinski, K., Eppig, J. T., Harris, M., Hill, D. P., Lewis, S., Marshall, B., Mungall, C., Reiser, L., Rhee, S., Richardson, J. E., Richter, J., Ringwald, M., Rubin, G. M., Sherlock, G., Yoon, J. 2001; 11 (8): 1425-1433

    Abstract

    The exponential growth in the volume of accessible biological information has generated a confusion of voices surrounding the annotation of molecular information about genes and their products. The Gene Ontology (GO) project seeks to provide a set of structured vocabularies for specific biological domains that can be used to describe gene products in any organism. This work includes building three extensive ontologies to describe molecular function, biological process, and cellular component, and providing a community database resource that supports the use of these ontologies. The GO Consortium was initiated by scientists associated with three model organism databases: SGD, the Saccharomyces Genome database; FlyBase, the Drosophila genome database; and MGD/GXD, the Mouse Genome Informatics databases. Additional model organism database groups are joining the project. Each of these model organism information systems is annotating genes and gene products using GO vocabulary terms and incorporating these annotations into their respective model organism databases. Each database contributes its annotation files to a shared GO data resource accessible to the public at http://www.geneontology.org/. The GO site can be used by the community both to recover the GO vocabularies and to access the annotated gene product data sets from the model organism databases. The GO Consortium supports the development of the GO database resource and provides tools enabling curators and researchers to query and manipulate the vocabularies. We believe that the shared development of this molecular annotation resource will contribute to the unification of biological information.

    View details for Web of Science ID 000170263900015

    View details for PubMedID 11483584

  • Visualization of expression clusters using Sammon's non-linear mapping BIOINFORMATICS Ewing, R. M., Cherry, J. M. 2001; 17 (7): 658-659

    Abstract

    A method of exploratory analysis and visualization of multi-dimensional gene expression data using Sammon's Non-Linear Mapping (NLM) is presented.

    View details for Web of Science ID 000170249100012

    View details for PubMedID 11448886
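
    For readers unfamiliar with the technique named in the abstract above, Sammon's non-linear mapping is usually stated as the minimization of a stress function. The standard textbook form is given below; the symbols are introduced here for illustration and are not taken from the abstract: $d^{*}_{ij}$ is the distance between expression profiles $i$ and $j$ in the original space, and $d_{ij}$ is the distance between their images in the low-dimensional map.

```latex
E \;=\; \frac{1}{\sum_{i<j} d^{*}_{ij}} \sum_{i<j} \frac{\left(d^{*}_{ij} - d_{ij}\right)^{2}}{d^{*}_{ij}}
```

    Dividing each squared error by $d^{*}_{ij}$ weights small original distances more heavily, so local neighbourhoods among profiles tend to be preserved in the map.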

  • Computer manipulation of DNA and protein sequences. Current protocols in molecular biology / edited by Frederick M. Ausubel ... [et al.] Cherry, J. M. 2001; Chapter 7: Unit7 7-?

    Abstract

    This unit outlines a variety of methods by which DNA sequences can be manipulated by computers. Procedures for entering sequence data into the computer and assembling raw sequence data into a contiguous sequence are described first, followed by a description of methods of analyzing and manipulating sequences--e.g., verifying sequences, constructing restriction maps, designing oligonucleotides, identifying protein-coding regions, and predicting secondary structures. This unit also provides information on the large amount of software available for sequence analysis. The appendix to this unit lists some of the commercial software, shareware, and free software related to DNA sequence manipulation. The goal of this unit is to serve as a starting point for researchers interested in utilizing the tremendous sequencing resources available to the computer-knowledgeable molecular biology laboratory.

    View details for DOI 10.1002/0471142727.mb0707s30

    View details for PubMedID 18265271

  • Characteristics of amino acids. Current protocols in molecular biology / edited by Frederick M. Ausubel ... [et al.] Ellington, A., Cherry, J. M. 2001; Appendix 1: Appendix 1C-?

    Abstract

    This appendix presents useful basic information, including common abbreviations, useful measurements and data, characteristics of amino acids and nucleic acids, information on radioactivity and the safe use of radioisotopes and other hazardous chemicals, conversions for centrifuges and rotors, characteristics of common detergents, and common conversion factors.

    View details for DOI 10.1002/0471142727.mba01cs33

    View details for PubMedID 18265025

  • Genome comparisons highlight similarity and diversity within the eukaryotic kingdoms CURRENT OPINION IN CHEMICAL BIOLOGY Ball, C. A., Cherry, J. M. 2001; 5 (1): 86-89

    Abstract

    In 2000, the number of completely sequenced eukaryotic genomes increased to four. The addition of Drosophila and Arabidopsis into this cohort permits additional insights into the processes that have shaped evolution. Analysis and comparisons of both completed genomes and partially sequenced genomes have already shed light on mechanisms such as gene duplication and gene loss that have long been hypothesized to be major forces in speciation. Indeed, duplicate gene pairs in Saccharomyces, Arabidopsis, Caenorhabditis and Drosophila are high: 30%, 60%, 48% and 40%, respectively. Evidence of horizontal gene-transfer, thought to be a major evolutionary force in bacteria, has been found in Arabidopsis. The release of the 'first draft' of the human genome sequence in 2000 heralds a new stage of biological study. Understanding the as-yet-unannotated human genome will be largely based on conclusions, techniques and tools developed during the analysis and comparison of the genome of these four model organisms.

    View details for Web of Science ID 000167051500014

    View details for PubMedID 11166654

  • Saccharomyces Genome Database provides tools to survey gene expression and functional analysis data NUCLEIC ACIDS RESEARCH Ball, C. A., Jin, H., Sherlock, G., Weng, S., Matese, J. C., Andrada, R., Binkley, G., Dolinski, K., Dwight, S. S., Harris, M. A., Issel-Tarver, L., Schroeder, R., Botstein, D., Cherry, J. M. 2001; 29 (1): 80-81

    Abstract

    Since the completion of the Saccharomyces cerevisiae genomic sequence in 1996 [Goffeau,A. et al. (1997) Nature, 387, 5], several creative and ambitious projects have been initiated to explore the functions of gene products or gene expression on a genome-wide scale. To help researchers take advantage of these projects, the Saccharomyces Genome Database (SGD) has created two new tools, Function Junction and Expression Connection. Together, the tools form a central resource for querying multiple large-scale analysis projects for data about individual genes. Function Junction provides information from diverse projects that shed light on the role a gene product plays in the cell, while Expression Connection delivers information produced by the ever-increasing number of microarray projects. WWW access to SGD is available at genome-www.stanford.edu/Saccharomyces/.

    View details for Web of Science ID 000166360300019

    View details for PubMedID 11125055

  • The Stanford Microarray Database NUCLEIC ACIDS RESEARCH Sherlock, G., Hernandez-Boussard, T., Kasarskis, A., Binkley, G., Matese, J. C., Dwight, S. S., Kaloper, M., Weng, S., Jin, H., Ball, C. A., Eisen, M. B., Spellman, P. T., Brown, P. O., Botstein, D., Cherry, J. M. 2001; 29 (1): 152-155

    Abstract

    The Stanford Microarray Database (SMD) stores raw and normalized data from microarray experiments, and provides web interfaces for researchers to retrieve, analyze and visualize their data. The two immediate goals for SMD are to serve as a storage site for microarray data from ongoing research at Stanford University, and to facilitate the public dissemination of that data once published, or released by the researcher. Of paramount importance is the connection of microarray data with the biological data that pertains to the DNA deposited on the microarray (genes, clones etc.). SMD makes use of many public resources to connect expression information to the relevant biology, including SGD [Ball,C.A., Dolinski,K., Dwight,S.S., Harris,M.A., Issel-Tarver,L., Kasarskis,A., Scafe,C.R., Sherlock,G., Binkley,G., Jin,H. et al. (2000) Nucleic Acids Res., 28, 77-80], YPD and WormPD [Costanzo,M.C., Hogan,J.D., Cusick,M.E., Davis,B.P., Fancher,A.M., Hodges,P.E., Kondu,P., Lengieza,C., Lew-Smith,J.E., Lingner,C. et al. (2000) Nucleic Acids Res., 28, 73-76], Unigene [Wheeler,D.L., Chappey,C., Lash,A.E., Leipe,D.D., Madden,T.L., Schuler,G.D., Tatusova,T.A. and Rapp,B.A. (2000) Nucleic Acids Res., 28, 10-14], dbEST [Boguski,M.S., Lowe,T.M. and Tolstoshev,C.M. (1993) Nature Genet., 4, 332-333] and SWISS-PROT [Bairoch,A. and Apweiler,R. (2000) Nucleic Acids Res., 28, 45-48] and can be accessed at http://genome-www.stanford.edu/microarray.

    View details for Web of Science ID 000166360300039

    View details for PubMedID 11125075

  • Gene Ontology: tool for the unification of biology NATURE GENETICS Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., Davis, A. P., Dolinski, K., Dwight, S. S., Eppig, J. T., Harris, M. A., Hill, D. P., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J. C., Richardson, J. E., Ringwald, M., Rubin, G. M., Sherlock, G. 2000; 25 (1): 25-29

    View details for Web of Science ID 000086884000011

    View details for PubMedID 10802651

  • Comparative genomics of the eukaryotes SCIENCE Rubin, G. M., Yandell, M. D., Wortman, J. R., Miklos, G. L., Nelson, C. R., Hariharan, I. K., Fortini, M. E., Li, P. W., Apweiler, R., Fleischmann, W., Cherry, J. M., Henikoff, S., Skupski, M. P., Misra, S., Ashburner, M., Birney, E., Boguski, M. S., Brody, T., Brokstein, P., Celniker, S. E., Chervitz, S. A., Coates, D., Cravchik, A., Gabrielian, A., Galle, R. F., Gelbart, W. M., George, R. A., Goldstein, L. S., Gong, F. C., Guan, P., Harris, N. L., Hay, B. A., Hoskins, R. A., Li, J. Y., Li, Z. Y., HYNES, R. O., Jones, S. J., Kuehl, P. M., Lemaitre, B., Littleton, J. T., Morrison, D. K., Mungall, C., O'Farrell, P. H., Pickeral, O. K., Shue, C., Vosshall, L. B., Zhang, J., Zhao, Q., Zheng, X. Q., Zhong, F., Zhong, W. Y., Gibbs, R., Venter, J. C., Adams, M. D., Lewis, S. 2000; 287 (5461): 2204-2215

    Abstract

    A comparative analysis of the genomes of Drosophila melanogaster, Caenorhabditis elegans, and Saccharomyces cerevisiae, and of the proteins they are predicted to encode, was undertaken in the context of cellular, developmental, and evolutionary processes. The nonredundant protein sets of flies and worms are similar in size and are only twice that of yeast, but different gene families are expanded in each genome, and the multidomain proteins and signaling pathways of the fly and worm are far more complex than those of yeast. The fly has orthologs to 177 of the 289 human disease genes examined and provides the foundation for rapid analysis of some of the basic processes involved in human disease.

    View details for Web of Science ID 000086049100035

    View details for PubMedID 10731134

  • The genome sequence of Drosophila melanogaster SCIENCE Adams, M. D., Celniker, S. E., Holt, R. A., Evans, C. A., Gocayne, J. D., Amanatides, P. G., Scherer, S. E., Li, P. W., Hoskins, R. A., Galle, R. F., George, R. A., Lewis, S. E., Richards, S., Ashburner, M., Henderson, S. N., Sutton, G. G., Wortman, J. R., Yandell, M. D., Zhang, Q., Chen, L. X., Brandon, R. C., Rogers, Y. H., Blazej, R. G., Champe, M., Pfeiffer, B. D., Wan, K. H., Doyle, C., Baxter, E. G., Helt, G., Nelson, C. R., Miklos, G. L., Abril, J. F., Agbayani, A., An, H. J., Andrews-Pfannkoch, C., Baldwin, D., Ballew, R. M., Basu, A., Baxendale, J., Bayraktaroglu, L., Beasley, E. M., Beeson, K. Y., Benos, P. V., Berman, B. P., Bhandari, D., Bolshakov, S., Borkova, D., Botchan, M. R., Bouck, J., Brokstein, P., Brottier, P., Burtis, K. C., Busam, D. A., Butler, H., Cadieu, E., Center, A., Chandra, I., Cherry, J. M., Cawley, S., Dahlke, C., Davenport, L. B., Davies, A., de Pablos, B., Delcher, A., Deng, Z. M., Mays, A. D., Dew, I., Dietz, S. M., Dodson, K., Doup, L. E., Downes, M., Dugan-Rocha, S., Dunkov, B. C., Dunn, P., Durbin, K. J., Evangelista, C. C., Ferraz, C., Ferriera, S., Fleischmann, W., Fosler, C., Gabrielian, A. E., Garg, N. S., Gelbart, W. M., Glasser, K., Glodek, A., Gong, F. C., Gorrell, J. H., Gu, Z. P., Guan, P., Harris, M., Harris, N. L., Harvey, D., Heiman, T. J., Hernandez, J. R., Houck, J., Hostin, D., Houston, D. A., Howland, T. J., Wei, M. H., Ibegwam, C., Jalali, M., Kalush, F., Karpen, G. H., Ke, Z. X., Kennison, J. A., Ketchum, K. A., Kimmel, B. E., Kodira, C. D., Kraft, C., Kravitz, S., Kulp, D., Lai, Z. W., Lasko, P., Lei, Y. D., Levitsky, A. A., Li, J. Y., Li, Z. Y., Liang, Y., Lin, X. Y., Liu, X. J., Mattei, B., McIntosh, T. C., McLeod, M. P., McPherson, D., Merkulov, G., Milshina, N. V., Mobarry, C., Morris, J., Moshrefi, A., Mount, S. M., Moy, M., Murphy, B., Murphy, L., Muzny, D. M., Nelson, D. L., Nelson, D. R., Nelson, K. A., Nixon, K., Nusskern, D. R., Pacleb, J. M., Palazzolo, M., Pittman, G. S., Pan, S., Pollard, J., Puri, V., Reese, M. G., Reinert, K., Remington, K., Saunders, R. D., Scheeler, F., Shen, H., Shue, B. C., Siden-Kiamos, I., Simpson, M., Skupski, M. P., Smith, T., Spier, E., Spradling, A. C., Stapleton, M., Strong, R., Sun, E., Svirskas, R., Tector, C., Turner, R., Venter, E., Wang, A. H., Wang, X., Wang, Z. Y., Wassarman, D. A., Weinstock, G. M., Weissenbach, J., Williams, S. M., Woodage, T., Worley, K. C., Wu, D., Yang, S., Yao, Q. A., Ye, J., Yeh, R. F., Zaveri, J. S., Zhan, M., Zhang, G. G., Zhao, Q., Zheng, L. S., Zheng, X. Q., Zhong, F. N., Zhong, W. Y., Zhou, X. J., Zhu, S. P., Zhu, X. H., Smith, H. O., Gibbs, R. A., Myers, E. W., Rubin, G. M., Venter, J. C. 2000; 287 (5461): 2185-2195

    Abstract

    The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the approximately 120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map. Efforts are under way to close the remaining gaps; however, the sequence is of sufficient accuracy and contiguity to be declared substantially complete and to support an initial analysis of genome structure and preliminary gene annotation and interpretation. The genome encodes approximately 13,600 genes, somewhat fewer than the smaller Caenorhabditis elegans genome, but with comparable functional diversity.

    View details for Web of Science ID 000086049100033

    View details for PubMedID 10731132

  • Integrating functional genomic information into the Saccharomyces genome database NUCLEIC ACIDS RESEARCH Ball, C. A., Dolinski, K., Dwight, S. S., Harris, M. A., Issel-Tarver, L., Kasarskis, A., Scafe, C. R., Sherlock, G., Binkley, G., Jin, H., Kaloper, M., Orr, S. D., Schroeder, M., Weng, S., Zhu, Y., Botstein, D., Cherry, J. M. 2000; 28 (1): 77-80

    Abstract

    The Saccharomyces Genome Database (SGD) stores and organizes information about the nearly 6200 genes in the yeast genome. The information is organized around the 'locus page' and directs users to the detailed information they seek. SGD is endeavoring to integrate the existing information about yeast genes with the large volume of data generated by functional analyses that are beginning to appear in the literature and on web sites. New features will include searches of systematic analyses and Gene Summary Paragraphs that succinctly review the literature for each gene. In addition to current information, such as gene product and phenotype descriptions, the new locus page will also describe a gene product's cellular process, function and localization using a controlled vocabulary developed in collaboration with two other model organism databases. We describe these developments in SGD through the newly reorganized locus page. The SGD is accessible via the WWW at http://genome-www.stanford.edu/Saccharomyces/

    View details for Web of Science ID 000084896300020

    View details for PubMedID 10592186

  • Unified display of Arabidopsis thaliana physical maps from AtDB, the A.thaliana database NUCLEIC ACIDS RESEARCH Rhee, S. Y., Weng, S., Bongard-Pierce, D. K., Garcia-Hernandez, M., Malekian, A., Flanders, D. J., Cherry, J. M. 1999; 27 (1): 79-84

    Abstract

    In the past several years, there has been a tremendous effort to construct physical maps and to sequence the genome of Arabidopsis thaliana. As a result, four of the five chromosomes are completely covered by overlapping clones except at the centromeric and nucleolus organizer regions (NOR). In addition, over 30% of the genome has been sequenced and completion is anticipated by the end of the year 2000. Despite these accomplishments, the physical maps are provided in many formats on laboratories' Web sites. These data are thus difficult to obtain in a coherent manner for researchers. To alleviate this problem, AtDB (Arabidopsis thaliana DataBase, URL: http://genome-www.stanford.edu/Arabidopsis/) has constructed a unified display of the physical maps where all publicly available physical-map data for all chromosomes are presented through the Web in a clickable, 'on-the-fly' graphic, created by CGI programs that directly consult our relational database.

    View details for Web of Science ID 000077983000018

    View details for PubMedID 9847147

  • Using the Saccharomyces Genome Database (SGD) for analysis of protein similarities and structure NUCLEIC ACIDS RESEARCH Chervitz, S. A., Hester, E. T., Ball, C. A., Dolinski, K., Dwight, S. S., Harris, M. A., Juvik, G., Malekian, A., Roberts, S., Roe, T., Scafe, C., Schroeder, M., Sherlock, G., Weng, S., Zhu, Y., Cherry, J. M., Botstein, D. 1999; 27 (1): 74-78

    Abstract

    The Saccharomyces Genome Database (SGD) collects and organizes information about the molecular biology and genetics of the yeast Saccharomyces cerevisiae. The latest protein structure and comparison tools available at SGD are presented here. With the completion of the yeast sequence and the Caenorhabditis elegans sequence soon to follow, comparison of proteins from complete eukaryotic proteomes will be an extremely powerful way to learn more about a particular protein's structure, its function, and its relationships with other proteins. SGD can be accessed through the World Wide Web at http://genome-www.stanford.edu/Saccharomyces/

    View details for Web of Science ID 000077983000017

    View details for PubMedID 9847146

  • Comparison of the complete protein sets of worm and yeast: Orthology and divergence SCIENCE Chervitz, S. A., Aravind, L., Sherlock, G., Ball, C. A., Koonin, E. V., Dwight, S. S., Harris, M. A., Dolinski, K., Mohr, S., Smith, T., Weng, S., Cherry, J. M., Botstein, D. 1998; 282 (5396): 2022-2028

    Abstract

    Comparative analysis of predicted protein sequences encoded by the genomes of Caenorhabditis elegans and Saccharomyces cerevisiae suggests that most of the core biological functions are carried out by orthologous proteins (proteins of different species that can be traced back to a common ancestor) that occur in comparable numbers. The specialized processes of signal transduction and regulatory control that are unique to the multicellular worm appear to use novel proteins, many of which re-use conserved domains. Major expansion of the number of some of these domains seen in the worm may have contributed to the advent of multicellularity. The proteins conserved in yeast and worm are likely to have orthologs throughout eukaryotes; in contrast, the proteins unique to the worm may well define metazoans.

    View details for Web of Science ID 000077467100036

    View details for PubMedID 9851918

  • Expanding yeast knowledge online YEAST Dolinski, K., Ball, C. A., Chervitz, S. A., Dwight, S. S., Harris, M. A., Roberts, S., Roe, T., Cherry, J. M., Botstein, D. 1998; 14 (16): 1453-1469

    Abstract

    The completion of the Saccharomyces cerevisiae genome sequencing project and the continued development of improved technology for large-scale genome analysis have led to tremendous growth in the amount of new yeast genetics and molecular biology data. Efficient organization, presentation, and dissemination of this information are essential if researchers are to exploit this knowledge. In addition, the development of tools that provide efficient analysis of this information and link it with pertinent information from other systems is becoming increasingly important at a time when the complete genome sequences of other organisms are becoming available. The aim of this review is to familiarize biologists with the type of data resources currently available on the World Wide Web (WWW).

    View details for Web of Science ID 000077792400003

    View details for PubMedID 9885151

  • Arabidopsis thaliana: A model plant for genome analysis SCIENCE Meinke, D. W., Cherry, J. M., Dean, C., Rounsley, S. D., Koornneef, M. 1998; 282 (5389): 662-?

    Abstract

    Arabidopsis thaliana is a small plant in the mustard family that has become the model system of choice for research in plant biology. Significant advances in understanding plant growth and development have been made by focusing on the molecular genetics of this simple angiosperm. The 120-megabase genome of Arabidopsis is organized into five chromosomes and contains an estimated 20,000 genes. More than 30 megabases of annotated genomic sequence has already been deposited in GenBank by a consortium of laboratories in Europe, Japan, and the United States. The entire genome is scheduled to be sequenced by the end of the year 2000. Reaching this milestone should enhance the value of Arabidopsis as a model for plant biology and the analysis of complex organisms in general.

    View details for Web of Science ID 000076607500039

    View details for PubMedID 9784120

  • Genome maps 9. Arabidopsis thaliana. Wall chart. Science Rhee, S. Y., Weng, S., Flanders, D., Cherry, J. M., Dean, C., Lister, C., Anderson, M., Koornneef, M., Meinke, D. W., Nickle, T., Smith, K., Rounsley, S. D. 1998; 282 (5389): 663-667

    View details for PubMedID 9841422

  • AtDB, the Arabidopsis thaliana database, and graphical-web-display of progress by the Arabidopsis genome initiative NUCLEIC ACIDS RESEARCH Flanders, D. J., Weng, S. A., Petel, F. X., Cherry, J. M. 1998; 26 (1): 80-84

    Abstract

    AtDB, the Arabidopsis thaliana Database, has a primary role to provide public access to the collected genomic information for A. thaliana via the World Wide Web (URL: http://genome-www.stanford.edu/). AtDB presents interactive physical and genetic maps that are hyperlinked with detailed information about the clones and markers placed on these maps. A large literature collection on Arabidopsis, contact information on researchers worldwide, laboratory method manuals and other information useful to plant molecular biologists are also provided. This paper discusses the database-driven clickable displays that provide easy navigation within a variety of genomic maps, including those summarizing progress of the international Arabidopsis genomic sequencing effort, AGI (the Arabidopsis Genome Initiative). The interface uses client-side hyperlinked GIF-images that direct the user to detailed database-information. A new BLAST service is also described. This gives users access to the thousands of Arabidopsis BAC clone end-sequences and includes hyperlinked images summarizing the search results. The linking of genetic and physically mapped regions and their sequence into information for loci within that region is an ongoing goal for this project.

    View details for Web of Science ID 000071778900017

    View details for PubMedID 9399805

  • SGD: Saccharomyces Genome Database NUCLEIC ACIDS RESEARCH Cherry, J. M., Adler, C., Ball, C., Chervitz, S. A., Dwight, S. S., Hester, E. T., Jia, Y. K., Juvik, G., Roe, T., Schroeder, M., Weng, S. A., Botstein, D. 1998; 26 (1): 73-79

    Abstract

    The Saccharomyces Genome Database (SGD) provides Internet access to the complete Saccharomyces cerevisiae genomic sequence, its genes and their products, the phenotypes of its mutants, and the literature supporting these data. The amount of information and the number of features provided by SGD have increased greatly following the release of the S.cerevisiae genomic sequence, which is currently the only complete sequence of a eukaryotic genome. SGD aids researchers by providing not only basic information, but also tools such as sequence similarity searching that lead to detailed information about features of the genome and relationships between genes. SGD presents information using a variety of user-friendly, dynamically created graphical displays illustrating physical, genetic and sequence feature maps. SGD can be accessed via the World Wide Web at http://genome-www.stanford.edu/Saccharomyces/

    View details for Web of Science ID 000071778900016

    View details for PubMedID 9399804

  • Genetics - Yeast as a model organism SCIENCE Botstein, D., Chervitz, S. A., Cherry, J. M. 1997; 277 (5330): 1259-1260

    View details for Web of Science ID A1997XT82700041

    View details for PubMedID 9297238

  • The nucleotide sequence of Saccharomyces cerevisiae chromosome XVI NATURE Bussey, H., Storms, R. K., Ahmed, A., Albermann, K., Allen, E., Ansorge, W., Araujo, R., Aparicio, A., Barrell, B., Badcock, K., Benes, V., Botstein, D., Bowman, S., Bruckner, M., Carpenter, J., Cherry, J. M., Chung, E., Churcher, C., COSTER, F., Davis, K., Davis, R. W., Dietrich, F. S., DELIUS, H., DiPaolo, T., Dubois, E., Dusterhoft, A., Duncan, M., Floeth, M., Fortin, N., Friesen, J. D., Fritz, C., Goffeau, A., Hall, J., Hebling, U., Heumann, K., Hilbert, H., Hillier, L., HunickeSmith, S., HYMAN, R., Johnston, M., Kalman, S., Kleine, K., Komp, C., Kurdi, O., Lashkari, D., Lew, H., Lin, A., LIN, D., Louis, E. J., Marathe, R., Messenguy, F., Mewes, H. W., Mirtipati, S., Moestl, D., MullerAuer, S., Namath, A., Nentwich, U., Oefner, P., Pearson, D., Petel, F. X., Pohl, T. M., Purnelle, B., Rajandream, M. A., Rechmann, S., Rieger, M., Riles, L., Roberts, D., Schafer, M., Scharfe, M., Scherens, B., Schramm, S., Schroder, M., Sdicu, A. M., Tettelin, H., Urrestarazu, L. A., Ushinsky, S., Vierendeels, F., Vissers, S., Voss, H., Walsh, S. V., Wambutt, R., Wang, Y., Wedler, E., Wedler, H., WINNETT, E., Zhong, W. W., Zollner, A., VO, D. H., Hani, J. 1997; 387 (6632): 103-105

    Abstract

    The nucleotide sequence of the 948,061 base pairs of chromosome XVI has been determined, completing the sequence of the yeast genome. Chromosome XVI was the last yeast chromosome identified, and some of the genes mapped early to it, such as GAL4, PEP4 and RAD1 (ref. 2) have played important roles in the development of yeast biology. The architecture of this final chromosome seems to be typical of the large yeast chromosomes, and shows large duplications with other yeast chromosomes. Chromosome XVI contains 487 potential protein-encoding genes, 17 tRNA genes and two small nuclear RNA genes; 27% of the genes have significant similarities to human gene products, and 48% are new and of unknown biological function. Systematic efforts to explore gene function have begun.

    View details for Web of Science ID A1997XB54600015

    View details for PubMedID 9169875

  • The nucleotide sequence of Saccharomyces cerevisiae chromosome IV NATURE Jacq, C., ALTMORBE, J., Andre, B., Arnold, W., Bahr, A., Ballesta, J. P., Bargues, M., Baron, L., Becker, A., Biteau, N., Blocker, H., Blugeon, C., Boskovic, J., Brandt, P., Bruckner, M., Buitrago, M. J., COSTER, F., Delaveau, T., DELREY, F., Dujon, B., Eide, L. G., GarciaCantalejo, J. M., Goffeau, A., GomezPeris, A., Granotier, C., Hanemann, V., Hankeln, T., Hoheisel, J. D., Jager, W., Jimenez, A., Jonniaux, J. L., KRAMER, C., Kuster, H., LAAMANEN, P., Legros, Y., Louis, E., MollerRieker, S., Monnet, A., Moro, M., MullerAuer, S., Nussbaumer, B., Paricio, N., Paulin, L., Perea, J., PEREZALONSO, M., PEREZORTIN, J. E., Pohl, T. M., Prydz, H., Purnelle, B., Rasmussen, S. W., Remacha, M., Revuelta, J. L., Rieger, M., Salom, D., Saluz, H. P., Saiz, J. E., Saren, A. M., Schafer, M., Scharfe, M., Schmidt, E. R., Schneider, C., Scholler, P., Schwarz, S., SolerMira, A., Urrestarazu, L. A., Verhasselt, P., Vissers, S., Voet, M., Volckaert, G., Wagner, G., Wambutt, R., Wedler, E., Wedler, H., Wolfl, S., Harris, D. E., Bowman, S., Brown, D., Churcher, C. M., Connor, R., Dedman, K., Gentles, S., Hamlin, N., Hunt, S., Jones, L., McDonald, S., Murphy, L., Niblett, D., Odell, C., Oliver, K., Rajandream, M. A., Richards, C., Shore, L., Walsh, S. V., Barrell, B. G., Dietrich, F. S., Mulligan, J., Allen, E., Araujo, R., Aviles, E., Berno, O., Carpenter, J., Chen, E., Cherry, J. M., Chung, E., Duncan, M., HunickeSmith, S., HYMAN, R., Komp, C., Lashkari, D., Lew, H., LIN, D., MOSEDALE, D., Nakahara, K., Namath, A., Oefner, P., Oh, C., Petel, F. X., Roberts, D., Schramm, S., Schroeder, M., Shogren, T., Shroff, N., Winant, A., Yelton, M., Botstein, D., Davis, R. W., Johnston, M., Hillier, L., Riles, L., Albermann, K., Hani, J., Heumann, K., Kleine, K., Mewes, H. W., Zollner, A., Zaccaria, P. 1997; 387 (6632): 75-78

    Abstract

    The complete DNA sequence of the yeast Saccharomyces cerevisiae chromosome IV has been determined. Apart from chromosome XII, which contains the 1-2 Mb rDNA cluster, chromosome IV is the longest S. cerevisiae chromosome. It was split into three parts, which were sequenced by a consortium from the European Community, the Sanger Centre, and groups from St Louis and Stanford in the United States. The sequence of 1,531,974 base pairs contains 796 predicted or known genes, 318 (39.9%) of which have been previously identified. Of the 478 new genes, 225 (28.3%) are homologous to previously identified genes and 253 (32%) have unknown functions or correspond to spurious open reading frames (ORFs). On average there is one gene approximately every two kilobases. Superimposed on alternating regional variations in G+C composition, there is a large central domain with a lower G+C content that contains all the yeast transposon (Ty) elements and most of the tRNA genes. Chromosome IV shares with chromosomes II, V, XII, XIII and XV some long clustered duplications which partly explain its origin.

    View details for Web of Science ID A1997XB54600007

    View details for PubMedID 9169867

  • The nucleotide sequence of Saccharomyces cerevisiae chromosome V NATURE Dietrich, F. S., Mulligan, J., Hennessy, K., Yelton, M. A., Allen, E., Araujo, R., Aviles, E., Berno, A., Brennan, T., Carpenter, J., Chen, E., Cherry, J. M., Chung, E., Duncan, M., Guzman, E., Hartzell, G., HunickeSmith, S., Hyman, R. W., Kayser, A., Komp, C., Lashkari, D., Lew, H., LIN, D., MOSEDALE, D., Nakahara, K., Namath, A., Norgren, R., Oefner, P., Oh, C., Petel, F. X., Roberts, D., Sehl, P., Schramm, S., Shogren, T., Smith, V., Taylor, P., Wei, Y., Botstein, D., Davis, R. W. 1997; 387 (6632): 78-81

    Abstract

    Here we report the sequence of 569,202 base pairs of Saccharomyces cerevisiae chromosome V. Analysis of the sequence revealed a centromere, two telomeres and 271 open reading frames (ORFs) plus 13 tRNAs and four small nuclear RNAs. There are two Ty1 transposable elements, each of which contains an ORF (included in the count of 271). Of the ORFs, 78 (29%) are new, 81 (30%) have potential homologues in the public databases, and 112 (41%) are previously characterized yeast genes.

    View details for Web of Science ID A1997XB54600008

    View details for PubMedID 9169868

  • Genetic and physical maps of Saccharomyces cerevisiae NATURE Cherry, J. M., Ball, C., Weng, S., Juvik, G., Schmidt, R., Adler, C., Dunn, B., Dwight, S., Riles, L., Mortimer, R. K., Botstein, D. 1997; 387 (6632): 67-73

    Abstract

    Genetic and physical maps for the 16 chromosomes of Saccharomyces cerevisiae are presented. The genetic map is the result of 40 years of genetic analysis. The physical map was produced from the results of an international systematic sequencing effort. The data for the maps are accessible electronically from the Saccharomyces Genome Database (SGD: http://genome-www.stanford.edu/Saccharomyces/).

    View details for Web of Science ID A1997XB54600006

    View details for PubMedID 9169866

  • Molecular linguistics: Extracting information from gene and protein sequences PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Botstein, D., Cherry, J. M. 1997; 94 (11): 5506-5507

    View details for Web of Science ID A1997XB71100005

    View details for PubMedID 9159100

  • Genetic nomenclature guide. Saccharomyces cerevisiae. Trends in genetics : TIG Cherry, J. M. 1995: 11-12

    View details for PubMedID 7660459

  • An integrated genetic RFLP map of the Arabidopsis thaliana genome PLANT JOURNAL Hauge, B. M., Hanley, S. M., Cartinhour, S., Cherry, J. M., Goodman, H. M., Koornneef, M., Stam, P., Chang, C., Kempin, S., Medrano, L., Meyerowitz, E. M. 1993; 3 (5): 745-754

  • Detection of herpes simplex virus thymidine kinase and latency-associated transcript gene sequences in human herpetic corneas by polymerase chain reaction amplification INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE Rong, B. L., Pavan-Langston, D., Weng, Q. P., Martinez, R., Cherry, J. M., Dunkel, E. C. 1991; 32 (6): 1808-1815

    Abstract

    Herpes simplex virus (HSV) latency in sensory ganglion neurons is well documented, but the existence of extraneuronal corneal latency is less well defined. To investigate the possibility of extraneuronal latency during ocular HSV infection, corneal specimens from 18 patients with quiescent herpes simplex keratitis (HSK) were obtained at the time of keratoplasty. Polymerase chain reaction (PCR) amplification followed by Southern blot hybridization with a radiolabeled oligonucleotide probe was done to detect the presence of HSV-1 genome in these human corneal samples. Two pairs of oligonucleotides from the region of the HSV thymidine kinase (TK) gene and the latency-associated transcript (LAT) gene were used as primers in the PCR amplification. The DNA sequences from either the TK or the LAT gene were identified in 15 of 18 HSK corneas (83%). These results demonstrate that the HSV genome was retained, at least in part, in human corneas during quiescent HSV infection, giving further support to the concept of corneal extraneuronal latency.

    View details for Web of Science ID A1991FM17900014

    View details for PubMedID 1851732

  • Codon usage table for Xenopus laevis METHODS IN CELL BIOLOGY Cherry, J. M. 1991; 36: 675-677

    View details for Web of Science ID A1991MC41400038

    View details for PubMedID 1811159

  • Saccharomyces cerevisiae homoserine kinase is homologous to prokaryotic homoserine kinases GENE Schultes, N. P., Ellington, A. D., Cherry, J. M., Szostak, J. W. 1990; 96 (2): 177-180

    Abstract

    The Saccharomyces cerevisiae gene (THR1) encoding homoserine kinase (HK; EC 2.7.1.39) was cloned by complementation in yeast. Disruption of the THR1 gene results in threonine auxotrophy in yeast. Comparison of the amino acid sequences of yeast and bacterial HKs reveals substantial similarity.

    View details for Web of Science ID A1990EM78200004

    View details for PubMedID 2176637

  • MUTATIONAL ANALYSIS OF CONSERVED NUCLEOTIDES IN A SELF-SPLICING GROUP-I INTRON JOURNAL OF MOLECULAR BIOLOGY Couture, S., Ellington, A. D., Gerber, A. S., Cherry, J. M., Doudna, J. A., Green, R., Hanna, M., Pace, U., Rajagopal, J., Szostak, J. W. 1990; 215 (3): 345-358

    Abstract

    We have constructed all single base substitutions in almost all of the highly conserved residues of the Tetrahymena self-splicing intron. Mutation of highly conserved residues almost invariably leads to loss of enzymatic activity. In many cases, activity could be regained by making additional mutations that restored predicted base-pairings; these second site suppressors in general confirm the secondary structure derived from phylogenetic data. At several positions, our suppression data can be most readily explained by assuming non-Watson-Crick base-pairings. In addition to the requirements imposed by the secondary structure, the sequence of the intron is constrained by "negative interactions", the exclusion of particular nucleotide sequences that would form undesirable secondary structures. A comparison of genetic and phylogenetic data suggests sites that may be involved in tertiary structural interactions.

    View details for Web of Science ID A1990ED16700004

    View details for PubMedID 1700131

  • GENETIC DISSECTION OF AN RNA ENZYME COLD SPRING HARBOR SYMPOSIA ON QUANTITATIVE BIOLOGY Doudna, J. A., Gerber, A. S., Cherry, J. M., Szostak, J. W. 1987; 52: 173-180

    View details for Web of Science ID A1987P094200021

    View details for PubMedID 2456876

  • THE INTERNALLY LOCATED TELOMERIC SEQUENCES IN THE GERM-LINE CHROMOSOMES OF TETRAHYMENA ARE AT THE ENDS OF TRANSPOSON-LIKE ELEMENTS CELL Cherry, J. M., Blackburn, E. H. 1985; 43 (3): 747-758

    Abstract

    The germ-line micronuclear genome of the ciliate Tetrahymena thermophila contains approximately 10^2 chromosome-internal blocks of tandemly repeated C4A2 sequences (micC4A2). This repeated sequence is the telomeric sequence in the somatic macronucleus. Each of six cloned micC4A2 was found to be adjacent to a conserved 30 bp sequence, which we propose is the terminal inverted repeat of a family of DNA elements (the Tel-1 family). This 30 bp sequence contains a site for the infrequently cutting restriction enzyme Bst XI, which allows full-length Tel-1 elements to be cut out of the micronuclear genome. BAL 31 exonuclease digestion of Bst XI-cut micronuclear DNA showed the majority of micC4A2 blocks to be associated with the ends of the Tel-1 family. We propose that Tel-1 elements are transposable and suggest a novel mechanism to account for the origin of micC4A2, in which telomeric repeats are added to the ends of free linear forms of the transposable elements prior to reintegration.

    View details for Web of Science ID A1985AWV6100022

    View details for PubMedID 3000613

  • DNA termini in ciliate macronuclei. Cold Spring Harbor symposia on quantitative biology Blackburn, E. H., Budarf, M. L., Challoner, P. B., Cherry, J. M., Howard, E. A., Katzen, A. L., Pan, W. C., Ryan, T. 1983; 47: 1195-1207

    View details for PubMedID 6407801

  • EVIDENCE FOR A PLASMA-MEMBRANE REDOX SYSTEM ON INTACT ASCITES TUMOR-CELLS WITH DIFFERENT METASTATIC CAPACITY BIOCHIMICA ET BIOPHYSICA ACTA Cherry, J. M., MacKellar, W., Morre, D. J., Crane, F. L., Jacobsen, L. B., Schirrmacher, V. 1981; 634 (1): 11-18

    Abstract

    An NADH-ferricyanide reductase of the external surface of intact mouse ascites tumor cells grown in culture was shown. The oxidation/reduction reaction was due to enzymatic rather than inorganic iron catalysis as demonstrated by the kinetics and specificity of the reaction. Activities of three markers for cytoplasmic contents were lacking with the intact tumor cells. The dehydrogenase activity was inhibited by p-chloromercuribenzoate, bathophenanthroline sulfonate, and the anticancer drug adriamycin. Sodium azide and potassium cyanide inhibited partially. The response to inhibitors resembled that of isolated plasma membranes rather than that of mitochondria. Concurrent with these findings, neither superoxide dismutase nor rotenone affected the redox activity. The findings provide evidence for the operation of a plasma membrane redox system at the surface of intact, living cells.

    View details for Web of Science ID A1981KZ18600002

    View details for PubMedID 7470494

  • ABSENCE OF GANGLIOSIDES IN A HIGHER PLANT EXPERIENTIA Cherry, J. M., Buckhout, T. J., Morre, D. J. 1978; 34 (11): 1433-1434

Conference Proceedings


  • The Saccharomyces Genome Database provides comprehensive information about the biology of S. cerevisiae and tools for studies in comparative genomics Hirschman, J. E., Engel, S., Hong, E., Balakrishnan, R., Christie, K., Costanzo, M., Dwight, S., Fisk, D., Nash, R., Park, J., Skrzypek, M., Dolinski, K., Livstone, M., Oughtred, R., Andrada, R., Binkley, G., Dong, Q., Hitz, B., Miyasato, S., Schroeder, M., Weng, S., Wong, E., Botstein, D., Cherry, J. M. FEDERATION AMER SOC EXP BIOL. 2007: A264-A264
  • Tetrahymena genome database (TGD): a resource for comparative studies with a model protist. Stover, N. A., Krieger, C. J., Binkley, G., Dong, Q., Sethuraman, A., Weng, S., Cherry, J. M. WILEY-BLACKWELL PUBLISHING, INC. 2007: 54S-54S
  • Defining Saccharomyces genes. Cherry, J. M., Theesfeld, C., Sethuraman, A., Fisk, D. G., Dolinski, K., Balakrishnan, R., Binkley, G., Christie, K. R., Costanzo, M., Dong, S., Dwight, S. S., Engel, S., Hirschman, J., Hong, E. L., Issel-Tarver, L., Weng, S., Botstein, D. WILEY-BLACKWELL. 2003: S280-S280
  • The Community Annotation system at the Saccharomyces genome database (SGD). Theesfeld, C. L., Dong, S., Fisk, D. G., Balakrishnan, R., Christie, K. R., Costanzo, M. C., Dolinski, K., Dwight, S. S., Engel, S. R., Hirschman, J. E., Hong, E. L., Issel-Tarver, L., Sethuraman, A., Binkley, G., Weng, S., Botstein, D., Cherry, J. M. WILEY-BLACKWELL. 2003: S345-S345
  • Information resources at SGD: Gene Ontology, Gene Summary Paragraphs, and the Literature Guide. Fisk, D., Christie, K., Dolinski, K., Dwight, S., Issel-Tarver, L., Sethuraman, A., Cherry, J. M., Botstein, D. WILEY-BLACKWELL. 2001: S331-S331
  • Gene Ontology: a controlled vocabulary to describe the function, biological process and cellular location of gene products in genome databases. Shaw, D. R., Ashburner, M., Blake, J. A., Baldarelli, R. M., Botstein, D., Davis, A. P., Cherry, J. M., Lewis, S., Lutz, C. M., Richardson, J. E., Eppig, J. T. CELL PRESS. 1999: A419-A419
  • Arabidopsis genomic information from AtDB. Cherry, J. M., Flanders, D. J., Petel, F. X., Weng, S. AMER SOC PLANT BIOLOGISTS. 1997: 11003-11003

Stanford Medicine Resources: