Current Role at Stanford
Director of the PharmGKB
Member of the BioMedical Informatics Executive Committee
B.A., UC Santa Cruz, Chemistry/Biology (1980)
Ph.D., UC San Francisco, Medical Information Sciences (1987)
Tricyclic antidepressant (TCA) clinical pharmacogenetic implementation guidelines for CYP2D6 and CYP2C19 genotypes highlight the importance of both genes. However, studies of the combined impact of the two genes are sparse, limiting the ability to make strong recommendations based on both genes. The warfarin pharmacogenetics literature highlights the strength of a multigenic approach for discovery and clinical implementation. For optimal impact and interpretation, investigators are encouraged to conduct studies in the context of previously well-defined pharmacogenetics markers.
View details for DOI 10.1038/clpt.2013.7
View details for Web of Science ID 000317834800013
View details for PubMedID 23598455
The Pharmacogenomics Knowledgebase (PharmGKB) is a resource that collects, curates, and disseminates information about the impact of human genetic variation on drug responses. It provides clinically relevant information, including dosing guidelines, annotated drug labels, and potentially actionable gene-drug associations and genotype-phenotype relationships. Curators assign levels of evidence to variant-drug associations using well-defined criteria based on careful literature review. Thus, PharmGKB is a useful source of high-quality information supporting personalized medicine-implementation projects.
View details for DOI 10.1038/clpt.2012.96
View details for Web of Science ID 000309017000009
View details for PubMedID 22992668
The CYP2C19*2 loss-of-function allele is associated with reduced generation of active metabolites of clopidogrel. However, meta-analyses have supported or discounted the impact of genotype on adverse cardiovascular outcomes during clopidogrel therapy, depending on studies included in the analysis. Here we review these data and conclude that evidence supports a differential effect of genotype on protection from major adverse cardiovascular outcomes following percutaneous coronary intervention (PCI), but not for other clopidogrel indications.
View details for DOI 10.1038/clpt.2012.21
View details for Web of Science ID 000303047400009
View details for PubMedID 22513313
Codeine is bioactivated to morphine, a strong opioid agonist, by the hepatic cytochrome P450 2D6 (CYP2D6); hence, the efficacy and safety of codeine as an analgesic are governed by CYP2D6 polymorphisms. Codeine has little therapeutic effect in patients who are CYP2D6 poor metabolizers, whereas the risk of morphine toxicity is higher in ultrarapid metabolizers. The purpose of this guideline (periodically updated at http://www.pharmgkb.org) is to provide information relating to the interpretation of CYP2D6 genotype test results to guide the dosing of codeine.
View details for DOI 10.1038/clpt.2011.287
View details for Web of Science ID 000299654000034
View details for PubMedID 22205192
The cost of genomic information has fallen steeply, but the clinical translation of genetic risk estimates remains unclear. We aimed to undertake an integrated analysis of a complete human genome in a clinical context. We assessed a patient with a family history of vascular disease and early sudden death. Clinical assessment included analysis of this patient's full genome sequence, risk prediction for coronary artery disease, screening for causes of sudden cardiac death, and genetic counselling. Genetic analysis included the development of novel methods for the integration of whole genome and clinical risk. Disease and risk analysis focused on prediction of genetic risk of variants associated with mendelian disease, recognised drug responses, and pathogenicity for novel variants. We queried disease-specific mutation databases and pharmacogenomics databases to identify genes and mutations with known associations with disease and drug response. We estimated post-test probabilities of disease by applying likelihood ratios derived from integration of multiple common variants to age-appropriate and sex-appropriate pre-test probabilities. We also accounted for gene-environment interactions and conditionally dependent risks. Analysis of 2.6 million single nucleotide polymorphisms and 752 copy number variations showed increased genetic risk for myocardial infarction, type 2 diabetes, and some cancers. We discovered rare variants in three genes that are clinically associated with sudden cardiac death: TMEM43, DSP, and MYBPC3. A variant in LPA was consistent with a family history of coronary artery disease. The patient had a heterozygous null mutation in CYP2C19 suggesting probable clopidogrel resistance, several variants associated with a positive response to lipid-lowering therapy, and variants in CYP4F2 and VKORC1 that suggest he might have a low initial dosing requirement for warfarin. Many variants of uncertain importance were reported. Although challenges remain, our results suggest that whole-genome sequencing can yield useful and clinically relevant information for individual patients. Funding: National Institute of General Medical Sciences; National Heart, Lung, and Blood Institute; National Human Genome Research Institute; Howard Hughes Medical Institute; National Library of Medicine; Lucile Packard Foundation for Children's Health; Hewlett Packard Foundation; Breetwor Family Foundation.
View details for Web of Science ID 000277655100025
View details for PubMedID 20435227
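The post-test probability step described in the abstract above can be illustrated with a minimal sketch. It assumes the likelihood ratios are independent, so they can simply be multiplied; the study itself also accounted for conditionally dependent risks, which this sketch does not attempt. The function name and the example numbers are hypothetical and are not taken from the paper.

```python
from math import prod


def post_test_probability(pre_test_probability, likelihood_ratios):
    """Update a pre-test probability with likelihood ratios via odds.

    Assumes the likelihood ratios are independent of one another
    (a simplification made for this sketch only).
    """
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre_test_probability must be strictly between 0 and 1")
    # Convert probability to odds, apply every likelihood ratio, convert back.
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * prod(likelihood_ratios)
    return post_test_odds / (1.0 + post_test_odds)


# Illustrative inputs only: a 10% pre-test probability and three
# made-up per-variant likelihood ratios.
example = post_test_probability(0.10, [1.2, 0.9, 1.5])
print(f"post-test probability: {example:.3f}")
```

Working in odds keeps the update multiplicative, so a likelihood ratio above 1 raises the estimate and one below 1 lowers it.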
BACKGROUND: VKORC1 and CYP2C9 are important contributors to warfarin dose variability, but explain less variability for individuals of African descent than for those of European or Asian descent. We aimed to identify additional variants contributing to warfarin dose requirements in African Americans. METHODS: We did a genome-wide association study of discovery and replication cohorts. Samples from African-American adults (aged ≥18 years) who were taking a stable maintenance dose of warfarin were obtained at International Warfarin Pharmacogenetics Consortium (IWPC) sites and the University of Alabama at Birmingham (Birmingham, AL, USA). Patients enrolled at IWPC sites but who were not used for discovery made up the independent replication cohort. All participants were genotyped. We did a stepwise conditional analysis, conditioning first for VKORC1 -1639G→A, followed by the composite genotype of CYP2C9*2 and CYP2C9*3. We prespecified a genome-wide significance threshold of p<5×10⁻⁸ in the discovery cohort and p<0·0038 in the replication cohort. FINDINGS: The discovery cohort contained 533 participants and the replication cohort 432 participants. After the prespecified conditioning in the discovery cohort, we identified an association between a novel single nucleotide polymorphism in the CYP2C cluster on chromosome 10 (rs12777823) and warfarin dose requirement that reached genome-wide significance (p=1·51×10⁻⁸). This association was confirmed in the replication cohort (p=5·04×10⁻⁵); analysis of the two cohorts together produced a p value of 4·5×10⁻¹². Individuals heterozygous for the rs12777823 A allele need a dose reduction of 6·92 mg/week and those homozygous 9·34 mg/week. Regression analysis showed that the inclusion of rs12777823 significantly improves warfarin dose variability explained by the IWPC dosing algorithm (21% relative improvement).
INTERPRETATION: A novel CYP2C single nucleotide polymorphism exerts a clinically relevant effect on warfarin dose in African Americans, independent of CYP2C9*2 and CYP2C9*3. Incorporation of this variant into pharmacogenetic dosing algorithms could improve warfarin dose prediction in this population. FUNDING: National Institutes of Health, American Heart Association, Howard Hughes Medical Institute, Wisconsin Network for Health Research, and the Wellcome Trust.
View details for DOI 10.1016/S0140-6736(13)60681-9
View details for PubMedID 23755828
Many genome-wide association studies focus on associating single loci with target phenotypes. However, in the setting of rare variation, accumulating sufficient samples to assess these associations can be difficult. Moreover, multiple variations in a gene or a set of genes within a pathway may all contribute to the phenotype, suggesting that the aggregation of variations found over the gene or pathway may be useful for improving the power to detect associations. Here, we present a method for aggregating single nucleotide polymorphisms (SNPs) along biologically relevant pathways in order to seek genetic associations with phenotypes. Our method uses all available genetic variants and does not remove those in linkage disequilibrium (LD). Instead, it uses a novel SNP weighting scheme to down-weight the contributions of correlated SNPs. We apply our method to three cohorts of patients taking warfarin: two European descent cohorts and an African American cohort. Although the clinical covariates and key pharmacogenetic loci for warfarin have been characterized, our association metric identifies a significant association with mutations distributed throughout the pathway of warfarin metabolism. We improve dose prediction after using all known clinical covariates and pharmacogenetic variants in VKORC1 and CYP2C9. In particular, we find that at least 1% of the missing heritability in warfarin dose may be due to the aggregated effects of variations in the warfarin metabolic pathway, even though the SNPs do not individually show a significant association. Our method allows researchers to study aggregative SNP effects in an unbiased manner by not preselecting SNPs. It retains all the available information by accounting for LD-structure through weighting, which eliminates the need for LD pruning.
View details for DOI 10.1186/1471-2164-14-S3-S11
View details for Web of Science ID 000319869500011
View details for PubMedID 23819817
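The abstract above does not spell out its SNP weighting scheme, so the following is only a hypothetical stand-in that shows the general idea of down-weighting correlated SNPs instead of pruning them. Each SNP is weighted by the inverse of its summed squared correlation with every SNP in the set (itself included), and the weighted genotypes are summed into one pathway-level score per individual. The function names, the weighting formula, and the toy genotypes are all assumptions made for illustration, not the authors' method.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length genotype vectors (0.0 if constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def ld_weights(genotypes_by_snp):
    """Hypothetical weighting: weight_i = 1 / (1 + sum over j != i of r_ij^2)."""
    weights = []
    for i, snp_i in enumerate(genotypes_by_snp):
        ld_score = 1.0  # each SNP's correlation with itself
        for j, snp_j in enumerate(genotypes_by_snp):
            if i != j:
                ld_score += pearson_r(snp_i, snp_j) ** 2
        weights.append(1.0 / ld_score)
    return weights


def pathway_scores(genotypes_by_snp):
    """Weighted sum of allele counts across all SNPs, one score per individual."""
    weights = ld_weights(genotypes_by_snp)
    n_individuals = len(genotypes_by_snp[0])
    return [
        sum(w * snp[k] for w, snp in zip(weights, genotypes_by_snp))
        for k in range(n_individuals)
    ]


# Toy data: rows are SNPs, columns are individuals (allele counts 0/1/2).
# The first two SNPs are perfectly correlated, so each receives a smaller weight.
snps = [[0, 1, 2, 1], [0, 1, 2, 1], [2, 0, 1, 0]]
print(ld_weights(snps))
print(pathway_scores(snps))
```

With this kind of weighting, a block of SNPs in strong LD contributes roughly as much as a single independent SNP, which is the effect that pruning would otherwise be used to achieve.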
The drug-metabolizing enzyme thiopurine methyltransferase (TPMT) has become one of the best examples of pharmacogenomics to be translated into routine clinical practice. TPMT metabolizes the thiopurines 6-mercaptopurine, 6-thioguanine, and azathioprine, drugs that are widely used for treatment of acute leukemias, inflammatory bowel diseases, and other disorders of immune regulation. Since the discovery of genetic polymorphisms in the TPMT gene, many sequence variants that cause a decreased enzyme activity have been identified and characterized. Increasingly, to optimize dose, pretreatment determination of TPMT status before commencing thiopurine therapy is now routine in many countries. Novel TPMT sequence variants are currently numbered sequentially using PubMed as a source of information; however, this has caused some problems as exemplified by two instances in which authors' articles appeared on PubMed at the same time, resulting in the same allele numbers given to different polymorphisms. Hence, there is an urgent need to establish an order and consensus to the numbering of known and novel TPMT sequence variants. To address this problem, a TPMT nomenclature committee was formed in 2010, to define the nomenclature and numbering of novel variants for the TPMT gene. A website (http://www.imh.liu.se/tpmtalleles) serves as a platform for this work. Researchers are encouraged to submit novel TPMT alleles to the committee for designation and reservation of unique allele numbers. The committee has decided to renumber two alleles: nucleotide position 106 (G>A) from TPMT*24 to TPMT*30 and position 611 (T>C, rs79901429) from TPMT*28 to TPMT*31. Nomenclature for all other known alleles remains unchanged.
View details for DOI 10.1097/FPC.0b013e32835f1cc0
View details for Web of Science ID 000316109700009
View details for PubMedID 23407052
Allopurinol is the most commonly used drug for the treatment of hyperuricemia and gout. However, allopurinol is also one of the most common causes of severe cutaneous adverse reactions (SCARs), which include drug hypersensitivity syndrome, Stevens–Johnson syndrome, and toxic epidermal necrolysis. A variant allele of the human leukocyte antigen (HLA)-B, HLA-B*58:01, associates strongly with allopurinol-induced SCAR. We have summarized the evidence from the published literature and developed peer-reviewed guidelines for allopurinol use based on HLA-B genotype.
View details for DOI 10.1038/clpt.2012.209
View details for Web of Science ID 000314139100016
View details for PubMedID 23232549
Since its introduction in the 1950s, warfarin has become a commonly used oral anticoagulant for the prevention of thromboembolism in patients with deep vein thrombosis, atrial fibrillation or prosthetic heart valve replacement. Warfarin is highly efficacious; however, achieving the desired anticoagulation is difficult because of its narrow therapeutic window and highly variable dose response among individuals. Bleeding is often associated with overdose of warfarin. There is overwhelming evidence that an individual's warfarin maintenance dose is associated with clinical factors and genetic variations, most notably polymorphisms in cytochrome P450 2C9 and vitamin K epoxide reductase subunit 1. Numerous dose-prediction algorithms incorporating both genetic and clinical factors have been developed and tested clinically. However, results from major clinical trials are not available yet. This review aims to provide an overview of the field of warfarin, which includes information about the drug, genetics of warfarin dose requirements, dosing algorithms developed and the challenges for the clinical implementation of warfarin pharmacogenetics. Journal of Human Genetics advance online publication, 9 May 2013; doi:10.1038/jhg.2013.40.
View details for PubMedID 23657428
View details for PubMedID 23808970
View details for PubMedID 23708745
The Pharmacogenomics Knowledge Base, PharmGKB, is an interactive tool for researchers investigating how genetic variation affects drug response. The PharmGKB Web site, http://www.pharmgkb.org, displays genotype, molecular, and clinical knowledge integrated into pathway representations and Very Important Pharmacogene (VIP) summaries with links to additional external resources. Users can search and browse the knowledgebase by genes, variants, drugs, diseases, and pathways. Registration is free to the entire research community, but subject to agreement to use for research purposes only and not to redistribute. Registered users can access and download data to aid in the design of future pharmacogenetics and pharmacogenomics studies.
View details for DOI 10.1007/978-1-62703-435-7_20
View details for PubMedID 23824865
View details for PubMedID 23588301
Although there is increasing evidence to support the implementation of pharmacogenetics in certain clinical scenarios, the adoption of this approach has been limited. The advent of preemptive and inexpensive testing of critical pharmacogenetic variants may overcome barriers to adoption. We describe the design of a customized array built for the personalized-medicine programs of the University of Florida and Stanford University. We selected key variants for the array using the clinical annotations of the Pharmacogenomics Knowledgebase (PharmGKB), and we included variants in drug metabolism and transporter genes along with other pharmacogenetically important variants.
View details for DOI 10.1038/clpt.2012.125
View details for Web of Science ID 000309017000017
View details for PubMedID 22910441
Cholesterol reduction from statin therapy has been one of the greatest public health successes in modern medicine. Simvastatin is among the most commonly used prescription medications. A non-synonymous coding single-nucleotide polymorphism (SNP), rs4149056, in SLCO1B1 markedly increases systemic exposure to simvastatin and the risk of muscle toxicity. This guideline explores the relationship between rs4149056 (c.521T>C, p.V174A) and clinical outcome for all statins. The strength of the evidence is high for myopathy with simvastatin. We limit our recommendations accordingly.
View details for DOI 10.1038/clpt.2012.57
View details for Web of Science ID 000305589800019
View details for PubMedID 22617227
The need for efficient text-mining tools that support curation of the biomedical literature is ever increasing. In this article, we describe an experiment aimed at verifying whether a text-mining tool capable of extracting meaningful relationships among domain entities can be successfully integrated into the curation workflow of a major biological database. We evaluate in particular (i) the usability of the system's interface, as perceived by users, and (ii) the correlation of the ranking of interactions, as provided by the text-mining system, with the choices of the curators.
View details for DOI 10.1093/database/bas021
View details for Web of Science ID 000304924100001
View details for PubMedID 22529178
Human leukocyte antigen B (HLA-B) is responsible for presenting peptides to immune cells and plays a critical role in normal immune recognition of pathogens. A variant allele, HLA-B*57:01, is associated with increased risk of a hypersensitivity reaction to the anti-HIV drug abacavir. In the absence of genetic prescreening, hypersensitivity affects ~6% of patients and can be life-threatening with repeated dosing. We provide recommendations (updated periodically at http://www.pharmgkb.org) for the use of abacavir based on HLA-B genotype.
View details for DOI 10.1038/clpt.2011.355
View details for Web of Science ID 000301891800026
View details for PubMedID 22378157
Personalized medicine is expected to benefit from combining genomic information with regular monitoring of physiological states by multiple high-throughput methods. Here, we present an integrative personal omics profile (iPOP), an analysis that combines genomic, transcriptomic, proteomic, metabolomic, and autoantibody profiles from a single individual over a 14 month period. Our iPOP analysis revealed various medical risks, including type 2 diabetes. It also uncovered extensive, dynamic changes in diverse molecular components and biological pathways across healthy and diseased conditions. Extremely high-coverage genomic and transcriptomic data, which provide the basis of our iPOP, revealed extensive heteroallelic changes during healthy and diseased states and an unexpected RNA editing mechanism. This study demonstrates that longitudinal iPOP can be used to interpret healthy and diseased states by connecting genomic information with additional dynamic omics activity.
View details for DOI 10.1016/j.cell.2012.02.009
View details for Web of Science ID 000301889500023
View details for PubMedID 22424236
The mission of the Pharmacogenomics Knowledge Base (PharmGKB; www.pharmgkb.org) is to collect, encode and disseminate knowledge about the impact of human genetic variations on drug responses. It is an important worldwide resource of clinical pharmacogenomic biomarkers available to all. The PharmGKB website has evolved to highlight our knowledge curation and aggregation over our previous emphasis on collecting primary data. This review summarizes the methods we use to drive this expanded scope of 'Knowledge Acquisition to Clinical Applications', the new features available on our website and our future goals.
View details for DOI 10.2217/BMM.11.94
View details for Web of Science ID 000298488200009
View details for PubMedID 22103613
Warfarin is a widely used anticoagulant with a narrow therapeutic index and large interpatient variability in the dose required to achieve target anticoagulation. Common genetic variants in the cytochrome P450-2C9 (CYP2C9) and vitamin K-epoxide reductase complex (VKORC1) enzymes, in addition to known nongenetic factors, account for ~50% of warfarin dose variability. The purpose of this article is to assist in the interpretation and use of CYP2C9 and VKORC1 genotype data for estimating therapeutic warfarin dose to achieve an INR of 2-3, should genotype results be available to the clinician. The Clinical Pharmacogenetics Implementation Consortium (CPIC) of the National Institutes of Health Pharmacogenomics Research Network develops peer-reviewed gene-drug guidelines that are published and updated periodically on http://www.pharmgkb.org based on new developments in the field.
View details for DOI 10.1038/clpt.2011.185
View details for Web of Science ID 000295119200035
View details for PubMedID 21900891
Whole-genome sequencing harbors unprecedented potential for characterization of individual and family genetic variation. Here, we develop a novel synthetic human reference sequence that is ethnically concordant and use it for the analysis of genomes from a nuclear family with history of familial thrombophilia. We demonstrate that the use of the major allele reference sequence results in improved genotype accuracy for disease-associated variant loci. We infer recombination sites to the lowest median resolution demonstrated to date (< 1,000 base pairs). We use family inheritance state analysis to control sequencing error and inform family-wide haplotype phasing, allowing quantification of genome-wide compound heterozygosity. We develop a sequence-based methodology for Human Leukocyte Antigen typing that contributes to disease risk prediction. Finally, we advance methods for analysis of disease and pharmacogenomic risk across the coding and non-coding genome that incorporate phased variant data. We show these methods are capable of identifying multigenic risk for inherited thrombophilia and informing the appropriate pharmacological therapy. These ethnicity-specific, family-based approaches to interpretation of genetic variation are emblematic of the next generation of genetic risk assessment using whole-genome sequencing.
View details for DOI 10.1371/journal.pgen.1002280
View details for Web of Science ID 000295419100031
View details for PubMedID 21935354
Thiopurine methyltransferase (TPMT) activity exhibits monogenic co-dominant inheritance, with ethnic differences in the frequency of occurrence of variant alleles. With conventional thiopurine doses, homozygous TPMT-deficient patients (~1 in 178 to 1 in 3,736 individuals with two nonfunctional TPMT alleles) experience severe myelosuppression, 30-60% of individuals who are heterozygotes (~3-14% of the population) show moderate toxicity, and homozygous wild-type individuals (~86-97% of the population) show lower active thioguanine nucleotides and less myelosuppression. We provide dosing recommendations (updates at http://www.pharmgkb.org) for azathioprine, mercaptopurine (MP), and thioguanine based on TPMT genotype.
View details for DOI 10.1038/clpt.2010.320
View details for Web of Science ID 000287439600018
View details for PubMedID 21270794
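The genotype frequencies quoted in the TPMT abstract above hang together arithmetically if one assumes Hardy-Weinberg equilibrium with a single pooled nonfunctional allele. That assumption is made here for illustration and is not stated in the abstract. Under it, the homozygous-deficient frequencies of 1 in 178 and 1 in 3,736 imply heterozygote frequencies of roughly 14% and 3%, and homozygous wild-type frequencies of roughly 86% and 97%. The function name below is hypothetical.

```python
from math import sqrt


def hardy_weinberg_from_homozygote_frequency(homozygous_variant_frequency):
    """Derive genotype frequencies from the homozygous-variant frequency q^2.

    Assumes Hardy-Weinberg equilibrium and a single pooled variant allele
    (a simplification made for this sketch).
    """
    q = sqrt(homozygous_variant_frequency)  # variant allele frequency
    p = 1.0 - q                             # functional allele frequency
    return {
        "homozygous_wild_type": p * p,
        "heterozygous": 2.0 * p * q,
        "homozygous_variant": q * q,
    }


# The two ends of the range quoted in the abstract: 1 in 178 and 1 in 3,736.
for one_in in (178, 3736):
    freqs = hardy_weinberg_from_homozygote_frequency(1.0 / one_in)
    print(
        f"1 in {one_in}: "
        f"heterozygous {freqs['heterozygous']:.1%}, "
        f"homozygous wild-type {freqs['homozygous_wild_type']:.1%}"
    )
```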
Collagen is a ubiquitous extracellular matrix protein. Its biological functions, including maintenance of the structural integrity of tissues, depend on its multiscale, hierarchical structure. Three elongated, twisted peptide chains of > 1000 amino acids each assemble into trimeric proteins characterized by the defining triple helical domain. The trimers associate into fibrils, which pack into fibers. We conducted a 10 ns molecular dynamics simulation of the full-length triple helical domain, which was made computationally feasible by segmenting the protein into overlapping fragments. The calculation included ~1.8 million atoms, including solvent, and took approximately 11 months using the CPUs of over a quarter of a million computers. Specialized analysis protocols and a relational database were developed to process the large amounts of data, which are publicly available. The simulated structures exhibit heterogeneity in the triple helical domain consistent with experimental results but at higher resolution. The structures serve as the foundation for studies of higher order forms of the protein and for modeling the effects of disease-associated mutations.
View details for PubMedID 21121047
Warfarin-dosing algorithms incorporating CYP2C9 and VKORC1 -1639G>A improve dose prediction compared with algorithms based solely on clinical and demographic factors. However, these algorithms better capture dose variability among whites than Asians or blacks. Herein, we evaluate whether other VKORC1 polymorphisms and haplotypes explain additional variation in warfarin dose beyond that explained by VKORC1 -1639G>A among Asians (n = 1103), blacks (n = 670), and whites (n = 3113). Participants were recruited from 11 countries as part of the International Warfarin Pharmacogenetics Consortium effort. Evaluation of the effects of individual VKORC1 single nucleotide polymorphisms (SNPs) and haplotypes on warfarin dose used both univariate and multivariable linear regression. VKORC1 -1639G>A and 1173C>T individually explained the greatest variance in dose in all 3 racial groups. Incorporation of additional VKORC1 SNPs or haplotypes did not further improve dose prediction. VKORC1 explained greater variability in dose among whites than blacks and Asians. Differences in the percentage of variance in dose explained by VKORC1 across race were largely accounted for by the frequency of the -1639A (or 1173T) allele. Thus, clinicians should recognize that, although the contribution of VKORC1 toward dose requirements at a population level is higher in whites than in nonwhites, genotype predicts similar dose requirements across racial groups.
View details for DOI 10.1182/blood-2009-12-255992
View details for Web of Science ID 000277335900027
View details for PubMedID 20203262
DNATwist is a Web-based learning tool (available at http://www.dnatwist.org) that explains pharmacogenomics concepts to middle- and high-school students. Its features include (i) a focus on drug responses of interest to teenagers (e.g., alcohol intolerance), (ii) reusable graphical interfaces that reduce extension costs, and (iii) explanations of molecular and cellular drug responses. In testing, students found the tool and topic understandable and engaging. The tool is being modified for use at the Tech Museum of Innovation in California.
View details for DOI 10.1038/clpt.2009.303
View details for Web of Science ID 000276506900009
View details for PubMedID 20305671
The NIH initiated the PharmGKB in April 2000. The primary mission was to create a repository of primary data, to develop tools to track associations between genes and drugs, and to catalog the location and frequency of genetic variations known to impact drug response. Over the past 10 years, new technologies have shifted research from candidate gene pharmacogenetics to phenotype-based pharmacogenomics with a consequent explosion of data. PharmGKB has refocused on curating knowledge rather than housing primary genotype and phenotype data, and now captures more complex relationships between genes, variants, drugs, diseases and pathways. Going forward, the challenges are to provide the tools and knowledge to plan and interpret genome-wide pharmacogenomics studies, predict gene-drug relationships based on shared mechanisms and support data-sharing consortia investigating clinical applications of pharmacogenomics.
View details for DOI 10.2217/PGS.10.15
View details for Web of Science ID 000276769300008
View details for PubMedID 20350130
The development of robust and clinically valuable pharmacogenomic tests has been anticipated to be one of the first tangible results of the Human Genome Project. Despite both obvious and unanticipated obstacles, a number of tests have now become available in various practice settings. Lessons can be learned from examination of these tests, the evidence that has catalyzed their use, their value to prescribers, and their merit as tools for personalizing therapeutics.
View details for DOI 10.1038/clpt.2009.39
View details for Web of Science ID 000267225200021
View details for PubMedID 19369936
Fibrillar collagens are ubiquitous proteins essential for the structural integrity of bones, skin, blood vessels, and other tissues. Mutations in collagen genes result in disorders including osteogenesis imperfecta, chondrodysplasias, and Ehlers-Danlos syndromes, but the molecular basis for the heterogeneity of clinical phenotypes is not well understood. A more complete understanding of the relationship between sequence and phenotype requires synthesis of multiple facets of collagen structure and function. To facilitate such an analysis, we developed COLdb, a freely available database integrating collagen biological and physicochemical properties with known variants. A Web-based, interactive, graphical user interface displays the data as annotations on the collagen protein sequences. Collagen gene-level data are provided as custom tracks for display in the UCSC genome browser. COLdb currently includes 35,582 data points spanning collagen types I, II, and III, and, importantly, users can add their own data to the display. The database is the first comprehensive integration of disparate functional information on the three major fibrillar collagens, and the first electronic collection of mutations in the COL2A1 gene.
View details for DOI 10.1002/humu.20978
View details for Web of Science ID 000267635100012
View details for PubMedID 19370761
Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week and of those who required 49 mg or more per week to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week or 49 mg or more per week for therapeutic anticoagulation.
View details for Web of Science ID 000263411300005
View details for PubMedID 19228618
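The evaluation metric named in the abstract above, the percentage of patients whose predicted dose falls within 20% of the actual stable therapeutic dose, can be sketched as follows. The function name and the sample doses are hypothetical; the sketch covers only the metric and not the dosing algorithms themselves.

```python
def fraction_within_tolerance(predicted_doses, actual_doses, tolerance=0.20):
    """Fraction of patients whose predicted dose is within `tolerance`
    (as a proportion of the actual dose) of the actual stable dose."""
    if len(predicted_doses) != len(actual_doses) or not actual_doses:
        raise ValueError("dose lists must be non-empty and of equal length")
    hits = sum(
        1
        for predicted, actual in zip(predicted_doses, actual_doses)
        if abs(predicted - actual) <= tolerance * actual
    )
    return hits / len(actual_doses)


# Made-up weekly doses in mg, for illustration only.
predicted = [30.0, 20.0, 50.0, 35.0]
actual = [35.0, 21.0, 70.0, 28.0]
print(f"{fraction_within_tolerance(predicted, actual):.0%} within 20% of actual dose")
```

Computing the same fraction for the clinical-only and the pharmacogenetic predictions on one validation cohort gives the kind of side-by-side comparison the abstract reports.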
Osteogenesis imperfecta (OI), also known as brittle bone disease, is a clinically and genetically heterogeneous disorder primarily characterized by susceptibility to fracture. Although OI generally results from mutations in the type I collagen genes, COL1A1 and COL1A2, the relationship between genotype and phenotype is not yet well understood. To provide additional data for genotype-phenotype analyses and to determine the proportion of mutations in the type I collagen genes among subjects with lethal forms of OI, we sequenced the coding and exon-flanking regions of COL1A1 and COL1A2 in a cohort of 63 subjects with OI type II, the perinatal lethal form of the disease. We identified 61 distinct heterozygous mutations in type I collagen, including five non-synonymous rare variants of unknown significance, of which 43 had not been seen previously. In addition, we found 60 SNPs in COL1A1, of which 17 were not reported previously, and 82 in COL1A2, of which 18 are novel. In three samples without collagen mutations, we found inactivating mutations in CRTAP and LEPRE1, suggesting a frequency of these recessive mutations of approximately 5% in OI type II. A computational model that predicts the outcome of substitutions for glycine within the triple helical domain of collagen alpha1(I) chains predicted lethality with approximately 90% accuracy. The results contribute to the understanding of the etiology of OI by providing data to evaluate and refine current models relating genotype to phenotype and by providing an unbiased indication of the relative frequency of mutations in OI-associated genes.
View details for DOI 10.1093/hmg/ddn374
View details for Web of Science ID 000262519300007
View details for PubMedID 18996919
View details for PubMedID 20161212
The PharmGKB is a publicly available online resource that aims to facilitate understanding of how genetic variation contributes to variation in drug response. It is not only a repository of pharmacogenomics primary data, but it also provides fully curated knowledge including drug pathways, annotated pharmacogene summaries, and relationships among genes, drugs, and diseases. This unit describes how to navigate the PharmGKB Web site to retrieve detailed information on genes and important variants, as well as their relationship to drugs and diseases. It also includes protocols on our drug-centered pathways, annotated pharmacogene summaries, and our Web services for downloading the underlying data. A workflow for using PharmGKB to facilitate the design of a pharmacogenomic study is also described in this unit.
View details for DOI 10.1002/0471250953.bi1407s23
View details for PubMedID 18819074
Osteogenesis imperfecta (OI), or brittle bone disease, often results from missense mutation of one of the conserved glycine residues present in the repeating Gly-X-Y sequence characterizing the triple-helical region of type I collagen. A composite model was developed for predicting the clinical lethality resulting from glycine mutations in the alpha1 chain of type I collagen. The lethality of mutations in which bulky amino acids are substituted for glycine is predicted by their position relative to the N-terminal end of the triple helix. The effect of a Gly --> Ser mutation is modeled by the relative thermostability of the Gly-X-Y triplet on the carboxy side of the triplet containing the substitution. This model also predicts the lethality of Gly --> Ser and Gly --> Cys mutations in the alpha2 chain of type I collagen. The model was validated with an independent test set of six novel Gly --> Ser mutations. The hypothesis derived from the model of an asymmetric interaction between a Gly --> Ser mutation and its neighboring residues was tested experimentally using collagen-like peptides. Consistent with the prediction, a significant decrease in stability, calorimetric enthalpy, and folding time was observed for a peptide with a low-stability triplet C-terminal to the mutation compared to a similar peptide with the low-stability triplet on the N-terminal side. The computational and experimental results together relate the position-specific effects of Gly --> Ser mutations to the local structural stability of collagen and lend insight into the etiology of OI.
View details for DOI 10.1021/bi800026k
View details for Web of Science ID 000255547600018
View details for PubMedID 18412368
Collagens are members of one of the most important families of structural proteins in higher organisms. There are 28 types of collagens encoded by 43 genes in humans that fall into several different functional protein classes. Mutations in the major fibrillar collagen genes lead to osteogenesis imperfecta (COL1A1 and COL1A2 encoding the chains of Type I collagen), chondrodysplasias (COL2A1 encoding the chains of Type II collagen), and vascular Ehlers-Danlos syndrome (COL3A1 encoding the chains of Type III collagen). Over the past 2 decades, mutations in these collagen genes have been catalogued, in hopes of understanding the molecular etiology of diseases caused by these mutations, characterizing the genotype-phenotype relationships, and developing robust models predicting the molecular and clinical outcomes. To achieve these goals better, it is necessary to understand the natural patterns of variation in collagen genes in human populations. We screened exons, flanking intronic regions, and conserved noncoding regions for variations in COL1A1, COL1A2, COL2A1, and COL3A1 in 48 individuals from each of four ethnically diverse populations. We identified 459 single-nucleotide polymorphisms (SNPs), more than half of which were novel and not found in public databases. Of the 52 SNPs found in coding regions, 15 caused amino acid substitutions while 37 did not. Although the four collagens have similar gene and protein structures, they have different molecular evolutionary characteristics. For example, COL1A1 appears to have been under substantially stronger negative selection than the rest. Phylogenetic analysis also suggests that the four genes have very different evolutionary histories among the different ethnic groups. Our observations suggest that the study of collagen mutations and their relationships with disease phenotypes should be performed in the context of the genetic background of the subjects.
View details for DOI 10.1016/j.ygeno.2007.12.008
View details for Web of Science ID 000255386300001
View details for PubMedID 18272325
PharmGKB, the pharmacogenetics and pharmacogenomics knowledge base (www.pharmgkb.org) is a publicly available online resource dedicated to the dissemination of how genetic variation leads to variation in drug responses. The goals of PharmGKB are to describe relationships between genes, drugs, and diseases, and to generate knowledge to catalyze pharmacogenetic and pharmacogenomic research. PharmGKB delivers knowledge in the form of curated literature annotations, drug pathway diagrams, and very important pharmacogene (VIP) summaries. Recently, PharmGKB has embraced a new role--broker of pharmacogenomic data for data sharing consortia. In particular, we have helped create the International Warfarin Pharmacogenetics Consortium (IWPC), which is devoted to pooling genotype and phenotype data relevant to the anticoagulant warfarin. PharmGKB has embraced the challenge of continuing to maintain its original mission while taking an active role in the formation of pharmacogenetic consortia.
View details for DOI 10.1002/humu.20731
View details for Web of Science ID 000254800400002
View details for PubMedID 18330919
Recent advances in high-throughput genotyping and phenotyping have accelerated the creation of pharmacogenomic data. Consequently, the community requires standard formats to exchange large amounts of diverse information. To facilitate the transfer of pharmacogenomics data between databases and analysis packages, we have created a standard XML (eXtensible Markup Language) schema that describes both genotype and phenotype data as well as associated metadata. The schema accommodates information regarding genes, drugs, diseases, experimental methods, genomic/RNA/protein sequences, subjects, subject groups, and literature. The Pharmacogenetics and Pharmacogenomics Knowledge Base (PharmGKB; www.pharmgkb.org) has used this XML schema for more than 5 years to accept and process submissions containing more than 1,814,139 SNPs on 20,797 subjects using 8,975 assays. Although developed in the context of pharmacogenomics, the schema is of general utility for exchange of genotype and phenotype data. We have written syntactic and semantic validators to check documents using this format. The schema and code for validation is available to the community at http://www.pharmgkb.org/schema/index.html (last accessed: 8 October 2007).
View details for DOI 10.1002/humu.20662
View details for Web of Science ID 000253033000002
View details for PubMedID 17994540
PharmGKB is a knowledge base that captures the relationships between drugs, diseases/phenotypes and genes involved in pharmacokinetics (PK) and pharmacodynamics (PD). This information includes literature annotations, primary data sets, PK and PD pathways, and expert-generated summaries of PK/PD relationships between drugs, diseases/phenotypes and genes. PharmGKB's website is designed to effectively disseminate knowledge to meet the needs of our users. PharmGKB currently has literature annotations documenting the relationship of over 500 drugs, 450 diseases and 600 variant genes. In order to meet the needs of whole genome studies, PharmGKB has added new functionalities, including browsing the variant display by chromosome and cytogenetic locations, allowing the user to view variants not located within a gene. We have developed new infrastructure for handling whole genome data, including increased methods for quality control and tools for comparison across other data sources, such as dbSNP, JSNP and HapMap data. PharmGKB has also added functionality to accept, store, display and query high throughput SNP array data. These changes allow us to capture more structured information on phenotypes for better cataloging and comparison of data. PharmGKB is available at www.pharmgkb.org.
View details for DOI 10.1093/nar/gkm1009
View details for Web of Science ID 000252545400160
View details for PubMedID 18032438
The Pharmacogenetics and Pharmacogenomics Knowledge Base (PharmGKB: http://www.pharmgkb.org) is devoted to disseminating primary data and knowledge in pharmacogenetics and pharmacogenomics. We are annotating the genes that are most important for drug response and present this information in the form of Very Important Pharmacogene (VIP) summaries, pathway diagrams, and curated literature. The PharmGKB currently contains information on over 500 drugs, 500 diseases, and 700 genes with genotyped variants. New features focus on capturing the phenotypic consequences of individual genetic variants. These features link variant genotypes to phenotypes, increase the breadth of pharmacogenomics literature curated, and visualize single-nucleotide polymorphisms on a gene's three-dimensional protein structure.
View details for DOI 10.1080/03602530802413338
View details for Web of Science ID 000260325500002
View details for PubMedID 18949600
Folding and misfolding of the collagen triple helix are studied through molecular dynamics simulations of two collagenlike peptides, [(POG)(10)](3) and [(POG)(4)POA(POG)(5)](3), which are models for wild-type and mutant collagen, respectively. To extract long time dynamics from short trajectories, we employ Markov state models. By analyzing thermodynamic and kinetic quantities calculated from the Markov state models, we examine folding mechanisms of the collagen triple helix and consequences of glycine mutations. We find that the C-to-N zipping of the collagen triple helix must be initiated by a nucleation event consisting of formation of three stable hydrogen bonds, and that zipping through a glycine mutation site requires a renucleation event which also consists of formation of three stable hydrogen bonds. Our results also suggest that slow kinetics, rather than free energy differences, is mainly responsible for the stability of the collagen triple helix.
View details for DOI 10.1529/biophysj.107.108100
View details for Web of Science ID 000251298100006
View details for PubMedID 17766343
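The idea of extracting long-time dynamics from short trajectories with a Markov state model, as in the collagen folding abstract above, can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' pipeline: trajectories are assumed to be already discretized into integer state labels, the transition matrix is estimated by simple counting at a lag of one step, and the three-state matrix (unfolded, nucleated, folded) is hypothetical.

```python
def estimate_transition_matrix(trajectories, n_states):
    """Row-normalized transition counts at a lag of one step."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        for src, dst in zip(traj, traj[1:]):
            counts[src][dst] += 1.0
    matrix = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0.0:
            # Assumption: a state never left in the data is treated as absorbing.
            matrix.append([1.0 if j == i else 0.0 for j in range(n_states)])
        else:
            matrix.append([c / total for c in row])
    return matrix


def propagate(dist, matrix):
    """Advance a state population vector by one lag time."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def stationary(matrix, tol=1e-12, max_iter=100000):
    """Long-time (equilibrium) populations via repeated propagation."""
    n = len(matrix)
    dist = [1.0 / n] * n
    for _ in range(max_iter):
        new = propagate(dist, matrix)
        if max(abs(a - b) for a, b in zip(new, dist)) < tol:
            return new
        dist = new
    return dist


# Many short trajectories are pooled to estimate one matrix.
short_trajs = [[0, 0, 1], [1, 1, 0]]
T_est = estimate_transition_matrix(short_trajs, n_states=2)

# Hypothetical 3-state model: 0 = unfolded, 1 = nucleated, 2 = folded.
T_model = [
    [0.90, 0.10, 0.00],
    [0.05, 0.85, 0.10],
    [0.00, 0.02, 0.98],
]
pi = stationary(T_model)
print([round(p, 4) for p in pi])
```

Once the matrix is estimated, propagating it many times gives populations at times far longer than any single trajectory, which is the property the abstract relies on.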
The pharmacogenetics and pharmacogenomics knowledge base (PharmGKB, http://www.pharmgkb.org) is a publicly available internet resource dedicated to the integration, annotation, and aggregation of pharmacogenomic knowledge. PharmGKB is a repository for pharmacogenetic and pharmacogenomic data, and curators provide integrated knowledge in terms of gene summaries, pathways, and annotated literature. Although PharmGKB is primarily directed toward catalyzing new research, it also has utility as a source of information for education about pharmacogenomics.
View details for DOI 10.1038/sj.clpt.6100332
View details for Web of Science ID 000249636500024
View details for PubMedID 17713470
The Stanford Biomedical Informatics training program began with a focus on clinical informatics, and has now evolved into a general program of biomedical informatics training, including clinical informatics, bioinformatics and imaging informatics. The program offers PhD, MS, distance MS, and certificate programs, and is now affiliated with an undergraduate major in biomedical computation. Current dynamics include (1) increased activity in informatics within other training programs in biology and the information sciences, (2) increased desire among informatics students to gain laboratory experience, (3) increased demand for computational collaboration among biomedical researchers, and (4) interaction with the newly formed Department of Bioengineering at Stanford University. The core focus on research training (the development and application of novel informatics methods for biomedical research) keeps the program centered in the midst of this period of growth and diversification.
View details for DOI 10.1016/j.jbi.2006.02.005
View details for Web of Science ID 000243216000007
View details for PubMedID 16564233
The Pharmacogenetics and Pharmacogenomics Knowledge Base, PharmGKB (http://www.pharmgkb.org), curates pharmacogenetic and pharmacogenomic information to generate knowledge concerning the relationships among genes, drugs, and diseases, and the effects of gene variation on these relationships. PharmGKB curators collect information on genotype-phenotype relationships both from the literature and from the deposition of primary research data into our database. Their goal is to catalyze pharmacogenetic and pharmacogenomic research.
View details for DOI 10.1038/sj.clpt.6100048
View details for Web of Science ID 000242874200010
View details for PubMedID 17185992
With the completion of the Human Genome Project, a new emphasis is focusing on the sequence variation and the resulting phenotype. The number of data available from genomic studies addressing this relationship is rapidly growing. In order to analyze these data as a whole, they need to be integrated, aggregated and annotated in a timely manner. The Pharmacogenetics and Pharmacogenomics Knowledge Base PharmGKB; (
View details for Web of Science ID 000243893500009
View details for PubMedID 17233563
View details for PubMedID 16610446
In this study, we examine the relationships between the structure and stability of five related collagen-like molecules that have hydroxyproline residues occupying positions not observed in vertebrate collagen. Two of the molecules contain valine or threonine and form stable triple helices in water. Three of the molecules contain allo-threonine (an enantiomer of threonine), serine, or alanine, and are not stable. Using molecular dynamics simulation methods, we examine possible explanations for the stability difference, including considering the possibility that differences in solvent shielding of the essential interchain hydrogen bonds may result in differences in stability. By comparing the structures of threonine- and allo-threonine-containing molecules in six polar and nonpolar solvation conditions, we find that solvent shielding is not an adequate explanation for the stability difference. A closer examination of the peptides shows that the structures of the unstable molecules are looser, having weaker intermolecular hydrogen bonds. The weakened hydrogen bonds result from extended Yaa residue Psi-angles that prevent optimal geometry. The Phi-Psi-maps of the relevant residues suggest that each residue's most favorable Psi-angle determines the corresponding collagen-like molecule's stability. Additionally, we propose that these molecules illustrate a more general feature of triple-helical structures: interchain hydrogen bonds are always longer and weaker than ideal, so they are sensitive to relatively small changes in molecular structure. This sensitivity to small changes may explain why large stability differences often result from seemingly small changes in residue sequence.
View details for DOI 10.1529/biophysj.105.065276
View details for Web of Science ID 000234252100018
View details for PubMedID 16258051
Recently, the importance of proline ring pucker conformations in collagen has been suggested in the context of hydroxylation of prolines. The previous molecular mechanics parameters for hydroxyproline, however, do not reproduce the correct pucker preference. We have developed a new set of parameters that reproduces the correct pucker preference. Our molecular dynamics simulations of proline and hydroxyproline monomers as well as collagen-like peptides, using the new parameters, support the theory that the role of hydroxylation in collagen is to stabilize the triple helix by adjusting to the right pucker conformation (and thus the right phi angle) in the Y position.
View details for DOI 10.1002/jcc.20301
View details for Web of Science ID 000232570300006
View details for PubMedID 16170799
Biomedical databases summarize current scientific knowledge, but they generally require years of laborious curation effort to build, focusing on identifying pertinent literature and data in the voluminous biomedical literature. It is difficult to manually extract useful information embedded in the large volumes of literature, and automated intelligent text analysis tools are becoming increasingly essential to assist in these curation activities. The goal of the authors was to develop an automated method to identify articles in Medline citations that contain pharmacogenetics data pertaining to gene-drug relationships. The authors built and evaluated several candidate statistical models that characterize pharmacogenetics articles in terms of word usage and the profile of Medical Subject Headings (MeSH) used in those articles. The best-performing model was used to scan the entire Medline article database (11 million articles) to identify candidate pharmacogenetics articles. A sampling of the articles identified from scanning Medline was reviewed by a pharmacologist to assess the precision of the method. The authors' approach identified 4,892 pharmacogenetics articles in the literature with 92% precision. Their automated method took a fraction of the time to acquire these articles compared with the time expected to be taken to accumulate them manually. The authors have built a Web resource (http://pharmdemo.stanford.edu/pharmdb/main.spy) to provide access to their results. A statistical classification approach can screen the primary literature for pharmacogenetics articles with high precision. Such methods may assist curators in acquiring pertinent literature in building biomedical databases.
View details for DOI 10.1197/jamia.M1640
View details for Web of Science ID 000227842000003
View details for PubMedID 15561790
The Pharmacogenetics and Pharmacogenomics Knowledge Base (PharmGKB) is an interactive tool for researchers investigating how genetic variation affects drug response. The PharmGKB web site, www.pharmgkb.org, displays genotype, molecular, and clinical primary data integrated with literature, pathway representations, protocol information, and links to additional external resources. Users can search and browse the knowledge base by genes, drugs, diseases, and pathways. Registration is free to the entire research community but subject to an agreement to respect the rights and privacy of the individuals whose information is contained within the database. Registered users can access and download primary data to aid in the design of future pharmacogenetics and pharmacogenomics studies.
View details for PubMedID 16100408
We show that the severity of osteogenesis imperfecta (OI), an inherited disease, correlates with changes in the residues near the site of OI-causing mutations. Among our many observed correlations are particularly striking ones between the presence of nearby proline residues and lethal mutations, and the presence of nearby alanine residues and nonlethal mutations. We investigated the possibility that these correlations have a structural basis using molecular dynamics simulations of collagen-like molecules designed to mimic the site of a lethal OI mutation in collagen type I. Our significant finding is that interchain hydrogen bonding is greatly affected by variations in residue type. We found that the strength of hydrogen bond networks between backbone atoms on different chains depends on the local residue sequence and is weaker in proline-rich regions of the molecule. We also found that an alanine at a site near an OI mutation causes less structural disruption than a proline, and that residue side chains also form interchain hydrogen bonds with frequencies that are dependent on residue type. For example, arginine side chains form strong hydrogen bonds with the backbone of the subsequent peptide chain, while lysine and glutamine less frequently form similar hydrogen bonds. This decrease in the observed hydrogen bond frequency correlates with a decrease in the experimentally determined thermal stability. We contrasted general structural properties of model collagen peptides with and without the mutation to examine the effect of the single-point mutation on the surrounding residues.
View details for DOI 10.1021/bi035676w
View details for Web of Science ID 000221343200021
View details for PubMedID 15122897
To determine how genetic variations contribute to variations in drug response, we need to know the genes that are related to drugs of interest. But there are no publicly available databases of known gene-drug relationships, and it is time-consuming to search the literature for this information. We have developed a resource to support the storage, summarization, and dissemination of key gene-drug interactions of relevance to pharmacogenetics. Extracting all gene-drug relationships from the literature is a daunting task, so we distributed a tool to acquire this knowledge from the scientific community. We also developed a categorization scheme to classify gene-drug relationships according to the type of pharmacogenetic evidence that supports them. Our resource (http://www.pharmgkb.org/home/project-community.jsp) can be queried by gene or drug, and it summarizes gene-drug relationships, categories of evidence, and supporting literature. This resource is growing, containing entries for 138 genes and 215 drugs of pharmacogenetics significance, and is a core component of PharmGKB, a pharmacogenetics knowledge base (http://www.pharmgkb.org).
View details for Web of Science ID 000226723300159
View details for PubMedID 15360921
Interactions with magnesium (Mg2+) ions are essential for RNA folding and function. The locations and function of bound Mg2+ ions are difficult to characterize both experimentally and computationally. In particular, the P456 domain of the Tetrahymena thermophila group I intron, and a 58 nt 23S rRNA from Escherichia coli have been important systems for studying the role of Mg2+ binding in RNA, but characteristics of all the binding sites remain unclear. We therefore investigated the Mg2+ binding capabilities of these RNA systems using a computational approach to identify and further characterize their Mg2+ binding sites. The approach is based on the FEATURE algorithm, reported previously for microenvironment analysis of protein functional sites. We have determined novel physicochemical descriptions of site-bound and diffusely bound Mg2+ ions in RNA that are useful for prediction. Electrostatic calculations using the Non-Linear Poisson Boltzmann (NLPB) equation provided further evidence for the locations of site-bound ions. We confirmed the locations of experimentally determined sites and further differentiated between classes of ion binding. We also identified potentially important, high scoring sites in the group I intron that are not currently annotated as Mg2+ binding sites. We note their potential function and believe they deserve experimental follow-up.
View details for DOI 10.1093/nar/gkg471
View details for Web of Science ID 000184532900029
View details for PubMedID 12888505
WebFEATURE (http://feature.stanford.edu/webfeature/) is a web-accessible structural analysis tool that allows users to scan query structures for functional sites in both proteins and nucleic acids. WebFEATURE is the public interface to the scanning algorithm of the FEATURE package, a supervised learning algorithm for creating and identifying 3D, physicochemical motifs in molecular structures. Given an input structure or Protein Data Bank identifier (PDB ID), and a statistical model of a functional site, WebFEATURE will return rank-scored 'hits' in 3D space that identify regions in the structure where similar distributions of physicochemical properties occur relative to the site model. Users can visualize and interactively manipulate scored hits and the query structure in web browsers that support the Chime plug-in. Alternatively, results can be downloaded and visualized through other freely available molecular modeling tools, like RasMol, PyMOL and Chimera. A major application of WebFEATURE is in rapid annotation of function to structures in the context of structural genomics.
View details for DOI 10.1093/nar/gkg553
View details for Web of Science ID 000183832900010
View details for PubMedID 12824318
Mutations in the androgen receptor (AR) are associated with a variety of diseases including androgen insensitivity syndrome and prostate cancer, but the way in which these mutations cause disease is poorly understood. We present a method for distinguishing likely disease-causing mutations from mutations that are merely associated with disease but have no causal role. Our method uses a measure of nucleotide conservation, and we find that conservation often correlates with severity of the clinical phenotype. Further, by only including mutations whose pathogenicity has been proven experimentally, this correlation is enhanced in the case of prostate cancer-associated mutations. Our method provides a means for assessing the significance of single nucleotide polymorphisms (SNPs) and cancer-associated mutations.
View details for DOI 10.1093/nar/gng042
View details for Web of Science ID 000182161400002
View details for PubMedID 12682377
The development of high throughput techniques and large-scale studies in the biological sciences has given rise to an explosive growth in both the volume and types of data available to researchers. A surveillance system that monitors data repositories and reports changes helps manage the data overload. We developed a dbSNP surveillance system (URL: http://www.pharmgkb.org/do/serve?id=tools.surveillance.dbsnp) that performs surveillance on the dbSNP database and alerts users to new information. The system is notable because it is personalized and fully automated. Each registered user has a list of genes to follow and receives notification of new entries concerning these genes. The system integrates data from dbSNP, LocusLink, PharmGKB, and Genbank to position SNPs on reference sequences and classify SNPs into categories such as synonymous and non-synonymous SNPs. The system uses data warehousing, object model-based data integration, object-oriented programming, and a platform-neutral data access mechanism.
View details for Web of Science ID 000188997700026
View details for PubMedID 16452787
Osteogenesis imperfecta (OI) is a genetic disease in which the most common mutations result in substitutions for glycine residues in the triple helical domain of the chains of type I collagen. Currently there is no way to use sequence information to predict the clinical OI phenotype. However, structural models coupled with biophysical and machine learning methods may be able to predict sequences that, when mutated, would be associated with more severe forms of OI. To build appropriate structural models, we have applied a high throughput molecular dynamic approach. Homotrimeric peptides covering 57 positions in which mutations are associated with OI were simulated both with and without mutations. Our models revealed structural differences that occur with different substituting amino acids. When mutations were introduced, we observed a decrease in helix stability, as caused by fewer main chain backbone hydrogen bonds, and an increase in main chain root mean square deviation and specifically bound water molecules.
View details for DOI 10.1074/mcp.M200064-MCP200
View details for Web of Science ID 000181516000002
View details for PubMedID 12488462
Researchers have recently questioned the role hydroxylated prolines play in stabilizing the collagen triple helix. To address these issues, we have developed new molecular mechanics parameters for the simulation of peptides containing 4(R)-fluoroproline (Flp), 4(R)-hydroxyproline (Hyp), and 4(R)-aminoproline (Amp). Simulations of peptides based on these parameters can be used to determine the components that stabilize hydroxyproline over proline in the triple helix. The dihedrals F-C-C-N, O-C-C-N, and N-C-C-N were built using a N-beta-ethyl amide model. One nanosecond simulations were performed on the trimers [(Pro-Pro-Gly)(10)](3), [(Pro-Hyp-Gly)(10)](3), [(Pro-Amp-Gly)(10)](3), [(Pro-Amp(1+)-Gly)(10)](3), and [(Pro-Flp-Gly)(10)](3) in explicit solvent. The results of our simulations suggest that pyrrolidine ring conformation is mediated by the strength of the gauche effect and classical electrostatic interactions.
View details for DOI 10.1002/bip.10123
View details for Web of Science ID 000175596100002
View details for PubMedID 11979516
Based on the similarity between the TIGR (trabecular-meshwork inducible glucocorticoid response) (also known as myocilin) and olfactomedin protein families identified throughout the length of the TIGR protein, we have identified more distantly related proteins to determine the elements essential to the function/structure of the TIGR and olfactomedin proteins. Using a sequence walk method and the Shotgun program, we have identified a family including 31 olfactomedin domain-containing sequences. Multiple sequence alignments and secondary structure analyses were used to identify conserved sequence elements. Pairwise identity in the olfactomedin domain ranges from 8 to 64%, with an average pairwise identity of 24%. The N-terminal regions of the proteins fall into two subgroups, one including the TIGR and olfactomedin families and another group of apparently unrelated domains. The TIGR and olfactomedin sequences display conserved motifs including a residual leucine zipper region and maintain a similar secondary structure throughout the N-terminal region. The correlation between conserved elements and disease-associated mutations and apparent polymorphisms in human TIGR was also examined to evaluate the apparent importance of conserved residues to the function/structure of TIGR. Several residues have been identified as essential to the function and/or structure of the human TIGR protein based on their degree of conservation across the family and their implication in the pathogenesis of primary open-angle glaucoma. Additionally, we have identified a group of chitinase sequences containing several of the highly conserved motifs present in the C-terminal region of the olfactomedin domain-containing sequences.
View details for DOI 10.1074/mcp.200023-MCP200
View details for Web of Science ID 000181515400006
View details for PubMedID 12118081
The Pharmacogenetics Knowledge Base (PharmGKB; http://www.pharmgkb.org/) contains genomic, phenotype and clinical information collected from ongoing pharmacogenetic studies. Tools to browse, query, download, submit, edit and process the information are available to registered research network members. A subset of the tools is publicly available. PharmGKB currently contains over 150 genes under study, 14 Coriell populations and a large ontology of pharmacogenetics concepts. The pharmacogenetic concepts and the experimental data are interconnected by a set of relations to form a knowledge base of information for pharmacogenetic researchers. The information in PharmGKB, and its associated tools for processing that information, are tailored for leading-edge pharmacogenetics research. The PharmGKB project was initiated in April 2000 and the first version of the knowledge base went online in February 2001.
View details for Web of Science ID 000173077100041
View details for PubMedID 11752281
For many years, scientists believed that point mutations in genes are the genetic switches for somatic and inherited diseases such as cystic fibrosis, phenylketonuria and cancer. Some of these mutations likely alter a protein's function in a manner that is deleterious, and they should occur in functionally important regions of the protein products of genes. Here we show that disease-associated mutations occur in regions of genes that are conserved, and can identify likely disease-causing mutations. To show this, we have determined conservation patterns for 6185 non-synonymous and heritable disease-associated mutations in 231 genes. We define a parameter, the conservation ratio, as the ratio of average negative entropy of analyzable positions with reported mutations to that of every analyzable position in the gene sequence. We found that 84.0% of the 231 genes have conservation ratios less than one. 139 genes had eleven or more analyzable mutations and 88.0% of those had conservation ratios less than one. These results indicate that phylogenetic information is a powerful tool for the study of disease-associated mutations. Our alignments and analysis have been made available as part of the database at http://cancer.stanford.edu/mut-paper/. Within this dataset, each position is annotated with the analysis, so the most likely disease-causing mutations can be identified.
View details for Web of Science ID 000181476800024
View details for PubMedID 12220483
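The conservation ratio defined in the abstract above can be illustrated with a minimal Python sketch. It assumes that "negative entropy" means the Shannon entropy, -sum(p ln p), of each alignment column, so that lower values indicate stronger conservation. It also assumes that every column is analyzable and contains no gaps. The function names and the toy alignment are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (natural log) of the residues in one alignment column."""
    counts = Counter(column)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def conservation_ratio(alignment, mutated_positions):
    """Mean entropy at mutated columns divided by mean entropy over all columns.

    Assumes at least one column varies, so the denominator is non-zero.
    """
    length = len(alignment[0])
    entropies = [column_entropy([seq[i] for seq in alignment]) for i in range(length)]
    mutated_mean = sum(entropies[i] for i in mutated_positions) / len(mutated_positions)
    overall_mean = sum(entropies) / length
    return mutated_mean / overall_mean


# Toy alignment (hypothetical data): columns 0 and 1 are fully conserved.
toy_alignment = ["ACDE", "ACDF", "ACEG", "ACFH"]
ratio = conservation_ratio(toy_alignment, mutated_positions=[0, 1])
print(f"conservation ratio: {ratio:.2f}")
```

Under these assumptions, a ratio below one means the mutated positions are, on average, more conserved than the gene as a whole, which is the pattern the abstract reports for most genes.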
Research directed toward discovering how genetic factors influence a patient's response to drugs requires coordination of data produced from laboratory experiments, computational methods, and clinical studies. A public repository of pharmacogenetic data should accelerate progress in the field of pharmacogenetics by organizing and disseminating public datasets. We are developing a pharmacogenetics knowledge base (PharmGKB) to support the storage and retrieval of both experimental data and conceptual knowledge. PharmGKB is an Internet-based resource that integrates complex biological, pharmacological, and clinical data in such a way that researchers can submit their data and users can retrieve information to investigate genotype-phenotype correlations. Successful management of the names, meaning, and organization of concepts used within the system is crucial. We have selected a frame-based knowledge-representation system for development of an ontology of concepts and relationships that represent the domain and that permit storage of experimental data. Preliminary experience shows that the ontology we have developed for gene-sequence data allows us to accept, store, and query data submissions.
View details for PubMedID 11928517
Ontologies are useful for organizing large numbers of concepts having complex relationships, such as the breadth of genetic and clinical knowledge in pharmacogenomics. But because ontologies change and knowledge evolves, it is time consuming to maintain stable mappings to external data sources that are in relational format. We propose a method for interfacing ontology models with data acquisition from external relational data sources. This method uses a declarative interface between the ontology and the data source, and this interface is modeled in the ontology and implemented using XML schema. Data is imported from the relational source into the ontology using XML, and data integrity is checked by validating the XML submission with an XML schema. We have implemented this approach in PharmGKB (http://www.pharmgkb.org/), a pharmacogenetics knowledge base. Our goals were to (1) import genetic sequence data, collected in relational format, into the pharmacogenetics ontology, and (2) automate the process of updating the links between the ontology and data acquisition when the ontology changes. We tested our approach by linking PharmGKB with data acquisition from a relational model of genetic sequence information. The ontology subsequently evolved, and we were able to rapidly update our interface with the external data and continue acquiring the data. Similar approaches may be helpful for integrating other heterogeneous information sources in order to make the diversity of pharmacogenetics data amenable to computational analysis.
View details for PubMedID 11928521
Pharmacogenomics requires the integration and analysis of genomic, molecular, cellular, and clinical data, and it thus offers a remarkable set of challenges to biomedical informatics. These include infrastructural challenges such as the creation of data models and databases for storing these data, the integration of these data with external databases, the extraction of information from natural language text, and the protection of databases with sensitive information. There are also scientific challenges in creating tools to support gene expression analysis, three-dimensional structural analysis, and comparative genomic analysis. In this review, we summarize the current uses of informatics within pharmacogenomics and show how the technical challenges that remain for biomedical informatics are typical of those that will be confronted in the postgenomic era.
View details for Web of Science ID 000174038800007
View details for PubMedID 11807167
We studied the results of mutating alanine --> glycine at three positions of a collagen-like peptide in an effort to develop a computational method for predicting the energetic and structural effects of a single point genetic mutation in collagen, which is associated with the clinical diagnosis of Osteogenesis Imperfecta (OI). The differences in free energy of denaturation were calculated between the collagen-like peptides [(POG)4(POA)(POG)4]3 and [(POG)10]3 (POG: proline-hydroxyproline-glycine). Our computational results, which suggest significant destabilization of the collagen-like triple-helix upon the glycine --> alanine mutations, correlate very well with the experimental free energies of denaturation. The robustness of our collagen-like peptide model is shown by its reproduction of experimental results with both different simulation paths and different lengths of the model peptide. The individual free energy for each alanine --> glycine mutation (and the reverse free energy, glycine --> alanine mutation) in the collagen-like peptide has been calculated. We find that the first alanine introduced into the triple helix causes a very large destabilization of the helix, but the last alanine introduced into the same position of an adjacent chain causes a very small change in the peptide stability. Thus, our results demonstrate that each mutation does not contribute equally to the free energy. We find that the sum of the calculated individual residues' free energy can accurately model the experimental free energy for the whole peptide.
View details for Web of Science ID 000166698400012
View details for PubMedID 11169394
Visualization interfaces for high performance computing systems pose special problems due to the complexity and volume of data these systems manipulate. In the post-genomic era, scientists must be able to quickly gain insight into structure-function problems, and require flexible computing environments to quickly create interfaces that link the relevant tools. Feature, a program for analyzing protein sites, takes a set of 3-dimensional structures and creates statistical models of sites of structural or functional significance. Until now, Feature has provided no support for visualization, which can make understanding its results difficult. We have developed an extension to the molecular visualization program Chimera that integrates Feature's statistical models and site predictions with 3-dimensional structures viewed in Chimera. We call this extension ViewFeature, and it is designed to help users understand the structural features that define a site of interest. We applied ViewFeature in an analysis of the enolase superfamily, a functionally distinct class of proteins that share a common fold, the alpha/beta barrel, in order to gain a more complete understanding of the conserved physical properties of this superfamily. In particular, we wanted to define the structural determinants that distinguish the enolase superfamily active site scaffold from other alpha/beta barrel superfamilies and particularly from other metal-binding alpha/beta barrel proteins. Through the use of ViewFeature, we have found that the C-terminal domain of the enolase superfamily does not differ at the scaffold level from metal-binding alpha/beta barrels. We are, however, able to differentiate between the metal-binding sites of alpha/beta barrels and those of other metal-binding proteins. We describe the overall architectural features of enolases in a radius of 10 Angstroms around the active site.
View details for PubMedID 11262944
View details for PubMedID 11908751
We have developed new computational methods for displaying and analyzing members of protein superfamilies. These methods (MinRMS, AlignPlot and MSFviewer) integrate sequence and structural information and are implemented as separate but cooperating programs to our Chimera molecular modeling system. Integration of multiple sequence alignment information and three-dimensional structural representations enables researchers to generate hypotheses about the sequence-structure relationship. Structural superpositions can be generated and easily tuned to identify similarities around important characteristics such as active sites or ligand binding sites. Information related to the release of Chimera, MinRMS, AlignPlot and MSFviewer can be obtained at http://www.cgl.ucsf.edu/chimera.
View details for PubMedID 10902172
Human growth hormone (hGH) binds to its receptor (hGHr) in a three-body interaction: one molecule of the hormone and two identical monomers of the receptor form a trimer. Curiously, the hormone-receptor interactions in the trimer are not equivalent and the formation of the complex occurs in a specific kinetic order (Cunningham BC, Ultsch M, De Vos AM, Mulkerrin MG, Clauser KR, Wells JA, 1991, Science 254:821-825). In this paper, we model the recognition of hGH to the hGHr using shape complementarity of the three-dimensional structures and macromolecular docking to explore possible binding modes between the receptor and hormone. The method, reported previously (Hendrix DK, Kuntz ID, 1998, Pacific symposium on biocomputing 1998, pp 1234-1244), is based upon matching complementary-shaped strategic sites on the molecular surface. We modify the procedure to examine three-body systems. We find that the order of binding seen experimentally is also essential to our model. We explore the use of mutational data available for hGH to guide our model. In addition to docking hGH to the hGHr, we further test our methodology by successfully reproducing 16 macromolecular complexes from X-ray crystal structures, including enzyme-inhibitor, antibody-antigen, protein dimer, and protein-DNA complexes.
View details for Web of Science ID 000080109000008
View details for PubMedID 10338012
The results of 0.5-1.0 ns molecular dynamics simulations of the collagen-like peptides [(POG)4(POA)(POG)4]3 and [(POG)9]3 (POG: proline-hydroxyproline-glycine) are presented. All simulations were performed using the AMBER-94 molecular mechanical force field with a shell of TIP3P waters surrounding the peptides. The initial geometries for the collagen-like peptides included an x-ray crystallographic structure, a computer-generated structure, a [(POG)9]3 structure modeled from the x-ray structure, and the x-ray structure with crystallographic waters replaced with a shell of modeled TIP3P waters. We examined the molecular dynamics peptide residue rms deviation fluctuations, dihedral angles, molecular and chain end-to-end distances, helical parameters, and peptide-peptide and peptide-solvent hydrogen-bonding patterns. Our molecular dynamics simulations of [(POG)4(POA)(POG)4]3 show average structures and internal coordinates similar to the x-ray crystallographic structure. Our results demonstrate that molecular dynamics can be used to reproduce the experimental structures of collagen-like peptides. We have demonstrated the feasibility of using the AMBER-94 molecular mechanical force field, which was parameterized to model nucleic acids and globular proteins, for fibril proteins. We provide a new interpretation of peptide-solvent hydrogen bonding and a peptide-peptide hydrogen bonding pattern not previously reported in x-ray studies. Last, we report on the differences, in particular with respect to main-chain dihedral angles and hydrogen bonding, between the native and mutant collagen-like peptides.
View details for Web of Science ID 000078270000005
View details for PubMedID 10070265
Quantitative structure-activity relationships (QSAR) have been formulated for a set of 15 2,4-diamino-5-(2-X-benzyl)pyrimidines versus dihydrofolate reductase from Lactobacillus casei and chicken liver. QSARs were also developed for comprehensive data sets containing mono-, di-, and trisubstituted benzyl derivatives. Particular emphasis was placed on the role played by ortho substituents in the overall binding process and subsequent inhibition of the catalytic process in both the prokaryotic and eucaryotic DHFRs. Comparisons between the two QSARs reveal subtle differences at specific positions which can be optimized to design more selective antibacterial agents.
View details for Web of Science ID 000076676100012
View details for PubMedID 9784101
We describe the Object Technology Framework (OTF) software system developed at the University of California, San Francisco Computer Graphics Laboratory for creating C++ classes that facilitate rapid biomolecular application development, and the application of the OTF to collagen modeling. C++ class libraries for accessing and manipulating data from standard scientific data sources can be generated using the program genlib and its class library toolkit Molecule, thereby facilitating development of new applications. Use of the OTF for generating ideal collagen structural models (gencollagen) is described. The source code for the OTF is freely available at http://www.cgl.ucsf.edu/off/ to interested application developers.
View details for PubMedID 9697195
This article presents the experience of using phenomenology as a methodological reference, together with Martin Heidegger's philosophical thinking as expressed in the book "Being and Time", in nursing, in order to understand women who seek prevention of cervical cancer and to analyze the programs offered to them.
View details for PubMedID 9370761
We present an algorithm for generating images of molecules represented as a set of intersecting opaque spheres. Both perspective and shadows are computed to provide realistic visual cues. Compared to existing programs for generating similar images, our algorithm is both more accurate and several times faster. We present in detail the mathematics used in picture generation, along with examples of the computed images.
View details for Web of Science ID A1991GW91400004
View details for PubMedID 1772848
The role of hydrophobic and electronic effects on the kinetic constants kcat and Km for the papain hydrolysis of a series of 22 substituted N-benzoylglycine pyridyl esters was investigated. The series studied comprises a wide variety of substituents on the N-benzoyl ring, with about a 300,000-fold range in their hydrophobicities, and 2.1-fold range in their electronic Hammett constants (sigma). It was found that the variation in the log kcat and log 1/Km constants could be explained by the following quantitative structure-activity relationships (QSAR): log 1/Km = 0.40 pi 4 + 4.40 and log 1/kcat = 0.45 sigma + 0.18. The substituent constant, pi 4, is the hydrophobic parameter for the 4-N-benzoyl substituents. QSAR analysis of two smaller sets of glycine phenyl and methyl esters produced similar results. A clear separation of the substituent effects indicates that in the case of these particular esters, acylation appears to be the rate limiting catalytic step.
View details for Web of Science ID A1991GD60600007
View details for PubMedID 1888764
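The two QSAR equations in the papain abstract above are simple linear models, and a minimal Python sketch shows how they would be evaluated. The coefficients are copied as printed in the abstract, including the "log 1/kcat" form. The function names and the substituent constants in the example are hypothetical, and logarithms are assumed to be base 10, the usual QSAR convention.

```python
def log_inv_km(pi_4):
    """log 1/Km from the hydrophobic constant pi_4 (first equation in the abstract)."""
    return 0.40 * pi_4 + 4.40


def log_inv_kcat(sigma):
    """log 1/kcat from the Hammett constant sigma, as printed in the abstract."""
    return 0.45 * sigma + 0.18


def km_fold_decrease(delta_pi_4):
    """Factor by which predicted Km falls when pi_4 rises by delta_pi_4 (base-10 logs assumed)."""
    return 10 ** (0.40 * delta_pi_4)


# Hypothetical substituent constants, chosen only for illustration.
examples = [("reference", 0.0, 0.0), ("illustrative", 1.0, 0.5)]
for name, pi_4, sigma in examples:
    print(f"{name}: log 1/Km = {log_inv_km(pi_4):.2f}, "
          f"log 1/kcat = {log_inv_kcat(sigma):.3f}")
```

Because the first equation depends only on pi 4 and the second only on sigma, hydrophobic and electronic effects can be read off separately, which is the separation of substituent effects the abstract describes.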
The relationship between structure and the Michaelis-Menten constants (Km) for the papain hydrolysis of a series of 37 N-benzoylglycine esters was investigated. The series studied comprises a wide range of aromatic and aliphatic esters with a 5000-fold variation in their Km constants and essentially constant kcat values. It was found that the variation in the Km constants could be rationalized by the following quantitative structure-activity relationship (QSAR): log 1/Km = 8.13 F + 0.33 Z + 1.27 pi 3' + 1.95. In this equation F is the field inductive parameter, pi 3' is the hydrophobic constant for the more lipophilic of the two possible meta substituents and Z is the van der Waals distance from oxygen through the end of the molecule, in the direction of the 4 position of the aromatic ester moiety.
View details for Web of Science ID A1990DB89000004
View details for PubMedID 2331480
We describe a method for generating a molecular surface using a parametric patch representation. Unlike previous methods, this algorithm generates a parametric patch surface which is smooth and G continuous and manipulable in real-time. Crucial to our approach is the creation of a net of approximately equilateral triangles from which we generate the control points used as the basis for describing the surface. We present in detail the method used for generating the triangular net and accompanying control points, along with examples of the resulting surfaces.
View details for Web of Science ID A1990CW84400003
View details for PubMedID 2268622
Quantitative structure-activity relationships (QSAR) have been derived for the action of 68 5-(substituted benzyl)-2,4-diaminopyrimidines on dihydrofolate reductase (DHFR) from Lactobacillus casei and chicken liver. The QSAR are analyzed with respect to the stereographics models of the active sites of the enzymes and found to be in good agreement. Using these QSAR equations, we have attempted to design new trimethoprim-type antifolates having higher selectivity for the bacterial enzyme. The general problem of developing selective inhibitors is discussed.
View details for Web of Science ID A1989AH18700035
View details for PubMedID 2502631
View details for Web of Science ID A1989U961300001
The hydrolysis of 30 substituted phenyl hippurates (X-C6H4OCOCH2NHCOC6H5) by subtilisin BPN' was studied and from the results the following quantitative structure-activity relationship was derived: log 1/Km = 0.39 sigma + 0.16 B5.4 + 0.29 pi'3 + 3.58. In this expression Km is the Michaelis constant, sigma is the Hammett constant, B5.4 is the sterimol steric parameter of X in the 4-position and pi'3 is the hydrophobic parameter for the more hydrophobic of the two possible meta substituents. The other meta substituent is assigned a pi value of 0. This mathematical model is qualitatively compared with a molecular graphics model constructed from the X-ray crystallographic coordinates of subtilisin BPN'. The results with subtilisin BPN' are compared with our earlier study of similar substrates with Carlsberg subtilisin.
View details for Web of Science ID A1988R186900001
View details for PubMedID 3056624
Poretz and Goldstein showed that X-phenyl beta-D-glucopyranosides prevent the agglutination of concanavalin A with polysaccharides and derived inhibition constants for the process. Using their data the binding of 25 glucosides to concanavalin is now shown to be correlated with the molar refractivity of the substituents on the phenyl ring. This is interpreted to mean that it is the bulk of the substituents and not their hydrophobicity which prevents the union of concanavalin and the polysaccharide. These results are similar to those found for other haptens preventing antibody-antigen interaction.
View details for Web of Science ID A1987M184000002
View details for PubMedID 3449391
The hydrolysis of a set of 28 X-phenyl hippurates by chymotrypsin was investigated. From the derived Km and kcat values a quantitative structure-activity relationship was developed. This equation shows that para substituents correlated by sigma- display only an electronic effect on the formation of the ES complex whereas meta hydrophobic substituents show a hydrophobic interaction correlated by pi in addition to their electronic effect. Meta polar substituents avoid contact with the enzyme and show only electronic effects on Km. Using the x-ray crystallographic coordinates for chymotrypsin and computer graphics, a model was constructed which is used to interpret the quantitative structure-activity relationship. As with a number of previously reported examples, we have found that when polar substituents have the option of binding to hydrophobic space or remaining in the aqueous phase they follow the latter possibility.
View details for Web of Science ID A1987J441600057
View details for PubMedID 3611088
N alpha-(4-Amino-4-deoxy-10-methylpteroyl)-N epsilon-(4-azido-5-[125I]iodosalicylyl)-L-lysine, a photoaffinity analogue of methotrexate, is only 2-fold less potent than methotrexate in the inhibition of murine L1210 dihydrofolate reductase. Irradiation of the enzyme in the presence of an equimolar concentration of the 125I-labeled analogue ultimately leads to an 8% incorporation of the photoprobe. A 100-fold molar excess of methotrexate essentially blocks this incorporation. Cyanogen bromide digestion of the labeled enzyme, followed by high-pressure liquid chromatography purification of the generated peptides, indicates that greater than 85% of the total radioactivity is incorporated into a single cyanogen bromide peptide. Sequence analysis revealed this peptide to be residues 53-111, with a majority of the radioactivity centered around residues 63-65 (Lys-Asn-Arg). These data demonstrate that the photoaffinity analogue specifically binds to dihydrofolate reductase and covalently modifies the enzyme following irradiation and is therefore a photolabeling agent useful for probing the inhibitor binding domain of the enzyme.
View details for Web of Science ID A1987J480800023
View details for PubMedID 3663623
View details for Web of Science ID 000314138700072
View details for Web of Science ID A1990BT39X00020