Bio

Academic Appointments


Honors & Awards


  • Career Award at the Scientific Interface, Burroughs Wellcome Fund (2008-present)

Professional Education


  • Ph.D., Stanford University, Physics (2005)
  • M.Res., University College London, Biocomplexity (2000)
  • M.Phil., Cambridge University, Physics (Radio Astronomy) (1999)
  • A.B., s.c.l., Harvard University, Physics (1998)

Research & Scholarship

Current Research and Scholarly Interests


We strive for a predictive understanding of how biopolymer sequences code for biopolymer structures, with an initial focus on RNA.

Our research follows three tracks:

First, we are exploring new ab initio algorithms to predict the structures and energetics of RNAs and proteins at high resolution, with an initial focus on the smallest such puzzles. We test and apply these ideas through community-wide blind trials; by fixing crystallographic models; and by solving structures with sparse chemical mapping and NMR data.

Second, we are developing information-rich biochemical methods to solve the myriad structures of noncoding RNAs that remain unknown. Current efforts focus on applying these experimental methods to basic mysteries in RNA behavior, including the extent of RNA structure inside cells and viruses.

Third, we are integrating high-throughput biochemistry with a 100,000-player on-line game called Eterna. This project is revealing missing rules in RNA folding and design, and it is engineering RNA devices for cellular control and computing. As the first instantiation of 'cloud biochemistry', Eterna empowers expert and citizen scientists to collaboratively solve fundamental biochemical problems on-line with rapid experimental certification.

Overall, our work aims to bring us a future in which coding living systems with RNA is as agile and pervasive as coding conventional computers with programming languages.

Teaching

2013-14 Courses


Postdoctoral Advisees


Graduate and Fellowship Programs


Publications

Journal Articles


  • RNA design rules from a massive open laboratory PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Lee, J., Kladwang, W., Lee, M., Cantu, D., Azizyan, M., Kim, H., Limpaecher, A., Yoon, S., Treuille, A., Das, R. 2014; 111 (6): 2122-2127

    Abstract

    Self-assembling RNA molecules present compelling substrates for the rational interrogation and control of living systems. However, imperfect in silico models--even at the secondary structure level--hinder the design of new RNAs that function properly when synthesized. Here, we present a unique and potentially general approach to such empirical problems: the Massive Open Laboratory. The EteRNA project connects 37,000 enthusiasts to RNA design puzzles through an online interface. Uniquely, EteRNA participants not only manipulate simulated molecules but also control a remote experimental pipeline for high-throughput RNA synthesis and structure mapping. We show herein that the EteRNA community leveraged dozens of cycles of continuous wet laboratory feedback to learn strategies for solving in vitro RNA design problems on which automated methods fail. The top strategies--including several previously unrecognized negative design rules--were distilled by machine learning into an algorithm, EteRNABot. Over a rigorous 1-y testing phase, both the EteRNA community and EteRNABot significantly outperformed prior algorithms in a dozen RNA secondary structure design tests, including the creation of dendrimer-like structures and scaffolds for small molecule sensors. These results show that an online community can carry out large-scale experiments, hypothesis generation, and algorithm design to create practical advances in empirical science.

    View details for DOI 10.1073/pnas.1313039111

    View details for Web of Science ID 000330999600027

    View details for PubMedID 24469816

  • Challenging the state of the art in protein structure prediction: Highlights of experimental target structures for the 10th Critical Assessment of Techniques for Protein Structure Prediction Experiment CASP10 PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS Kryshtafovych, A., Moult, J., Bales, P., Bazan, J. F., Biasini, M., Burgin, A., Chen, C., Cochran, F. V., Craig, T. K., Das, R., Fass, D., Garcia-Doval, C., Herzberg, O., Lorimer, D., Luecke, H., Ma, X., Nelson, D. C., Van Raaij, M. J., Rohwer, F., Segall, A., Seguritan, V., Zeth, K., Schwede, T. 2014; 82: 26-42

    Abstract

    For the last two decades, CASP has assessed the state of the art in techniques for protein structure prediction and identified areas which required further development. CASP would not have been possible without the prediction targets provided by the experimental structural biology community. In the latest experiment, CASP10, more than 100 structures were suggested as prediction targets, some of which appeared to be extraordinarily difficult for modeling. In this article, authors of some of the most challenging targets discuss which specific scientific question motivated the experimental structure determination of the target protein, which structural features were especially interesting from a structural or functional perspective, and to what extent these features were correctly reproduced in the predictions submitted to CASP10. Specifically, the following targets will be presented: the acid-gated urea channel, a difficult to predict transmembrane protein from the important human pathogen Helicobacter pylori; the structure of human interleukin (IL)-34, a recently discovered helical cytokine; the structure of a functionally uncharacterized enzyme OrfY from Thermoproteus tenax formed by a gene duplication and a novel fold; an ORFan domain of mimivirus sulfhydryl oxidase R596; the fiber protein gene product 17 from bacteriophage T7; the bacteriophage CBA-120 tailspike protein; a virus coat protein from metagenomic samples of the marine environment; and finally, an unprecedented class of structure prediction targets based on engineered disulfide-rich small proteins.

    View details for Web of Science ID 000331147900004

    View details for PubMedID 24318984

  • The Mutate-and-Map Protocol for Inferring Base Pairs in Structured RNA. Methods in molecular biology (Clifton, N.J.) Cordero, P., Kladwang, W., VanLang, C. C., Das, R. 2014; 1086: 53-77

    Abstract

    Chemical mapping is a widespread technique for structural analysis of nucleic acids in which a molecule's reactivity to different probes is quantified at single nucleotide resolution and used to constrain structural modeling. This experimental framework has been extensively revisited in the past decade with new strategies for high-throughput readouts, chemical modification, and rapid data analysis. Recently, we have coupled the technique to high-throughput mutagenesis. Point mutations of a base paired nucleotide can lead to exposure of not only that nucleotide but also its interaction partner. Systematically carrying out the mutation and mapping for the entire system gives an experimental approximation of the molecule's "contact map." Here, we give our in-house protocol for this "mutate-and-map" (M2) strategy, based on 96-well capillary electrophoresis, and we provide practical tips on interpreting the data to infer nucleic acid structure.

    View details for DOI 10.1007/978-1-62703-667-2_4

    View details for PubMedID 24136598

  • Massively Parallel RNA Chemical Mapping with a Reduced Bias MAP-Seq Protocol. Methods in molecular biology (Clifton, N.J.) Seetin, M. G., Kladwang, W., Bida, J. P., Das, R. 2014; 1086: 95-117

    Abstract

    Chemical mapping methods probe RNA structure by revealing and leveraging correlations of a nucleotide's structural accessibility or flexibility with its reactivity to various chemical probes. Pioneering work by Lucks and colleagues has expanded this method to probe hundreds of molecules at once on an Illumina sequencing platform, obviating the use of slab gels or capillary electrophoresis on one molecule at a time. Here, we describe optimizations to this method from our lab, resulting in the MAP-seq protocol (Multiplexed Accessibility Probing read out through sequencing), version 1.0. The protocol permits the quantitative probing of thousands of RNAs at once, by several chemical modification reagents, on the time scale of a day using a tabletop Illumina machine. This method and a software package MAPseeker (http://simtk.org/home/map_seeker) address several potential sources of bias, by eliminating PCR steps, improving ligation efficiencies of ssDNA adapters, and avoiding problematic heuristics in prior algorithms. We hope that the step-by-step description of MAP-seq 1.0 will help other RNA mapping laboratories to transition from electrophoretic to next-generation sequencing methods and to further reduce the turnaround time and any remaining biases of the protocol.

    View details for DOI 10.1007/978-1-62703-667-2_6

    View details for PubMedID 24136600

  • Atomic-Accuracy Prediction of Protein Loop Structures through an RNA-Inspired Ansatz PLOS ONE Das, R. 2013; 8 (10)

    Abstract

    Consistently predicting biopolymer structure at atomic resolution from sequence alone remains a difficult problem, even for small sub-segments of large proteins. Such loop prediction challenges, which arise frequently in comparative modeling and protein design, can become intractable as loop lengths exceed 10 residues and if surrounding side-chain conformations are erased. Current approaches, such as the protein local optimization protocol or kinematic inversion closure (KIC) Monte Carlo, involve stages that coarse-grain proteins, simplifying modeling but precluding a systematic search of all-atom configurations. This article introduces an alternative modeling strategy based on a 'stepwise ansatz', recently developed for RNA modeling, which posits that any realistic all-atom molecular conformation can be built up by residue-by-residue stepwise enumeration. When harnessed to a dynamic-programming-like recursion in the Rosetta framework, the resulting stepwise assembly (SWA) protocol enables enumerative sampling of a 12 residue loop at a significant but achievable cost of thousands of CPU-hours. In a previously established benchmark, SWA recovers crystallographic conformations with sub-Angstrom accuracy for 19 of 20 loops, compared to 14 of 20 by KIC modeling with a comparable expenditure of computational power. Furthermore, SWA gives high accuracy results on an additional set of 15 loops highlighted in the biological literature for their irregularity or unusual length. Successes include cis-Pro touch turns, loops that pass through tunnels of other side-chains, and loops of lengths up to 24 residues. Remaining problem cases are traced to inaccuracies in the Rosetta all-atom energy function. In five additional blind tests, SWA achieves sub-Angstrom accuracy models, including the first such success in a protein/RNA binding interface, the YbxF/kink-turn interaction in the fourth 'RNA-puzzle' competition. 
    These results establish all-atom enumeration as an unusually systematic approach to ab initio protein structure modeling that can leverage high performance computing and physically realistic energy functions to more consistently achieve atomic accuracy.

    View details for DOI 10.1371/journal.pone.0074830

    View details for Web of Science ID 000326032600003

    View details for PubMedID 24204571

  • Adding Diverse Noncanonical Backbones to Rosetta: Enabling Peptidomimetic Design PLOS ONE Drew, K., Renfrew, P. D., Craven, T. W., Butterfoss, G. L., Chou, F., Lyskov, S., Bullock, B. N., Watkins, A., Labonte, J. W., Pacella, M., Kilambi, K. P., Leaver-Fay, A., Kuhlman, B., Gray, J. J., Bradley, P., Kirshenbaum, K., Arora, P. S., Das, R., Bonneau, R. 2013; 8 (7)

    Abstract

    Peptidomimetics are classes of molecules that mimic structural and functional attributes of polypeptides. Peptidomimetic oligomers can frequently be synthesized using efficient solid phase synthesis procedures similar to peptide synthesis. Conformationally ordered peptidomimetic oligomers are finding broad applications for molecular recognition and for inhibiting protein-protein interactions. One critical limitation is the limited set of design tools for identifying oligomer sequences that can adopt desired conformations. Here, we present expansions to the ROSETTA platform that enable structure prediction and design of five non-peptidic oligomer scaffolds (noncanonical backbones), oligooxopiperazines, oligo-peptoids, β-peptides, hydrogen bond surrogate helices and oligosaccharides. This work is complementary to prior additions to model noncanonical protein side chains in ROSETTA. The main purpose of our manuscript is to give a detailed description to current and future developers of how each of these noncanonical backbones was implemented. Furthermore, we provide a general outline for implementation of new backbone types not discussed here. To illustrate the utility of this approach, we describe the first tests of the ROSETTA molecular mechanics energy function in the context of oligooxopiperazines, using quantum mechanical calculations as comparison points, scanning through backbone and side chain torsion angles for a model peptidomimetic. Finally, as an example of a novel design application, we describe the automated design of an oligooxopiperazine that inhibits the p53-MDM2 protein-protein interaction. For the general biological and bioengineering community, several noncanonical backbones have been incorporated into web applications that allow users to freely and rapidly test the presented protocols (http://rosie.rosettacommons.org).
    This work helps address the peptidomimetic community's need for an automated and expandable modeling tool for noncanonical backbones.

    View details for DOI 10.1371/journal.pone.0067051

    View details for Web of Science ID 000323110600005

    View details for PubMedID 23869206

  • HiTRACE-Web: an online tool for robust analysis of high-throughput capillary electrophoresis NUCLEIC ACIDS RESEARCH Kim, H., Cordero, P., Das, R., Yoon, S. 2013; 41 (W1): W492-W498

    View details for DOI 10.1093/nar/gkt501

    View details for Web of Science ID 000323603200079

  • Serverification of Molecular Modeling Applications: The Rosetta Online Server That Includes Everyone (ROSIE) PLOS ONE Lyskov, S., Chou, F., Conchuir, S. O., Der, B. S., Drew, K., Kuroda, D., Xu, J., Weitzner, B. D., Renfrew, P. D., Sripakdeevong, P., Borgo, B., Havranek, J. J., Kuhlman, B., Kortemme, T., Bonneau, R., Gray, J. J., Das, R. 2013; 8 (5)

    Abstract

    The Rosetta molecular modeling software package provides experimentally tested and rapidly evolving tools for the 3D structure prediction and high-resolution design of proteins, nucleic acids, and a growing number of non-natural polymers. Despite its free availability to academic users and improving documentation, use of Rosetta has largely remained confined to developers and their immediate collaborators due to the code's difficulty of use, the requirement for large computational resources, and the unavailability of servers for most of the Rosetta applications. Here, we present a unified web framework for Rosetta applications called ROSIE (Rosetta Online Server that Includes Everyone). ROSIE provides (a) a common user interface for Rosetta protocols, (b) a stable application programming interface for developers to add additional protocols, (c) a flexible back-end to allow leveraging of computer cluster resources shared by RosettaCommons member institutions, and (d) centralized administration by the RosettaCommons to ensure continuous maintenance. This paper describes the ROSIE server infrastructure, a step-by-step 'serverification' protocol for use by Rosetta developers, and the deployment of the first nine ROSIE applications by six separate developer teams: Docking, RNA de novo, ERRASER, Antibody, Sequence Tolerance, Supercharge, Beta peptide design, NCBB design, and VIP redesign. As illustrated by the number and diversity of these applications, ROSIE offers a general and speedy paradigm for serverification of Rosetta applications that incurs negligible cost to developers and lowers barriers to Rosetta use for the broader biological community. ROSIE is available at http://rosie.rosettacommons.org.

    View details for DOI 10.1371/journal.pone.0063906

    View details for Web of Science ID 000320362700078

    View details for PubMedID 23717507

  • Remodeling a beta-peptide bundle CHEMICAL SCIENCE Molski, M. A., Goodman, J. L., Chou, F., Baker, D., Das, R., Schepartz, A. 2013; 4 (1): 319-324

    View details for DOI 10.1039/c2sc21117c

    View details for Web of Science ID 000311971500036

  • Correcting pervasive errors in RNA crystallography through enumerative structure prediction NATURE METHODS Chou, F., Sripakdeevong, P., Dibrov, S. M., Hermann, T., Das, R. 2013; 10 (1): 74-U105

    Abstract

    Three-dimensional RNA models fitted into crystallographic density maps exhibit pervasive conformational ambiguities, geometric errors and steric clashes. To address these problems, we present enumerative real-space refinement assisted by electron density under Rosetta (ERRASER), coupled to Python-based hierarchical environment for integrated 'xtallography' (PHENIX) diffraction-based refinement. On 24 data sets, ERRASER automatically corrects the majority of MolProbity-assessed errors, improves the average R(free) factor, resolves functionally important discrepancies in noncanonical structure and refines low-resolution models to better match higher-resolution models.

    View details for DOI 10.1038/NMETH.2262

    View details for Web of Science ID 000312810100041

    View details for PubMedID 23202432

  • Advances, Interactions, and Future Developments in the CNS, Phenix, and Rosetta Structural Biology Software Systems ANNUAL REVIEW OF BIOPHYSICS, VOL 42 Adams, P. D., Baker, D., Brunger, A. T., Das, R., DiMaio, F., Read, R. J., Richardson, D. C., Richardson, J. S., Terwilliger, T. C. 2013; 42: 265-287

    Abstract

    Advances in our understanding of macromolecular structure come from experimental methods, such as X-ray crystallography, and also computational analysis of the growing number of atomic models obtained from such experiments. The latter analyses have made it possible to develop powerful tools for structure prediction and optimization in the absence of experimental data. In recent years, a synergy between these computational methods for crystallographic structure determination and structure prediction and optimization has begun to be exploited. We review some of the advances in the algorithms used for crystallographic structure determination in the Phenix and Crystallography & NMR System software packages and describe how methods from ab initio structure prediction and refinement in Rosetta have been applied to challenging crystallographic problems. The prospects for future improvement of these methods are discussed.

    View details for DOI 10.1146/annurev-biophys-083012-130253

    View details for Web of Science ID 000321695700013

    View details for PubMedID 23451892

  • An RNA Mapping DataBase for curating RNA structure mapping experiments BIOINFORMATICS Cordero, P., Lucks, J. B., Das, R. 2012; 28 (22): 3006-3008

    Abstract

    We have established an RNA mapping database (RMDB) to enable structural, thermodynamic and kinetic comparisons across single-nucleotide-resolution RNA structure mapping experiments. The volume of structure mapping data has greatly increased since the development of high-throughput sequencing techniques, accelerated software pipelines and large-scale mutagenesis. For scientists wishing to infer relationships between RNA sequence/structure and these mapping data, there is a need for a database that is curated, tagged with error estimates and interfaced with tools for sharing, visualization, search and meta-analysis. Through its on-line front-end, the RMDB allows users to explore single-nucleotide-resolution mapping data in heat-map, bar-graph and colored secondary structure graphics; to leverage these data to generate secondary structure hypotheses; and to download the data in standardized and computer-friendly files, including the RDAT and community-consensus SNRNASM formats. At the time of writing, the database houses 53 entries, describing more than 2848 experiments of 1098 RNA constructs in several solution conditions and is growing rapidly. Freely available on the web at http://rmdb.stanford.edu. Contact: rhiju@stanford.edu. Supplementary data are available at Bioinformatics Online.

    View details for DOI 10.1093/bioinformatics/bts554

    View details for Web of Science ID 000311303500028

    View details for PubMedID 22976082

  • Quantitative Dimethyl Sulfate Mapping for Automated RNA Secondary Structure Inference BIOCHEMISTRY Cordero, P., Kladwang, W., VanLang, C. C., Das, R. 2012; 51 (36): 7037-7039

    Abstract

    For decades, dimethyl sulfate (DMS) mapping has informed manual modeling of RNA structure in vitro and in vivo. Here, we incorporate DMS data into automated secondary structure inference using an energy minimization framework developed for 2'-OH acylation (SHAPE) mapping. On six noncoding RNAs with crystallographic models, DMS-guided modeling achieves overall false negative and false discovery rates of 9.5% and 11.6%, respectively, comparable to or better than those of SHAPE-guided modeling, and bootstrapping provides straightforward confidence estimates. Integrating DMS-SHAPE data and including 1-cyclohexyl(2-morpholinoethyl) carbodiimide metho-p-toluene sulfonate (CMCT) reactivities provide small additional improvements. These results establish DMS mapping, an already routine technique, as a quantitative tool for unbiased RNA secondary structure modeling.

    View details for DOI 10.1021/bi3008802

    View details for Web of Science ID 000308833500001

    View details for PubMedID 22913637

  • Squaring theory with practice in RNA design CURRENT OPINION IN STRUCTURAL BIOLOGY Bida, J. P., Das, R. 2012; 22 (4): 457-466

    Abstract

    Ribonucleic acid (RNA) design offers unique opportunities for engineering genetic networks and nanostructures that self-assemble within living cells. Recent years have seen the creation of increasingly complex RNA devices, including proof-of-concept applications for in vivo three-dimensional scaffolding, imaging, computing, and control of biological behaviors. Expert intuition and simple design rules--the stability of double helices, the modularity of noncanonical RNA motifs, and geometric closure--have enabled these successful applications. Going beyond heuristics, emerging algorithms may enable automated design of RNAs with nucleotide-level accuracy but, as illustrated on a recent RNA square design, are not yet fully predictive. Looking ahead, technological advances in RNA synthesis and interrogation are poised to radically accelerate the discovery and stringent testing of design methods.

    View details for DOI 10.1016/j.sbi.2012.06.003

    View details for Web of Science ID 000308516800009

    View details for PubMedID 22832174

  • Ultraviolet Shadowing of RNA Can Cause Significant Chemical Damage in Seconds SCIENTIFIC REPORTS Kladwang, W., Hum, J., Das, R. 2012; 2

    Abstract

    Chemical purity of RNA samples is important for high-precision studies of RNA folding and catalytic behavior, but photodamage accrued during ultraviolet (UV) shadowing steps of sample preparation can reduce this purity. Here, we report the quantitation of UV-induced damage by using reverse transcription and single-nucleotide-resolution capillary electrophoresis. We found photolesions in a dozen natural and artificial RNAs; across multiple sequence contexts, dominantly at but not limited to pyrimidine doublets; and from multiple lamps recommended for UV shadowing. Irradiation time-courses revealed detectable damage within a few seconds of exposure for 254 nm lamps held at a distance of 5 to 10 cm from 0.5-mm thickness gels. Under these conditions, 200-nucleotide RNAs subjected to 20 seconds of UV shadowing incurred damage to 16-27% of molecules; and, due to a 'skin effect', the molecule-by-molecule distribution of lesions gave 4-fold higher variance than a Poisson distribution. Thicker gels, longer wavelength lamps, and shorter exposure times reduced but did not eliminate damage. These results suggest that RNA biophysical studies should report precautions taken to avoid artifactual heterogeneity from UV shadowing.

    View details for DOI 10.1038/srep00517

    View details for Web of Science ID 000306707600001

    View details for PubMedID 22816040

  • Metal-ion rescue revisited: Biochemical detection of site-bound metal ions important for RNA folding RNA-A PUBLICATION OF THE RNA SOCIETY Frederiksen, J. K., Li, N., Das, R., Herschlag, D., Piccirilli, J. A. 2012; 18 (6): 1123-1141

    Abstract

    Within the three-dimensional architectures of RNA molecules, divalent metal ions populate specific locations, shedding their water molecules to form chelates. These interactions help the RNA adopt and maintain specific conformations and frequently make essential contributions to function. Defining the locations of these site-bound metal ions remains challenging despite the growing database of RNA structures. Metal-ion rescue experiments have provided a powerful approach to identify and distinguish catalytic metal ions within RNA active sites, but the ability of such experiments to identify metal ions that contribute to tertiary structure acquisition and structural stability is less developed and has been challenged. Herein, we use the well-defined P4-P6 RNA domain of the Tetrahymena group I intron to reevaluate prior evidence against the discriminatory power of metal-ion rescue experiments and to advance thermodynamic descriptions necessary for interpreting these experiments. The approach successfully identifies ligands within the RNA that occupy the inner coordination sphere of divalent metal ions and distinguishes them from ligands that occupy the outer coordination sphere. Our results underscore the importance of obtaining complete folding isotherms and establishing and evaluating thermodynamic models in order to draw conclusions from metal-ion rescue experiments. These results establish metal-ion rescue as a rigorous tool for identifying and dissecting energetically important metal-ion interactions in RNAs that are noncatalytic but critical for RNA tertiary structure.

    View details for DOI 10.1261/rna.028738.111

    View details for Web of Science ID 000304423000003

    View details for PubMedID 22539523

  • RNA-Puzzles: A CASP-like evaluation of RNA three-dimensional structure prediction RNA-A PUBLICATION OF THE RNA SOCIETY Cruz, J. A., Blanchet, M., Boniecki, M., Bujnicki, J. M., Chen, S., Cao, S., Das, R., Ding, F., Dokholyan, N. V., Flores, S. C., Huang, L., Lavender, C. A., Lisi, V., Major, F., Mikolajczak, K., Patel, D. J., Philips, A., Puton, T., Santalucia, J., Sijenyi, F., Hermann, T., Rother, K., Rother, M., Serganov, A., Skorupski, M., Soltysinski, T., Sripakdeevong, P., Tuszynska, I., Weeks, K. M., Waldsich, C., Wildauer, M., Leontis, N. B., Westhof, E. 2012; 18 (4): 610-625

    Abstract

    We report the results of a first, collective, blind experiment in RNA three-dimensional (3D) structure prediction, encompassing three prediction puzzles. The goals are to assess the leading edge of RNA structure prediction techniques; compare existing methods and tools; and evaluate their relative strengths, weaknesses, and limitations in terms of sequence length and structural complexity. The results should give potential users insight into the suitability of available methods for different applications and facilitate efforts in the RNA structure prediction community in ongoing efforts to improve prediction tools. We also report the creation of an automated evaluation pipeline to facilitate the analysis of future RNA structure prediction exercises.

    View details for DOI 10.1261/rna.031054.111

    View details for Web of Science ID 000301954600002

    View details for PubMedID 22361291

  • Automated RNA Structure Prediction Uncovers a Kink-Turn Linker in Double Glycine Riboswitches JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Kladwang, W., Chou, F., Das, R. 2012; 134 (3): 1404-1407

    Abstract

    The tertiary structures of functional RNA molecules remain difficult to decipher. A new generation of automated RNA structure prediction methods may help address these challenges but have not yet been experimentally validated. Here we apply four prediction tools to a class of double glycine riboswitches that can bind two ligands cooperatively. A novel method (BPPalign), RMdetect, JAR3D, and Rosetta 3D modeling give consistent predictions for a new stem P0 and a kink-turn motif. These elements structure the linker between the RNAs' double aptamers. Chemical mapping on the Fusobacterium nucleatum riboswitch with N-methylisatoic anhydride, dimethyl sulfate and 1-cyclohexyl-3-(2-morpholinoethyl)carbodiimide metho-p-toluenesulfonate probing, mutate-and-map studies, and mutation/rescue experiments all provide strong evidence for the structured linker. Under solution conditions that permit rigorous thermodynamic analysis, disrupting this helix-junction-helix structure gives 120- and 6-30-fold poorer dissociation constants for the RNA's two glycine-binding transitions, corresponding to an overall energetic impact of 4.3 ± 0.5 kcal/mol. Prior biochemical and crystallography studies did not include this critical element due to over-truncation of the RNA. We speculate that several further undiscovered elements are likely to exist in the flanking regions of this and other functional RNAs, and automated prediction tools can play a useful role in their detection and dissection.

    View details for DOI 10.1021/ja2093508

    View details for Web of Science ID 000301084400005

    View details for PubMedID 22192063

  • An enumerative stepwise ansatz enables atomic-accuracy RNA loop modeling PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Sripakdeevong, P., Kladwang, W., Das, R. 2011; 108 (51): 20573-20578

    Abstract

    Atomic-accuracy structure prediction of macromolecules should be achievable by optimizing a physically realistic energy function but is presently precluded by incomplete sampling of a biopolymer's many degrees of freedom. We present herein a working hypothesis, called the "stepwise ansatz," for recursively constructing well-packed atomic-detail models in small steps, enumerating several million conformations for each monomer, and covering all build-up paths. By making use of high-performance computing and the Rosetta framework, we provide first tests of this hypothesis on a benchmark of 15 RNA loop-modeling problems drawn from riboswitches, ribozymes, and the ribosome, including 10 cases that are not solvable by current knowledge-based modeling approaches. For each loop problem, this deterministic stepwise assembly method either reaches atomic accuracy or exposes flaws in Rosetta's all-atom energy function, indicating the resolution of the conformational sampling bottleneck. As a further rigorous test, we have carried out a blind all-atom prediction for a noncanonical RNA motif, the C7.2 tetraloop/receptor, and validated this model through nucleotide-resolution chemical mapping experiments. Stepwise assembly is an enumerative, ab initio build-up method that systematically outperforms existing Monte Carlo and knowledge-based methods for 3D structure prediction.

    View details for DOI 10.1073/pnas.1106516108

    View details for Web of Science ID 000298289400065

    View details for PubMedID 22143768

  • A two-dimensional mutate-and-map strategy for non-coding RNA structure NATURE CHEMISTRY Kladwang, W., VanLang, C. C., Cordero, P., Das, R. 2011; 3 (12): 954-962

    Abstract

    Non-coding RNAs fold into precise base-pairing patterns to carry out critical roles in genetic regulation and protein synthesis, but determining RNA structure remains difficult. Here, we show that coupling systematic mutagenesis with high-throughput chemical mapping enables accurate base-pair inference of domains from ribosomal RNA, ribozymes and riboswitches. For a six-RNA benchmark that has challenged previous chemical/computational methods, this 'mutate-and-map' strategy gives secondary structures that are in agreement with crystallography (helix error rates, 2%), including a blind test on a double-glycine riboswitch. Through modelling of partially ordered states, the method enables the first test of an interdomain helix-swap hypothesis for ligand-binding cooperativity in a glycine riboswitch. Finally, the data report on tertiary contacts within non-coding RNAs, and coupling to the Rosetta/FARFAR algorithm gives nucleotide-resolution three-dimensional models (helix root-mean-squared deviation, 5.7 Å) of an adenine riboswitch. These results establish a promising two-dimensional chemical strategy for inferring the secondary and tertiary structures that underlie non-coding RNA behaviour.

    View details for DOI 10.1038/NCHEM.1176

    View details for Web of Science ID 000297685800014

    View details for PubMedID 22109276

  • Understanding the Errors of SHAPE-Directed RNA Structure Modeling BIOCHEMISTRY Kladwang, W., VanLang, C. C., Cordero, P., Das, R. 2011; 50 (37): 8049-8056

    Abstract

    Single-nucleotide-resolution chemical mapping for structured RNA is being rapidly advanced by new chemistries, faster readouts, and coupling to computational algorithms. Recent tests have shown that selective 2'-hydroxyl acylation by primer extension (SHAPE) can give near-zero error rates (0-2%) in modeling the helices of RNA secondary structure. Here, we benchmark the method using six molecules for which crystallographic data are available: tRNA(phe) and 5S rRNA from Escherichia coli, the P4-P6 domain of the Tetrahymena group I ribozyme, and ligand-bound domains from riboswitches for adenine, cyclic di-GMP, and glycine. SHAPE-directed modeling of these highly structured RNAs gave an overall false negative rate (FNR) of 17% and a false discovery rate (FDR) of 21%, with at least one helix prediction error in five of the six cases. Extensive variations of data processing, normalization, and modeling parameters did not significantly mitigate modeling errors. Only one variation, filtering out data collected with deoxyinosine triphosphate during primer extension, gave a modest improvement (FNR = 12%, and FDR = 14%). The residual structure modeling errors are explained by the insufficient information content of these RNAs' SHAPE data, as evaluated by a nonparametric bootstrapping analysis. Beyond these benchmark cases, bootstrapping suggests a low level of confidence (<50%) in the majority of helices in a previously proposed SHAPE-directed model for the HIV-1 RNA genome. Thus, SHAPE-directed RNA modeling is not always unambiguous, and helix-by-helix confidence estimates, as described herein, may be critical for interpreting results from this powerful methodology.

    View details for DOI 10.1021/bi200524n

    View details for Web of Science ID 000294791100021

    View details for PubMedID 21842868
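
    The helix-level error rates named in the abstract above follow the usual definitions: FNR is the fraction of reference helices missing from the model, and FDR is the fraction of modeled helices absent from the reference. A minimal sketch, assuming helices are compared by exact label match (the paper's actual matching criterion may differ) and using hypothetical helix names rather than data from the study:

```python
def helix_error_rates(reference, predicted):
    """Return (FNR, FDR) for two collections of helix labels."""
    reference, predicted = set(reference), set(predicted)
    true_pos = len(reference & predicted)
    # FNR: reference helices that the model failed to recover.
    fnr = 1 - true_pos / len(reference) if reference else 0.0
    # FDR: modeled helices that are not in the reference.
    fdr = 1 - true_pos / len(predicted) if predicted else 0.0
    return fnr, fdr


# Hypothetical labels for illustration only.
crystal = {"P1", "P2", "P3", "P4"}
model = {"P1", "P2", "P3", "alt1"}
fnr, fdr = helix_error_rates(crystal, model)
print(f"FNR = {fnr:.0%}, FDR = {fdr:.0%}")
```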

  • Quantitative comparison of villin headpiece subdomain simulations and triplet-triplet energy transfer experiments PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Beauchamp, K. A., Ensign, D. L., Das, R., Pande, V. S. 2011; 108 (31): 12734-12739

    Abstract

    As the fastest folding protein, the villin headpiece (HP35) serves as an important bridge between simulation and experimental studies of protein folding. Despite the simplicity of this system, experiments continue to reveal a number of surprises, including structure in the unfolded state and complex equilibrium dynamics near the native state. Using 2.5 ms of molecular dynamics and Markov state models, we connect to current experimental results in three ways. First, we present and validate a novel method for the quantitative prediction of triplet-triplet energy transfer experiments. Second, we construct a many-state model for HP35 that is consistent with previous experiments. Finally, we predict contact-formation time traces for all 1,225 possible triplet-triplet energy transfer experiments on HP35.

    View details for DOI 10.1073/pnas.1010880108

    View details for Web of Science ID 000293385700043

    View details for PubMedID 21768345

  • HiTRACE: high-throughput robust analysis for capillary electrophoresis BIOINFORMATICS Yoon, S., Kim, J., Hum, J., Kim, H., Park, S., Kladwang, W., Das, R. 2011; 27 (13): 1798-1805

    Abstract

    Capillary electrophoresis (CE) of nucleic acids is a workhorse technology underlying high-throughput genome analysis and large-scale chemical mapping for nucleic acid structural inference. Despite the wide availability of CE-based instruments, there remain challenges in leveraging their full power for quantitative analysis of RNA and DNA structure, thermodynamics and kinetics. In particular, the slow rate and poor automation of available analysis tools have bottlenecked a new generation of studies involving hundreds of CE profiles per experiment. We propose a computational method called high-throughput robust analysis for capillary electrophoresis (HiTRACE) to automate the key tasks in large-scale nucleic acid CE analysis, including the profile alignment that has heretofore been a rate-limiting step in the highest throughput experiments. We illustrate the application of HiTRACE on 13 datasets representing 4 different RNAs, 3 chemical modification strategies and up to 480 single mutant variants; the largest datasets each include 87,360 bands. By applying a series of robust dynamic programming algorithms, HiTRACE outperforms prior tools in terms of alignment and fitting quality, as assessed by measures including the correlation between quantified band intensities between replicate datasets. Furthermore, while the smallest of these datasets required 7-10 h of manual intervention using prior approaches, HiTRACE quantitation of even the largest datasets herein was achieved in 3-12 min. The HiTRACE method, therefore, resolves a critical barrier to the efficient and accurate analysis of nucleic acid structure in experiments involving tens of thousands of electrophoretic bands.

    View details for DOI 10.1093/bioinformatics/btr277

    View details for Web of Science ID 000291752600058

    View details for PubMedID 21561922

  • Sharing and archiving nucleic acid structure mapping data RNA-A PUBLICATION OF THE RNA SOCIETY Rocca-Serra, P., Bellaousov, S., Birmingham, A., Chen, C., Cordero, P., Das, R., Davis-Neulander, L., Duncan, C. D., Halvorsen, M., Knight, R., Leontis, N. B., Mathews, D. H., Ritz, J., Stombaugh, J., Weeks, K. M., Zirbel, C. L., Laederach, A. 2011; 17 (7): 1204-1212

    Abstract

    Nucleic acids are particularly amenable to structural characterization using chemical and enzymatic probes. Each individual structure mapping experiment reveals specific information about the structure and/or dynamics of the nucleic acid. Currently, there is no simple approach for making these data publicly available in a standardized format. We therefore developed a standard for reporting the results of single nucleotide resolution nucleic acid structure mapping experiments, or SNRNASMs. We propose a schema for sharing nucleic acid chemical probing data that uses generic public servers for storing, retrieving, and searching the data. We have also developed a consistent nomenclature (ontology) within the Ontology of Biomedical Investigations (OBI), which provides unique identifiers (termed persistent URLs, or PURLs) for classifying the data. Links to standardized data sets shared using our proposed format along with a tutorial and links to templates can be found at http://snrnasm.bio.unc.edu.

    View details for DOI 10.1261/rna.2753211

    View details for Web of Science ID 000291683500002

    View details for PubMedID 21610212

  • Four Small Puzzles That Rosetta Doesn't Solve PLOS ONE Das, R. 2011; 6 (5)

    Abstract

    A complete macromolecule modeling package must be able to solve the simplest structure prediction problems. Despite recent successes in high resolution structure modeling and design, the Rosetta software suite fares poorly on small protein and RNA puzzles, some as small as four residues. To illustrate these problems, this manuscript presents Rosetta results for four well-defined test cases: the 20-residue mini-protein Trp cage, an even smaller disulfide-stabilized conotoxin, the reactive loop of a serine protease inhibitor, and a UUCG RNA tetraloop. In contrast to previous Rosetta studies, several lines of evidence indicate that conformational sampling is not the major bottleneck in modeling these small systems. Instead, approximations and omissions in the Rosetta all-atom energy function currently preclude discriminating experimentally observed conformations from de novo models at atomic resolution. These molecular "puzzles" should serve as useful model systems for developers wishing to make foundational improvements to this powerful modeling suite.

    View details for DOI 10.1371/journal.pone.0020044

    View details for Web of Science ID 000290793400036

    View details for PubMedID 21625446

  • A mutate-and-map strategy accurately infers the base pairs of a 35-nucleotide model RNA RNA-A PUBLICATION OF THE RNA SOCIETY Kladwang, W., Cordero, P., Das, R. 2011; 17 (3): 522-534

    Abstract

    We present a rapid experimental strategy for inferring base pairs in structured RNAs via an information-rich extension of classic chemical mapping approaches. The mutate-and-map method, previously applied to a DNA/RNA helix, systematically searches for single mutations that enhance the chemical accessibility of base-pairing partners distant in sequence. To test this strategy for structured RNAs, we have carried out mutate-and-map measurements for a 35-nt hairpin, called the MedLoop RNA, embedded within an 80-nt sequence. We demonstrate the synthesis of all 105 single mutants of the MedLoop RNA sequence and present high-throughput DMS, CMCT, and SHAPE modification measurements for this library at single-nucleotide resolution. The resulting two-dimensional data reveal visually clear, punctate features corresponding to RNA base pair interactions as well as more complex features; these signals can be qualitatively rationalized by comparison to secondary structure predictions. Finally, we present an automated, sequence-blind analysis that permits the confident identification of nine of the 10 MedLoop RNA base pairs at single-nucleotide resolution, while discriminating against all 1460 false-positive base pairs. These results establish the accuracy and information content of the mutate-and-map strategy and support its feasibility for rapidly characterizing the base-pairing patterns of larger and more complex RNA systems.

    View details for DOI 10.1261/rna.2516311

    View details for Web of Science ID 000287195900014

    View details for PubMedID 21239468

  • ROSETTA3: AN OBJECT-ORIENTED SOFTWARE SUITE FOR THE SIMULATION AND DESIGN OF MACROMOLECULES METHODS IN ENZYMOLOGY, VOL 487: COMPUTER METHODS, PT C Leaver-Fay, A., Tyka, M., Lewis, S. M., Lange, O. F., Thompson, J., Jacak, R., Kaufman, K., Renfrew, P. D., Smith, C. A., Sheffler, W., Davis, I. W., Cooper, S., Treuille, A., Mandell, D. J., Richter, F., Ban, Y. A., Fleishman, S. J., Corn, J. E., Kim, D. E., Lyskov, S., Berrondo, M., Mentzer, S., Popovic, Z., Havranek, J. J., Karanicolas, J., Das, R., Meiler, J., Kortemme, T., Gray, J. J., Kuhlman, B., Baker, D., Bradley, P. 2011: 545-574

    Abstract

    We have recently completed a full re-architecturing of the ROSETTA molecular modeling program, generalizing and expanding its existing functionality. The new architecture enables the rapid prototyping of novel protocols by providing easy-to-use interfaces to powerful tools for molecular modeling. The source code of this rearchitecturing has been released as ROSETTA3 and is freely available for academic use. At the time of its release, it contained 470,000 lines of code. Counting currently unpublished protocols at the time of this writing, the source includes 1,285,000 lines. Its rapid growth is a testament to its ease of use. This chapter describes the requirements for our new architecture, justifies the design decisions, sketches out central classes, and highlights a few of the common tasks that the new software can perform.

    View details for DOI 10.1016/S0076-6879(11)87019-9

    View details for Web of Science ID 000286532000019

    View details for PubMedID 21187238

  • Rosetta in CAPRI rounds 13-19 PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS Fleishman, S. J., Corn, J. E., Strauch, E. M., Whitehead, T. A., Andre, I., Thompson, J., Havranek, J. J., Das, R., Bradley, P., Baker, D. 2010; 78 (15): 3212-3218

    Abstract

    Modeling the conformational changes that occur on binding of macromolecules is an unsolved challenge. In previous rounds of the Critical Assessment of PRediction of Interactions (CAPRI), it was demonstrated that the Rosetta approach to macromolecular modeling could capture side chain conformational changes on binding with high accuracy. In rounds 13-19 we tested the ability of various backbone remodeling strategies to capture the main-chain conformational changes observed during binding events. These approaches span a wide range of backbone motions, from limited refinement of loops to relieve clashes in homologous docking, through extensive remodeling of loop segments, to large-scale remodeling of RNA. Although the results are encouraging, major improvements in sampling and energy evaluation are clearly required for consistent high accuracy modeling. Analysis of our failures in the CAPRI challenges suggest that conformational sampling at the termini of exposed beta strands is a particularly pressing area for improvement.

    View details for DOI 10.1002/prot.22784

    View details for Web of Science ID 000283565000020

    View details for PubMedID 20597089

  • A Mutate-and-Map Strategy for Inferring Base Pairs in Structured Nucleic Acids: Proof of Concept on a DNA/RNA Helix BIOCHEMISTRY Kladwang, W., Das, R. 2010; 49 (35): 7414-7416

    Abstract

    We propose a rapid chemical strategy for identifying base pairs in structured nucleic acid systems. The approach goes beyond traditional chemical mapping approaches by monitoring perturbations of each residue's chemical accessibility in response to systematic mutagenesis of residues that are distant in sequence but nearby in three dimensions. As a proof of concept, we present high-throughput dimethyl sulfate accessibility data for a chimeric DNA/RNA system in which every possible sequence variation and deletion in a 20 bp region has been synthesized and tested. The data demonstrate that 88% of the system's base pairs can be robustly inferred, with A/A and T/C DNA/RNA mismatches giving the strongest signals. These results point to the feasibility of rapid base pair inference in larger and more complex nucleic acid systems with unknown structure.

    View details for DOI 10.1021/bi101123g

    View details for Web of Science ID 000281305200002

    View details for PubMedID 20677780

  • Atomic accuracy in predicting and designing noncanonical RNA structure NATURE METHODS Das, R., Karanicolas, J., Baker, D. 2010; 7 (4): 291-294

    Abstract

    We present fragment assembly of RNA with full-atom refinement (FARFAR), a Rosetta framework for predicting and designing noncanonical motifs that define RNA tertiary structure. In a test set of thirty-two 6-20-nucleotide motifs, FARFAR recapitulated 50% of the experimental structures at near-atomic accuracy. Sequence redesign calculations recovered native bases at 65% of residues engaged in noncanonical interactions, and we experimentally validated mutations predicted to stabilize a signal recognition particle domain.

    View details for DOI 10.1038/NMETH.1433

    View details for Web of Science ID 000276150600018

    View details for PubMedID 20190761

  • A robust peak detection method for RNA structure inference by high-throughput contact mapping BIOINFORMATICS Kim, J., Yu, S., Shim, B., Kim, H., Min, H., Chung, E., Das, R., Yoon, S. 2009; 25 (9): 1137-1144

    Abstract

    For high-throughput prediction of the helical arrangements of large RNA molecules, an innovative method termed multiplexed hydroxyl radical (·OH) cleavage analysis (MOHCA) has been proposed. A key step in this promising technique is to detect peaks accurately from noisy radioactivity profiles. Since manual peak finding is laborious and prone to error, an automated peak detection method to improve the accuracy and throughput of MOHCA is required. Existing methods were not applicable to MOHCA due to their high false positive rates. We developed a two-step computational method that can detect peaks from MOHCA profiles in a robust manner. The first step exploits an ensemble of linear and non-linear signal processing techniques to find true peak candidates. In the second step, a binary classifier trained with the characteristics of true and false peaks is used to eliminate false peaks out of the peak candidates. We tested the proposed approach with 2002 MOHCA cleavage profiles and obtained the median recall, precision and F-measure values of 0.917, 0.750 and 0.830, respectively. Compared with the alternatives considered, the proposed method was able to handle false peaks substantially better, thus resulting in 51.0-71.8% higher median values of precision and F-measure. The software and supplementary data are available at http://dna.korea.ac.kr/pub/mohca.

    View details for DOI 10.1093/bioinformatics/btp110

    View details for Web of Science ID 000265523300007

    View details for PubMedID 19246511

  • Remeasuring the double helix SCIENCE Mathew-Fenn, R. S., Das, R., Harbury, P. A. 2008; 322 (5900): 446-449

    Abstract

    DNA is thought to behave as a stiff elastic rod with respect to the ubiquitous mechanical deformations inherent to its biology. To test this model at short DNA lengths, we measured the mean and variance of end-to-end length for a series of DNA double helices in solution, using small-angle x-ray scattering interference between gold nanocrystal labels. In the absence of applied tension, DNA is at least one order of magnitude softer than measured by single-molecule stretching experiments. Further, the data rule out the conventional elastic rod model. The variance in end-to-end length follows a quadratic dependence on the number of base pairs rather than the expected linear dependence, indicating that DNA stretching is cooperative over more than two turns of the DNA double helix. Our observations support the idea of long-range allosteric communication through DNA structure.

    View details for DOI 10.1126/science.1158881

    View details for Web of Science ID 000260094500048

    View details for PubMedID 18927394

  • Structural inference of native and partially folded RNA by high-throughput contact mapping PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Das, R., Kudaravalli, M., Jonikas, M., Laederach, A., Fong, R., Schwans, J. P., Baker, D., Piccirilli, J. A., Altman, R. B., Herschlag, D. 2008; 105 (11): 4144-4149

    Abstract

    The biological behaviors of ribozymes, riboswitches, and numerous other functional RNA molecules are critically dependent on their tertiary folding and their ability to sample multiple functional states. The conformational heterogeneity and partially folded nature of most of these states has rendered their characterization by high-resolution structural approaches difficult or even intractable. Here we introduce a method to rapidly infer the tertiary helical arrangements of large RNA molecules in their native and non-native solution states. Multiplexed hydroxyl radical (·OH) cleavage analysis (MOHCA) enables the high-throughput detection of numerous pairs of contacting residues via random incorporation of radical cleavage agents followed by two-dimensional gel electrophoresis. We validated this technology by recapitulating the unfolded and native states of a well studied model RNA, the P4-P6 domain of the Tetrahymena ribozyme, at subhelical resolution. We then applied MOHCA to a recently discovered third state of the P4-P6 RNA that is stabilized by high concentrations of monovalent salt and whose partial order precludes conventional techniques for structure determination. The three-dimensional portrait of a compact, non-native RNA state reveals a well ordered subset of native tertiary contacts, in contrast to the dynamic but otherwise similar molten globule states of proteins. With its applicability to nearly any solution state, we expect MOHCA to be a powerful tool for illuminating the many functional structures of large RNA molecules and RNA/protein complexes.

    View details for DOI 10.1073/pnas.0709032105

    View details for Web of Science ID 000254263300015

    View details for PubMedID 18322008

  • Macromolecular modeling with Rosetta ANNUAL REVIEW OF BIOCHEMISTRY Das, R., Baker, D. 2008; 77: 363-382

    Abstract

    Advances over the past few years have begun to enable prediction and design of macromolecular structures at near-atomic accuracy. Progress has stemmed from the development of reasonably accurate and efficiently computed all-atom potential functions as well as effective conformational sampling strategies appropriate for searching a highly rugged energy landscape, both driven by feedback from structure prediction and design tests. A unified energetic and kinematic framework in the Rosetta program allows a wide range of molecular modeling problems, from fibril structure prediction to RNA folding to the design of new protein interfaces, to be readily investigated and highlights areas for improvement. The methodology enables the creation of novel molecules with useful functions and holds promise for accelerating experimental structural inference. Emerging connections to crystallographic phasing, NMR modeling, and lower-resolution approaches are described and critically assessed.

    View details for DOI 10.1146/annurev.biochem.77.062906.171838

    View details for Web of Science ID 000257596800016

    View details for PubMedID 18410248

  • High-resolution structure prediction and the crystallographic phase problem NATURE Qian, B., Raman, S., Das, R., Bradley, P., McCoy, A. J., Read, R. J., Baker, D. 2007; 450 (7167): 259-U7

    Abstract

    The energy-based refinement of low-resolution protein structure models to atomic-level accuracy is a major challenge for computational structural biology. Here we describe a new approach to refining protein structure models that focuses sampling in regions most likely to contain errors while allowing the whole structure to relax in a physically realistic all-atom force field. In applications to models produced using nuclear magnetic resonance data and to comparative models based on distant structural homologues, the method can significantly improve the accuracy of the structures in terms of both the backbone conformations and the placement of core side chains. Furthermore, the resulting models satisfy a particularly stringent test: they provide significantly better solutions to the X-ray crystallographic phase problem in molecular replacement trials. Finally, we show that all-atom refinement can produce de novo protein structure predictions that reach the high accuracy required for molecular replacement without any experimental phase information and in the absence of templates suitable for molecular replacement from the Protein Data Bank. These results suggest that the combination of high-resolution structure prediction with state-of-the-art phasing tools may be unexpectedly powerful in phasing crystallographic data for which molecular replacement is hindered by the absence of sufficiently accurate previous models.

    View details for DOI 10.1038/nature06249

    View details for Web of Science ID 000250746200052

    View details for PubMedID 17934447

  • Automated de novo prediction of native-like RNA tertiary structures PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Das, R., Baker, D. 2007; 104 (37): 14664-14669

    Abstract

    RNA tertiary structure prediction has been based almost entirely on base-pairing constraints derived from phylogenetic covariation analysis. We describe here a complementary approach, inspired by the Rosetta low-resolution protein structure prediction method, that seeks the lowest energy tertiary structure for a given RNA sequence without using evolutionary information. In a benchmark test of 20 RNA sequences with known structure and lengths of approximately 30 nt, the new method reproduces better than 90% of Watson-Crick base pairs, comparable with the accuracy of secondary structure prediction methods. In more than half the cases, at least one of the top five models agrees with the native structure to better than 4 Å rmsd over the backbone. Most importantly, the method recapitulates more than one-third of non-Watson-Crick base pairs seen in the native structures. Tandem stacks of "sheared" base pairs, base triplets, and pseudoknots are among the noncanonical features reproduced in the models. In the cases in which none of the top five models were native-like, higher energy conformations similar to the native structures are still sampled frequently but not assigned low energies. These results suggest that modest improvements in the energy function, together with the incorporation of information from phylogenetic covariance, may allow confident and accurate structure prediction for larger and more complex RNA chains.

    View details for DOI 10.1073/pnas.0703836104

    View details for Web of Science ID 000249513000023

    View details for PubMedID 17726102

  • Structure prediction for CASP7 targets using extensive all-atom refinement with Rosetta@home PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS Das, R., Qian, B., Raman, S., Vernon, R., Thompson, J., Bradley, P., Khare, S., Tyka, M. D., Bhat, D., Chivian, D., Kim, D. E., Sheffler, W. H., Malmstrom, L., Wollacott, A. M., Wang, C., Andre, I., Baker, D. 2007; 69: 118-128

    Abstract

    We describe predictions made using the Rosetta structure prediction methodology for both template-based modeling and free modeling categories in the Seventh Critical Assessment of Techniques for Protein Structure Prediction. For the first time, aggressive sampling and all-atom refinement could be carried out for the majority of targets, an advance enabled by the Rosetta@home distributed computing network. Template-based modeling predictions using an iterative refinement algorithm improved over the best existing templates for the majority of proteins with less than 200 residues. Free modeling methods gave near-atomic accuracy predictions for several targets under 100 residues from all secondary structure classes. These results indicate that refinement with an all-atom energy function, although computationally expensive, is a powerful method for obtaining accurate structure predictions.

    View details for DOI 10.1002/prot.21636

    View details for Web of Science ID 000251502400013

  • Determining the Mg2+ stoichiometry for folding an RNA metal ion core JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Das, R., Travers, K. J., Bai, Y., Herschlag, D. 2005; 127 (23): 8272-8273

    Abstract

    The folding and catalytic function of RNA molecules depend on their interactions with divalent metal ions, such as magnesium. As with every molecular process, the most basic knowledge required for understanding the close relationship of an RNA with its metal ions is the stoichiometry of the interaction. Unfortunately, inventories of the numbers of divalent ions associated with unfolded and folded RNA states have been unattainable. A common approach has been to interpret Hill coefficients fit to folding equilibria as the number of metal ions bound upon folding. However, this approach is vitiated by the presence of diffusely associated divalent ions in a dynamic ion atmosphere and by the likelihood of multiple transitions along a folding pathway. We demonstrate that the use of molar concentrations of background monovalent salt can alleviate these complications. These simplifying solution conditions allow a precise determination of the stoichiometry of the magnesium ions involved in folding the metal ion core of the P4-P6 domain of the Tetrahymena group I ribozyme. Hill analysis of hydroxyl radical footprinting data suggests that the P4-P6 RNA core folds cooperatively upon the association of two metal ions. This unexpectedly small stoichiometry is strongly supported by counting magnesium ions associated with the P4-P6 RNA via fluorescence titration and atomic emission spectroscopy. By pinpointing the metal ion stoichiometry, these measurements provide a critical but previously missing step in the thermodynamic dissection of the coupling between metal ion binding and RNA folding.

    View details for DOI 10.1021/ja051422h

    View details for Web of Science ID 000229751100020

    View details for PubMedID 15941246
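
    The Hill analysis mentioned in the abstract above fits a folding titration to the standard Hill form, fraction folded = [Mg2+]^n / (K^n + [Mg2+]^n). The sketch below shows only that generic equation; the function name, parameter names, and example values are illustrative assumptions, not the authors' fitting procedure or data. As the abstract cautions, the fitted n corresponds to an ion count only under simplifying conditions such as a high monovalent salt background.

```python
def hill_fraction_folded(mg_conc, midpoint, n_hill):
    """Fraction folded at a given Mg2+ concentration under the Hill equation.

    mg_conc and midpoint must share the same concentration units.
    """
    if mg_conc < 0 or midpoint <= 0:
        raise ValueError("concentrations must be non-negative, midpoint positive")
    ratio = (mg_conc / midpoint) ** n_hill
    return ratio / (1.0 + ratio)


# Illustrative values: midpoint of 1.0 (arbitrary units), Hill coefficient of 2.
for conc in (0.5, 1.0, 2.0):
    print(conc, round(hill_fraction_folded(conc, 1.0, 2), 3))
```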

  • SAFA: Semi-automated footprinting analysis software for high-throughput quantification of nucleic acid footprinting experiments RNA-A PUBLICATION OF THE RNA SOCIETY Das, R., Laederach, A., Pearlman, S. M., Herschlag, D., Altman, R. B. 2005; 11 (3): 344-354

    Abstract

    Footprinting is a powerful and widely used tool for characterizing the structure, thermodynamics, and kinetics of nucleic acid folding and ligand binding reactions. However, quantitative analysis of the gel images produced by footprinting experiments is tedious and time-consuming, due to the absence of informatics tools specifically designed for footprinting analysis. We have developed SAFA, a semi-automated footprinting analysis software package that achieves accurate gel quantification while reducing the time to analyze a gel from several hours to 15 min or less. The increase in analysis speed is achieved through a graphical user interface that implements a novel methodology for lane and band assignment, called "gel rectification," and an optimized band deconvolution algorithm. The SAFA software yields results that are consistent with published methodologies and reduces the investigator-dependent variability compared to less automated methods. These software developments simplify the analysis procedure for a footprinting gel and can therefore facilitate the use of quantitative footprinting techniques in nucleic acid laboratories that otherwise might not have considered their use. Further, the increased throughput provided by SAFA may allow a more comprehensive understanding of molecular interactions. The software and documentation are freely available for download at http://safa.stanford.edu.

    View details for DOI 10.1261/rna.7214405

    View details for Web of Science ID 000227190000011

    View details for PubMedID 15701734

Stanford Medicine Resources: