Bachelor of Arts, Harvard University (2011)
Doctor of Philosophy, New York University (2016)
Rhiju Das, Postdoctoral Faculty Sponsor
The Rosetta software for macromolecular modeling, docking and design is extensively used in laboratories worldwide. During two decades of development by a community of laboratories at more than 60 institutions, Rosetta has been continuously refactored and extended. Its advantages are its performance and interoperability between broad modeling capabilities. Here we review tools developed in the last 5 years, including over 80 methods. We discuss improvements to the score function, user interfaces and usability. Rosetta is available at http://www.rosettacommons.org.
View details for DOI 10.1038/s41592-020-0848-2
View details for PubMedID 32483333
RNA-Puzzles is a collective endeavor dedicated to the advancement and improvement of RNA 3D structure prediction. With agreement from crystallographers, the RNA structures are predicted by various groups before the publication of the crystal structures. We now report the prediction of six RNA sequences: four structures of nucleolytic ribozymes and two of riboswitches. Systematic protocols for comparing models and crystal structures are described and analyzed. In these six puzzles, we discuss a) the comparison between the automated web server and human experts; b) the prediction of coaxial stacking; c) the prediction of structural details and ligand binding; d) the development of novel prediction methods; and e) the potential improvements to be made. The results illustrate that correct coaxial stacking and tertiary contacts are key to predicting RNA architecture, whereas ligand binding modes can be predicted only at low resolution, and accurate ligand binding prediction remains out of reach. All the predicted models are available for the future development of force field parameters and the improvement of comparison and assessment tools.
View details for DOI 10.1261/rna.075341.120
View details for PubMedID 32371455
Many scientific disciplines rely on computational methods for data analysis, model generation, and prediction. Implementing these methods is often accomplished by researchers with domain expertise but without formal training in software engineering or computer science. This arrangement has led to underappreciation of sustainability and maintainability of scientific software tools developed in academic environments. Some software tools have avoided this fate, including the scientific library Rosetta. We use this software and its community as a case study to show how modern software development can be accomplished successfully, irrespective of subject area. Rosetta is one of the largest software suites for macromolecular modeling, with 3.1 million lines of code and many state-of-the-art applications. Since the mid 1990s, the software has been developed collaboratively by the RosettaCommons, a community of academics from over 60 institutions worldwide with diverse backgrounds including chemistry, biology, physiology, physics, engineering, mathematics, and computer science. Developing this software suite has provided us with more than two decades of experience in how to effectively develop advanced scientific software in a global community with hundreds of contributors. Here we illustrate the functioning of this development community by addressing technical aspects (like version control, testing, and maintenance), community-building strategies, diversity efforts, software dissemination, and user support. We demonstrate how modern computational research can thrive in a distributed collaborative community. The practices described here are independent of subject area and can be readily adopted by other software development communities.
View details for DOI 10.1371/journal.pcbi.1007507
View details for PubMedID 32365137
The site-specific incorporation of noncanonical monomers into polypeptides through genetic code reprogramming permits synthesis of bio-based products that extend beyond natural limits. To better enable such efforts, flexizymes (transfer RNA (tRNA) synthetase-like ribozymes that recognize synthetic leaving groups) have been used to expand the scope of chemical substrates for ribosome-directed polymerization. The development of design rules for flexizyme-catalyzed acylation should allow scalable and rational expansion of genetic code reprogramming. Here we report the systematic synthesis of 37 substrates based on 4 chemically diverse scaffolds (phenylalanine, benzoic acid, heteroaromatic, and aliphatic monomers) with different electronic and steric factors. Of these substrates, 32 were acylated onto tRNA and incorporated into peptides by in vitro translation. Based on the design rules derived from this expanded alphabet, we successfully predicted the acylation of 6 additional monomers that could uniquely be incorporated into peptides and direct N-terminal incorporation of an aldehyde group for orthogonal bioconjugation reactions.
View details for DOI 10.1038/s41467-019-12916-w
View details for PubMedID 31704912
Picornaviral IRES elements are essential for initiating the cap-independent viral translation. However, three-dimensional structures of these elements remain elusive. Here, we report a 2.84-Å resolution crystal structure of hepatitis A virus IRES domain V (dV) in complex with a synthetic antibody fragment, a crystallization chaperone. The RNA adopts a three-way junction structure, topologically organized by an adenine-rich stem-loop motif. Despite no obvious sequence homology, the dV architecture shows a striking similarity to a circularly permuted form of encephalomyocarditis virus J-K domain, suggesting a conserved strategy for organizing the domain architecture. Recurrence of the motif led us to use homology modeling tools to compute a 3-dimensional structure of the corresponding domain of foot-and-mouth disease virus, revealing an analogous domain organizing motif. The topological conservation observed among these IRESs and other viral domains implicates a structured three-way junction as an architectural scaffold to pre-organize helical domains for recruiting the translation initiation machinery.
View details for DOI 10.1038/s41467-019-11585-z
View details for PubMedID 31399592
We have determined the structure of the glutamine-II riboswitch ligand binding domain using X-ray crystallography. The structure was solved using a novel combination of homology modeling and molecular replacement. The structure comprises three coaxial helical domains, the central one of which is a pseudoknot with partial triplex character. The major groove of this helix provides the binding site for L-glutamine, which is extensively hydrogen bonded to the RNA. Atomic mutation of the RNA at the ligand binding site leads to loss of binding shown by isothermal titration calorimetry, explaining the specificity of the riboswitch. A metal ion also plays an important role in ligand binding. This is directly bonded to a glutamine carboxylate oxygen atom, and its remaining inner-sphere water molecules make hydrogen bonding interactions with the RNA.
View details for DOI 10.1093/nar/gkz539
View details for PubMedID 31216023
The three-dimensional structures of RNA molecules provide rich and often critical information for understanding their functions, including how they recognize small molecule and protein partners. Computational modeling of RNA 3D structure is becoming increasingly accurate, particularly with the availability of growing numbers of template structures already solved experimentally and the development of sequence alignment and 3D modeling tools to take advantage of this database. For several recent "RNA puzzle" blind modeling challenges, we have successfully identified useful template structures and achieved accurate structure predictions through homology modeling tools developed in the Rosetta software suite. We describe our semi-automated methodology here and walk through two illustrative examples: an adenine riboswitch aptamer, modeled from a template guanine riboswitch structure, and a SAM I/IV riboswitch aptamer, modeled from a template SAM I riboswitch structure.
View details for DOI 10.1016/bs.mie.2019.05.026
View details for PubMedID 31239046
Prediction of RNA structure from nucleotide sequence remains an unsolved grand challenge of biochemistry and requires distinct concepts from protein structure prediction. Despite extensive algorithmic development in recent years, modeling of noncanonical base pairs of new RNA structural motifs has not been achieved in blind challenges. We report a stepwise Monte Carlo (SWM) method with a unique add-and-delete move set that enables predictions of noncanonical base pairs of complex RNA structures. A benchmark of 82 diverse motifs establishes the method's general ability to recover noncanonical pairs ab initio, including multistrand motifs that have been refractory to prior approaches. In a blind challenge, SWM models predicted nucleotide-resolution chemical mapping and compensatory mutagenesis experiments for three in vitro selected tetraloop/receptors with previously unsolved structures (C7.2, C7.10, and R1). As a final test, SWM blindly and correctly predicted all noncanonical pairs of a Zika virus double pseudoknot during a recent community-wide RNA-Puzzle. Stepwise structure formation, as encoded in the SWM method, enables modeling of noncanonical RNA structure in a variety of previously intractable problems.
View details for PubMedID 29806027
The modulation of protein-protein interactions (PPIs) by means of creating or stabilizing secondary structure conformations is a rapidly growing area of research. Recent success in the inhibition of difficult PPIs by secondary structure mimetics also points to potential limitations, because often, specific cases require tertiary structure mimetics. To streamline protein structure-based inhibitor design, we have previously described the examination of protein complexes in the Protein Data Bank where α-helices or β-strands form critical contacts. Here, we examined coiled coils and helix bundles that mediate complex formation to create a platform for the discovery of potential tertiary structure mimetics. Though there has been extensive analysis of coiled coil motifs, the interactions between pre-formed coiled coils and globular proteins have not been systematically analyzed. This article identifies critical features of these helical interfaces with respect to coiled coil and other helical PPIs. We expect the analysis to prove useful for the rational design of modulators of this fundamental class of protein assemblies.
View details for DOI 10.1021/jacs.5b05527
View details for Web of Science ID 000361502800021
View details for PubMedID 26302018
Protein-protein interactions (PPIs) are emerging as attractive targets for drug design because of their central role in directing normal and aberrant cellular functions. These interactions were once considered "undruggable" because their large and dynamic interfaces make small molecule inhibitor design challenging. However, landmark advances in computational analysis, fragment screening and molecular design have enabled development of a host of promising strategies to address the fundamental molecular recognition challenge. An attractive approach for targeting PPIs involves mimicry of protein domains that are critical for complex formation. This approach recognizes that protein subdomains or protein secondary structures are often present at interfaces and serve as organized scaffolds for the presentation of side chain groups that engage the partner protein(s). Design of protein domain mimetics is in principle rather straightforward but is enabled by a host of computational strategies that provide predictions of important residues that should be mimicked. Herein we describe a workflow proceeding from interaction network analysis, to modeling a complex structure, to identifying a high-affinity sub-structure, to developing interaction inhibitors. We apply the design procedure to peptidomimetic inhibitors of Ras-mediated signaling.
View details for DOI 10.1016/j.ejmech.2014.09.047
View details for Web of Science ID 000353730900043
View details for PubMedID 25253637
The development of inhibitors for protein-protein interactions frequently involves the mimicry of secondary structure motifs. While helical protein-protein interactions have been heavily targeted, a similar level of success for the inhibition of β-strand and β-sheet rich interfaces has been elusive. We describe an assessment of the full range of β-strand interfaces whose high-resolution structures are available in the Protein Data Bank. This analysis identifies complexes where a β-strand or β-sheet contributes significantly to binding. The results highlight the molecular recognition complexity in strand-mediated interactions relative to helical interfaces and offer guidelines for the construction of β-strand and β-sheet mimics as ligands for protein receptors. The online data set will potentially serve as an entry-point to new classes of protein-protein interaction inhibitors.
View details for DOI 10.1021/cb500241y
View details for Web of Science ID 000340517500017
View details for PubMedID 24870802
Helix-coil transition theory connects observable properties of the α-helix to an ensemble of microstates and provides a foundation for analyzing secondary structure formation in proteins. Classical models account for cooperative helix formation in terms of an energetically demanding nucleation event (described by the σ constant) followed by a more facile propagation reaction, with corresponding s constants that are sequence dependent. Extensive studies of folding and unfolding in model peptides have led to the determination of the propagation constants for amino acids. However, the role of individual side chains in helix nucleation has not been separately accessible, so the σ constant is treated as independent of sequence. We describe here a synthetic model that allows the assessment of the role of individual amino acids in helix nucleation. Studies with this model lead to the surprising conclusion that widely accepted scales of helical propensity are not predictive of helix nucleation. Residues known to be helix stabilizers or breakers in propagation have only a tenuous relationship to residues that favor or disfavor helix nucleation.
View details for DOI 10.1073/pnas.1322833111
View details for Web of Science ID 000335477300040
View details for PubMedID 24753597
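As an illustration of the classical helix-coil picture that the preceding abstract refers to, here is a minimal sketch of a Zimm-Bragg-style homopolymer model, in which a helical residue that follows a coil residue carries weight σ·s (nucleation) and a helical residue that follows another helical residue carries weight s (propagation). This is the textbook formulation, not the synthetic model described in the paper; the function names and the parameter values in the usage note are hypothetical and chosen only for illustration.

```python
import math


def zimm_bragg_partition(n, sigma, s):
    """Partition function of an n-residue homopolymer (illustrative sketch)."""
    # Running statistical weight of all partial chains ending in coil / helix.
    # The empty chain is treated as if it were preceded by a coil residue.
    z_coil, z_helix = 1.0, 0.0
    for _ in range(n):
        z_coil, z_helix = (
            z_coil + z_helix,                  # add a coil residue: weight 1
            z_coil * sigma * s + z_helix * s,  # add a helix residue: nucleate or propagate
        )
    return z_coil + z_helix


def fractional_helicity(n, sigma, s, ds=1e-6):
    """Mean helical fraction, (s / n) * d ln Z / d s, by central difference."""
    lo = math.log(zimm_bragg_partition(n, sigma, s - ds))
    hi = math.log(zimm_bragg_partition(n, sigma, s + ds))
    return (s / n) * (hi - lo) / (2 * ds)
```

Calling `fractional_helicity(20, 1e-3, 1.0)` and `fractional_helicity(20, 1.0, 1.0)` compares a strongly cooperative chain with a non-cooperative one at the same propagation constant; a small σ suppresses helix formation in short chains because every helical segment must pay the nucleation penalty.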
HippDB catalogs every protein-protein interaction whose structure is available in the Protein Data Bank and which exhibits one or more helices at the interface. The Web site accepts queries on variables such as helix length and sequence, and it provides computational alanine scanning and change in solvent-accessible surface area values for every interfacial residue. HippDB is intended to serve as a starting point for structure-based small molecule and peptidomimetic drug development. HippDB is freely available on the web at http://www.nyu.edu/projects/arora/hippdb. The Web site is implemented in PHP, MySQL and Apache. Source code is freely available for download at http://code.google.com/p/helidb, implemented in Perl and supported on Linux.
View details for DOI 10.1093/bioinformatics/btt483
View details for Web of Science ID 000325997500025
View details for PubMedID 23958730
Peptidomimetics are classes of molecules that mimic structural and functional attributes of polypeptides. Peptidomimetic oligomers can frequently be synthesized using efficient solid phase synthesis procedures similar to peptide synthesis. Conformationally ordered peptidomimetic oligomers are finding broad applications for molecular recognition and for inhibiting protein-protein interactions. One critical limitation is the limited set of design tools for identifying oligomer sequences that can adopt desired conformations. Here, we present expansions to the ROSETTA platform that enable structure prediction and design of five non-peptidic oligomer scaffolds (noncanonical backbones): oligooxopiperazines, oligo-peptoids, β-peptides, hydrogen bond surrogate helices and oligosaccharides. This work is complementary to prior additions to model noncanonical protein side chains in ROSETTA. The main purpose of our manuscript is to give a detailed description to current and future developers of how each of these noncanonical backbones was implemented. Furthermore, we provide a general outline for implementation of new backbone types not discussed here. To illustrate the utility of this approach, we describe the first tests of the ROSETTA molecular mechanics energy function in the context of oligooxopiperazines, using quantum mechanical calculations as comparison points, scanning through backbone and side chain torsion angles for a model peptidomimetic. Finally, as an example of a novel design application, we describe the automated design of an oligooxopiperazine that inhibits the p53-MDM2 protein-protein interaction. For the general biological and bioengineering community, several noncanonical backbones have been incorporated into web applications that allow users to freely and rapidly test the presented protocols (http://rosie.rosettacommons.org). This work helps address the peptidomimetic community's need for an automated and expandable modeling tool for noncanonical backbones.
View details for DOI 10.1371/journal.pone.0067051
View details for Web of Science ID 000323110600005
View details for PubMedID 23869206
View details for PubMedCentralID PMC3712014