Vijay Pande, Henry Dreyfus Professor of Chemistry and, by courtesy, of Structural Biology and Computer Science, also currently directs the Stanford Program in Biophysics and the Folding@home Distributed Computing project. His research centers on novel cloud computing simulation techniques to address problems in chemical biology. In particular, he has pioneered distributed computing methodology to break fundamental barriers in the simulation of protein and nucleic acid kinetics and thermodynamics. As director of the Folding@home project, Prof. Pande has, for the first time, directly simulated protein folding dynamics, making quantitative comparisons with experimental results, often considered a “holy grail” of computational biology. His current research also includes novel computational methods for drug design, especially in the area of protein misfolding and associated diseases such as Alzheimer’s and Huntington’s Disease.

Professor Pande studied physics at Princeton University (B.A. 1992), where he was first introduced to biophysical questions, especially in undergraduate research with Nobel Laureate P. Anderson. His doctoral research in physics under Profs. T. Tanaka and A. Grosberg at MIT (Ph.D. 1995) centered on statistical mechanical models of protein folding, suggesting new ways to design protein sequences for stability and folding properties. As a Miller Fellow under Prof. D. Rokhsar at UC Berkeley, Prof. Pande extended this methodology to examine atomistic protein models, laying the foundations for his work at Stanford University. Among numerous awards, Prof. Pande has received the Biophysical Society’s Bárány Award for Young Investigators and Protein Society’s Irving Sigal Young Investigator Award, and was named to MIT’s TR100 and elected a Fellow of the American Physical Society.

The Pande research group develops and applies new theoretical methods to understand the physical properties of biological molecules such as proteins, nucleic acids and lipid membranes, using this understanding to design synthetic systems including small-molecule therapeutics. In particular, the group examines the self-assembly properties of biomolecules. For example, how do protein and RNA molecules fold? How do proteins misfold and aggregate? How can we use this understanding to tackle misfolding related degeneration and develop small molecules to inhibit disease processes?

As these phenomena are complex, spanning molecular to mesoscopic lengths and nanosecond to millisecond timescales, the lab employs a variety of methods, including statistical mechanical analytic models, Markov State Models, and statistical and informatic methods. Other tools include Monte Carlo, Langevin dynamics, and molecular dynamics computer simulations on workstations and massively parallel supercomputers, superclusters, and worldwide distributed computing. The group has also done extensive work in the application of machine learning, pioneering traditional and deep learning approaches to cheminformatics, biophysics and drug design.

For example, simulations in all-atom detail on experimentally relevant timescales (milliseconds to seconds) have produced specific predictions of the structural and physical chemical nature of protein aggregation involved in Alzheimer’s and Huntington’s diseases. These results have fed into computational small molecule drug design methods, yielding interesting new chemical entities.

Since such problems are extremely computationally demanding, the group developed a distributed computing project for protein folding dynamics. Since its launch in October 2000, Folding@home has attracted more than 4,000,000 PCs, and today is recognized as the most powerful supercluster in the world. Such enormous computational resources have allowed simulations of unprecedented folding timescales and statistical precision and accuracy.

Administrative Appointments

  • General Partner, Andreessen Horowitz (2015 - Present)
  • Chair, Program in Biophysics, Stanford University (2008 - 2015)
  • Director, Folding@home Distributed Computing Project (2000 - Present)

Honors & Awards

  • Distinguished Lecture in Theoretical and Computational Chemistry, University of California, San Diego (2015)
  • Michael and Kate Bárány Award for Young Investigators, Biophysical Society (2012)
  • Irving Sigal Young Investigator Award, Protein Society (2006)
  • TR100 Top 100 Young Innovators, MIT Technology Review (2002)
  • Delano Award for Computational Biosciences, American Society for Biochemistry and Molecular Biology (2015)
  • Best Mentor, Stanford University Postdoctoral Association (2013)
  • Best of 2012 Papers Collection Citation, Biophysical Society (2013)
  • ACS Thomas Kuhn Paradigm Shift Award, American Chemical Society (2010)
  • Fellow, American Physical Society (2008)
  • Netxplorateur of the Year 2008, Netexplo (“Netexplorateur” until late 2011) (2008)
  • Guinness World Record, Folding@home First to a Petaflop, Guinness World Records (2007)
  • Global Indus Technovators Award, IBC@MIT, Massachusetts Institute of Technology (2004)
  • Dreyfus Teacher-Scholar Award, Camille & Henry Dreyfus Foundation (2003)
  • Frederick E. Terman Fellow, Stanford University (2002-2005)
  • Kawasaki Prize in Experimental Physics, Princeton University (1992)

Boards, Advisory Committees, Professional Organizations

  • Scientific Advisory Board Member, Schrodinger LLC (2014 - Present)
  • Distinguished Visiting Professor of Computer Science, Andreessen Horowitz (2014 - 2015)
  • Non-executive Chairman, Board of Directors, Reanimed Pharmaceuticals (2013 - Present)
  • Scientific Advisory Board Member, 1729, Palo Alto, CA (2013 - Present)
  • Scientific Advisory Board Member, Code For India (2013 - Present)
  • Editorial Board Member, Journal of Chemical Physics (2013 - 2015)
  • Scientific Advisory Board Member, OpenEye Scientific Software (2012 - 2013)
  • Executive Committee Member, US Protein Folding Consortium (2011 - Present)
  • Visiting Committee Member, Johns Hopkins Biophysics Department (2011 - 2011)
  • Visiting Committee Member, UCSF Biophysics Program (2011 - 2011)
  • Editorial Board Member, Computational Science & Discovery (2010 - 2015)
  • Advisory Board Member, Journal of Chemical Theory and Computation (2009 - Present)
  • Member, Board of Directors, ClusterCorp (now StackIQ) (2009 - 2011)
  • Scientific Advisory Board Member, Counsyl, Inc. (2008 - Present)
  • Associate Editor, PLoS Computational Biology (2008 - 2015)
  • Scientific Advisory Board Member, Numerate, Inc. (2008 - 2012)
  • Scientific Advisory Board Member, Discovery Engine (2008 - 2010)
  • Scientific Advisory Board Member, Clearspeed, Inc. (2005 - 2007)
  • Scientific Advisory Board Member, Pharmix, Inc. (2004 - 2006)
  • Consultant, Fujitsu Computer Corporation (2003 - 2006)
  • Scientific Advisory Board Member, Acumen Pharmaceuticals (2002 - 2009)
  • Consultant, Hitachi Computer Corporation (2002 - 2002)
  • Scientific Advisory Board Member, Protein Mechanics, Inc. (2000 - 2004)
  • Scientific Advisory Board Member, Omnipod Distributed Computing Project (2000 - 2001)

Professional Education

  • Postdoc, MIT Physics Department and Center for Materials Science & Engineering, Physics (1996)
  • PhD, MIT, Physics (1995)
  • BA, Princeton University, Physics (1992)

Research & Scholarship

Current Research and Scholarly Interests

The central theme of our research is to develop and apply novel theoretical methods to understand the physical properties of biological molecules, such as proteins, nucleic acids, and lipid membranes, and to apply this understanding to design novel synthetic systems, including small molecule therapeutics. In particular, we are interested in the self-assembly properties of biomolecules: for example, how do protein and RNA molecules fold? How do proteins misfold and aggregate and how can we use our understanding of this process to tackle misfolding related diseases, such as Alzheimer's or Huntington's Disease? How can we design or discover novel small molecules to inhibit this process?

As these phenomena are complex, spanning from the molecular to mesoscopic length scales and the nanosecond to millisecond timescales, our research employs a variety of methods, including statistical mechanical analytic models, Markov State Models, and statistical and informatic methods, as well as Monte Carlo, Langevin dynamics, and molecular dynamics computer simulations on workstations and massively parallel supercomputers, superclusters, and large-scale worldwide distributed computing. Our work also touches closely on applications of Bayesian statistics to statistical mechanics, as well as novel means for computational small molecule (drug) design, such as novel methods for docking and free energy calculation.
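To make the Markov State Model idea mentioned above concrete, the following is a minimal sketch in Python with NumPy. It is an illustration only, not our production code: the function names `estimate_msm` and `implied_timescales` are hypothetical, and symmetrizing the transition counts is just one simple, assumed way to obtain a reversible model. The sketch counts transitions between discrete conformational states at a fixed lag time, normalizes the counts into a transition matrix, and converts the non-stationary eigenvalues into relaxation timescales.

```python
import numpy as np


def estimate_msm(dtrajs, n_states, lag):
    """Estimate a row-stochastic transition matrix from discrete trajectories.

    dtrajs   : list of sequences of integer state labels in [0, n_states)
    n_states : number of discrete states
    lag      : lag time, in frames, at which transitions are counted
    """
    counts = np.zeros((n_states, n_states))
    for traj in dtrajs:
        traj = np.asarray(traj, dtype=int)
        # Count every transition i -> j separated by `lag` frames.
        np.add.at(counts, (traj[:-lag], traj[lag:]), 1.0)
    # Assumption: symmetrize the counts as a simple way to enforce
    # detailed balance; more careful reversible estimators exist.
    counts = 0.5 * (counts + counts.T)
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every state must be visited at least once")
    return counts / row_sums


def implied_timescales(T, lag):
    """Relaxation timescales (in frames) from the eigenvalues of T."""
    eigvals = np.sort(np.linalg.eigvals(T).real)[::-1]
    # Drop the stationary eigenvalue (equal to 1) and keep eigenvalues
    # in (0, 1), for which the logarithm is well defined.
    slow = eigvals[1:]
    slow = slow[(slow > 0.0) & (slow < 1.0)]
    return -lag / np.log(slow)


# Toy two-state trajectory that hops between state 0 and state 1.
dtrajs = [[0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]]
T = estimate_msm(dtrajs, n_states=2, lag=1)
timescales = implied_timescales(T, lag=1)
```

In practice the discrete states come from clustering simulation frames, and the lag time is chosen so that the implied timescales no longer change appreciably when the lag is increased.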

For example, we are currently investigating the nature of protein folding and misfolding, relevant for diseases such as Alzheimer’s and Huntington’s Disease. We have performed simulations of these processes, in all-atom detail on experimentally relevant timescales (milliseconds to seconds), yielding specific predictions of the structural and physical chemical nature of protein aggregation involved in these diseases. These simulation results have then fed into novel computational small molecule drug design methods, yielding novel chemical entities with important and interesting impact.

Since such problems are extremely computationally demanding, we have developed a distributed computing project for protein folding dynamics ("Folding@Home"), which has attracted over 4,000,000 PCs since the project's beginning on October 1, 2000 and today is recognized as the most powerful supercomputer/supercluster in the world. Such enormous computational resources have allowed us to simulate unprecedented folding timescales (microseconds to milliseconds) and statistical precision and accuracy (such as very accurate and precise free energy calculations).

Finally, we also have done extensive work in the application of Machine Learning (ML) to Chemistry and Biophysics. We have pioneered traditional and deep learning approaches to cheminformatics and biophysics. In particular, ML methods have played a key role in MSM methods. Moreover, we have been pioneering ML approaches, especially deep learning, for drug design and related areas. Our vision is that we are just scratching the surface of how ML can impact Chemistry and are positioned to be leaders in this burgeoning field.
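One example of a learning method used in MSM construction is time-structure based independent component analysis (tICA), which appears in several of the publications listed below. The following is a minimal sketch in Python with NumPy, offered as an illustration rather than as the implementation used in our software; the function name `tica` and the toy data are hypothetical, and the instantaneous covariance matrix is assumed to be positive definite. The sketch finds linear combinations of input features whose autocorrelation at a chosen lag time is largest, i.e. the slowest linear coordinates.

```python
import numpy as np


def tica(X, lag, n_components=1):
    """Return the leading tICA eigenvalues and projection vectors.

    X   : array of shape (n_frames, n_features), one time series
    lag : lag time in frames
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)
    X0, Xt = X[:-lag], X[lag:]
    n = X0.shape[0]
    # Instantaneous and symmetrized time-lagged covariance matrices.
    C0 = (X0.T @ X0 + Xt.T @ Xt) / (2.0 * n)
    Ct = (X0.T @ Xt + Xt.T @ X0) / (2.0 * n)
    # Whiten with C0 (assumed positive definite) so that the generalized
    # problem Ct v = lambda C0 v becomes an ordinary symmetric one.
    s, U = np.linalg.eigh(C0)
    W = U / np.sqrt(s)              # scales column j of U by s[j] ** -0.5
    lam, V = np.linalg.eigh(W.T @ Ct @ W)
    order = np.argsort(lam)[::-1][:n_components]
    return lam[order], W @ V[:, order]


# Toy data: one fast (white-noise) feature and one slow (AR(1)) feature.
rng = np.random.default_rng(0)
n_frames = 5000
slow = np.zeros(n_frames)
for t in range(1, n_frames):
    slow[t] = 0.95 * slow[t - 1] + rng.normal()
fast = rng.normal(size=n_frames)
X = np.column_stack([fast, slow])

eigenvalues, components = tica(X, lag=5)
projection = (X - X.mean(axis=0)) @ components[:, 0]
```

On this toy data the leading component should track the slow feature. In a real application the input features would be structural descriptors such as dihedral angles or contact distances, and the leading components would serve as coordinates for clustering into MSM states.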



All Publications

  • tICA-Metadynamics: Accelerating Metadynamics by Using Kinetically Selected Collective Variables. Journal of chemical theory and computation Sultan, M. M., Pande, V. S. 2017


    Metadynamics is a powerful enhanced molecular dynamics sampling method that accelerates simulations by adding history-dependent multidimensional Gaussians along selective collective variables (CVs). In practice, choosing a small number of slow CVs remains challenging due to the inherent high dimensionality of biophysical systems. Here we show that time-structure based independent component analysis (tICA), a recent advance in Markov state model literature, can be used to identify a set of variationally optimal slow coordinates for use as CVs for Metadynamics. We show that linear and nonlinear tICA-Metadynamics can complement existing MD studies by explicitly sampling the system's slowest modes and can even drive transitions along the slowest modes even when no such transitions are observed in unbiased simulations.

    View details for DOI 10.1021/acs.jctc.7b00182

    View details for PubMedID 28383914

  • Building a More Predictive Protein Force Field: A Systematic and Reproducible Route to AMBER-FB15 JOURNAL OF PHYSICAL CHEMISTRY B Wang, L., McKiernan, K. A., Gomes, J., Beauchamp, K. A., Head-Gordon, T., Rice, J. E., Swope, W. C., Martinez, T. J., Pande, V. S. 2017; 121 (16): 4023-4039


    The increasing availability of high-quality experimental data and first-principles calculations creates opportunities for developing more accurate empirical force fields for simulation of proteins. We developed the AMBER-FB15 protein force field by building a high-quality quantum chemical data set consisting of comprehensive potential energy scans and employing the ForceBalance software package for parameter optimization. The optimized potential surface allows for more significant thermodynamic fluctuations away from local minima. In validation studies where simulation results are compared to experimental measurements, AMBER-FB15 in combination with the updated TIP3P-FB water model predicts equilibrium properties with equivalent accuracy, and temperature dependent properties with significantly improved accuracy, in comparison with published models. We also discuss the effect of changing the protein force field and water model on the simulation results.

    View details for DOI 10.1021/acs.jpcb.7b02320

    View details for Web of Science ID 000400534200012

    View details for PubMedID 28306259

  • Computationally Discovered Potentiating Role of Glycans on NMDA Receptors SCIENTIFIC REPORTS Sinitskiy, A. V., Stanley, N. H., Hackos, D. H., Hanson, J. E., Sellers, B. D., Pande, V. S. 2017; 7


    N-methyl-D-aspartate receptors (NMDARs) are glycoproteins in the brain central to learning and memory. The effects of glycosylation on the structure and dynamics of NMDARs are largely unknown. In this work, we use extensive molecular dynamics simulations of GluN1 and GluN2B ligand binding domains (LBDs) of NMDARs to investigate these effects. Our simulations predict that intra-domain interactions involving the glycan attached to residue GluN1-N440 stabilize closed-clamshell conformations of the GluN1 LBD. The glycan on GluN2B-N688 shows a similar, though weaker, effect. Based on these results, and assuming the transferability of the results of LBD simulations to the full receptor, we predict that glycans at GluN1-N440 might play a potentiator role in NMDARs. To validate this prediction, we perform electrophysiological analysis of full-length NMDARs with a glycosylation-preventing GluN1-N440Q mutation, and demonstrate an increase in the glycine EC50 value. Overall, our results suggest an intramolecular potentiating role of glycans on NMDA receptors.

    View details for DOI 10.1038/srep44578

    View details for Web of Science ID 000398371800001

    View details for PubMedID 28378791

  • Markov modeling reveals novel intracellular modulation of the human TREK-2 selectivity filter SCIENTIFIC REPORTS Harrigan, M. P., McKiernan, K. A., Shanmugasundaram, V., Denny, R. A., Pande, V. S. 2017; 7


    Two-pore domain potassium (K2P) channel ion conductance is regulated by diverse stimuli that directly or indirectly gate the channel selectivity filter (SF). Recent crystal structures for the TREK-2 member of the K2P family reveal distinct "up" and "down" states assumed during activation via mechanical stretch. We performed 195 μs of all-atom, unbiased molecular dynamics simulations of the TREK-2 channel to probe how membrane stretch regulates the SF gate. Markov modeling reveals a novel "pinched" SF configuration that stretch activation rapidly destabilizes. Free-energy barrier heights calculated for critical steps in the conduction pathway indicate that this pinched state impairs ion conduction. Our simulations predict that this low-conductance state is accessed exclusively in the compressed, "down" conformation in which the intracellular helix arrangement allosterically pinches the SF. By explicitly relating structure to function, we contribute a critical piece of understanding to the evolving K2P puzzle.

    View details for DOI 10.1038/s41598-017-00256-y

    View details for Web of Science ID 000398162600008

    View details for PubMedID 28377596

  • Low Data Drug Discovery with One-Shot Learning ACS CENTRAL SCIENCE Altae-Tran, H., Ramsundar, B., Pappu, A. S., Pande, V. 2017; 3 (4): 283-293


    Recent advances in machine learning have made significant contributions to drug discovery. Deep neural networks in particular have been demonstrated to provide significant boosts in predictive power when inferring the properties and activities of small-molecule compounds (Ma, J. et al. J. Chem. Inf. Model. 2015, 55, 263-274). However, the applicability of these techniques has been limited by the requirement for large amounts of training data. In this work, we demonstrate how one-shot learning can be used to significantly lower the amounts of data required to make meaningful predictions in drug discovery applications. We introduce a new architecture, the iterative refinement long short-term memory, that, when combined with graph convolutional neural networks, significantly improves learning of meaningful distance metrics over small-molecules. We open source all models introduced in this work as part of DeepChem, an open-source framework for deep-learning in drug discovery (Ramsundar, B., 2016).

    View details for DOI 10.1021/acscentsci.6b00367

    View details for Web of Science ID 000400324200009

    View details for PubMedID 28470045

  • Efficient Gaussian Density Formulation of Volume and Surface Areas of Macromolecules on Graphical Processing Units JOURNAL OF COMPUTATIONAL CHEMISTRY Zhang, B., Kilburg, D., Eastman, P., Pande, V. S., Gallicchio, E. 2017; 38 (10): 740-752


    We present an algorithm to efficiently compute accurate volumes and surface areas of macromolecules on graphical processing unit (GPU) devices using an analytic model which represents atomic volumes by continuous Gaussian densities. The volume of the molecule is expressed by means of the inclusion-exclusion formula, which is based on the summation of overlap integrals among multiple atomic densities. The surface area of the molecule is obtained by differentiation of the molecular volume with respect to atomic radii. The many-body nature of the model makes a port to GPU devices challenging. To our knowledge, this is the first reported full implementation of this model on GPU hardware. To accomplish this, we have used recursive strategies to construct the tree of overlaps and to accumulate volumes and their gradients on the tree data structures so as to minimize memory contention. The algorithm is used in the formulation of a surface area-based non-polar implicit solvent model implemented as an open source plug-in (named GaussVol) for the popular OpenMM library for molecular mechanics modeling. GaussVol is 50 to 100 times faster than our best optimized implementation for the CPUs, achieving speeds in excess of 100 ns/day with 1 fs time-step for protein-sized systems on commodity GPUs. © 2017 Wiley Periodicals, Inc.

    View details for DOI 10.1002/jcc.24745

    View details for Web of Science ID 000394877600010

    View details for PubMedID 28160511

  • Atomistic structural ensemble refinement reveals non-native structure stabilizes a sub-millisecond folding intermediate of CheY. Scientific reports Shi, J., Nobrega, R. P., Schwantes, C., Kathuria, S. V., Bilsel, O., Matthews, C. R., Lane, T. J., Pande, V. S. 2017; 7: 44116


    The dynamics of globular proteins can be described in terms of transitions between a folded native state and less-populated intermediates, or excited states, which can play critical roles in both protein folding and function. Excited states are by definition transient species, and therefore are difficult to characterize using current experimental techniques. Here, we report an atomistic model of the excited state ensemble of a stabilized mutant of an extensively studied flavodoxin fold protein CheY. We employed a hybrid simulation and experimental approach in which an aggregate 42 milliseconds of all-atom molecular dynamics were used as an informative prior for the structure of the excited state ensemble. This prior was then refined against small-angle X-ray scattering (SAXS) data employing an established method (EROS). The most striking feature of the resulting excited state ensemble was an unstructured N-terminus stabilized by non-native contacts in a conformation that is topologically simpler than the native state. Using these results, we then predict incisive single molecule FRET experiments as a means of model validation. This study demonstrates the paradigm of uniting simulation and experiment in a statistical model to study the structure of protein excited states and rationally design validating experiments.

    View details for DOI 10.1038/srep44116

    View details for PubMedID 28272524

    View details for PubMedCentralID PMC5341065

  • Ward Clustering Improves Cross-Validated Markov State Models of Protein Folding. Journal of chemical theory and computation Husic, B. E., Pande, V. S. 2017


    Markov state models (MSMs) are a powerful framework for analyzing protein dynamics. MSMs require the decomposition of conformation space into states via clustering, which can be cross-validated when a prediction method is available for the clustering method. We present an algorithm for predicting cluster assignments of new data points with Ward's minimum variance method. We then show that clustering with Ward's method produces better or equivalent cross-validated MSMs for protein folding than other clustering algorithms.

    View details for DOI 10.1021/acs.jctc.6b01238

    View details for PubMedID 28195713

  • Identification of simple reaction coordinates from complex dynamics. journal of chemical physics McGibbon, R. T., Husic, B. E., Pande, V. S. 2017; 146 (4): 044109


    Reaction coordinates are widely used throughout chemical physics to model and understand complex chemical transformations. We introduce a definition of the natural reaction coordinate, suitable for condensed phase and biomolecular systems, as a maximally predictive one-dimensional projection. We then show that this criterion is uniquely satisfied by a dominant eigenfunction of an integral operator associated with the ensemble dynamics. We present a new sparse estimator for these eigenfunctions which can search through a large candidate pool of structural order parameters and build simple, interpretable approximations that employ only a small number of these order parameters. Example applications with a small molecule's rotational dynamics and simulations of protein conformational change and folding show that this approach can filter through statistical noise to identify simple reaction coordinates from complex dynamics.

    View details for DOI 10.1063/1.4974306

    View details for PubMedID 28147508

    View details for PubMedCentralID PMC5272828

  • MSMBuilder: Statistical Models for Biomolecular Dynamics BIOPHYSICAL JOURNAL Harrigan, M. P., Sultan, M. M., Hernandez, C. X., Husic, B. E., Eastman, P., Schwantes, C. R., Beauchamp, K. A., McGibbon, R. T., Pande, V. S. 2017; 112 (1): 10-15


    MSMBuilder is a software package for building statistical models of high-dimensional time-series data. It is designed with a particular focus on the analysis of atomistic simulations of biomolecular dynamics such as protein folding and conformational change. MSMBuilder is named for its ability to construct Markov state models (MSMs), a class of models that has gained favor among computational biophysicists. In addition to both well-established and newer MSM methods, the package includes complementary algorithms for understanding time-series data such as hidden Markov models and time-structure based independent component analysis. MSMBuilder boasts an easy to use command-line interface, as well as clear and consistent abstractions through its Python application programming interface. MSMBuilder was developed with careful consideration for compatibility with the broader machine learning community by following the design of scikit-learn. The package is used primarily by practitioners of molecular dynamics, but is just as applicable to other computational or experimental time-series measurements.

    View details for Web of Science ID 000392163500004

    View details for PubMedID 28076801

    View details for PubMedCentralID PMC5232355

  • Training and Validation of a Liquid-Crystalline Phospholipid Bilayer Force Field JOURNAL OF CHEMICAL THEORY AND COMPUTATION McKiernan, K. A., Wang, L., Pande, V. S. 2016; 12 (12): 5960-5967


    We present a united-atom model (gb-fb15) for the molecular dynamics simulation of hydrated liquid-crystalline dipalmitoylphosphatidylcholine (DPPC) phospholipid bilayers. This model was constructed through the parameter-space minimization of a regularized least-squares objective function via the ForceBalance method. The objective function was computed using a training set of experimental bilayer area per lipid and deuterium order parameter. This model was validated by comparison to experimental volume per lipid, X-ray scattering form factor, thermal area expansivity, area compressibility modulus, and lipid lateral diffusion coefficient. These comparisons demonstrate that gb-fb15 is robust to temperature variation and an improvement over the original model for both the training and validation properties.

    View details for DOI 10.1021/acs.jctc.6b00801

    View details for Web of Science ID 000389866500025

    View details for PubMedID 27786477

  • Optimized parameter selection reveals trends in Markov state models for protein folding JOURNAL OF CHEMICAL PHYSICS Husic, B. E., McGibbon, R. T., Sultan, M. M., Pande, V. S. 2016; 145 (19)


    As molecular dynamics simulations access increasingly longer time scales, complementary advances in the analysis of biomolecular time-series data are necessary. Markov state models offer a powerful framework for this analysis by describing a system's states and the transitions between them. A recently established variational theorem for Markov state models now enables modelers to systematically determine the best way to describe a system's dynamics. In the context of the variational theorem, we analyze ultra-long folding simulations for a canonical set of twelve proteins [K. Lindorff-Larsen et al., Science 334, 517 (2011)] by creating and evaluating many types of Markov state models. We present a set of guidelines for constructing Markov state models of protein folding; namely, we recommend the use of cross-validation and a kinetically motivated dimensionality reduction step for improved descriptions of folding dynamics. We also warn that precise kinetics predictions rely on the features chosen to describe the system and pose the description of kinetic uncertainty across ensembles of models as an open issue.

    View details for DOI 10.1063/1.4967809

    View details for Web of Science ID 000388956900007

    View details for PubMedID 27875868

    View details for PubMedCentralID PMC5116026

  • Computational Modeling of beta-Secretase 1 (BACE-1) Inhibitors Using Ligand Based Approaches JOURNAL OF CHEMICAL INFORMATION AND MODELING Subramanian, G., Ramsundar, B., Pande, V., Denny, R. A. 2016; 56 (10): 1936-1949


    The binding affinities (IC50) reported for diverse structural and chemical classes of human β-secretase 1 (BACE-1) inhibitors in literature were modeled using multiple in silico ligand based modeling approaches and statistical techniques. The descriptor space encompasses simple binary molecular fingerprint, one- and two-dimensional constitutional, physicochemical, and topological descriptors, and sophisticated three-dimensional molecular fields that require appropriate structural alignments of varied chemical scaffolds in one universal chemical space. The affinities were modeled using qualitative classification or quantitative regression schemes involving linear, nonlinear, and deep neural network (DNN) machine-learning methods used in the scientific literature for quantitative-structure activity relationships (QSAR). In a departure from tradition, ∼20% of the chemically diverse data set (205 compounds) was used to train the model with the remaining ∼80% of the structural and chemical analogs used as part of an external validation (1273 compounds) and prospective test (69 compounds) sets respectively to ascertain the model performance. The machine-learning methods investigated herein performed well in both the qualitative classification (∼70% accuracy) and quantitative IC50 predictions (RMSE ∼ 1 log). The success of the 2D descriptor based machine learning approach when compared against the 3D field based technique pursued for hBACE-1 inhibitors provides a strong impetus for systematically applying such methods during the lead identification and optimization efforts for other protein families as well.

    View details for DOI 10.1021/acs.jcim.6b00290

    View details for Web of Science ID 000386315000006

    View details for PubMedID 27689393

  • Advanced Potential Energy Surfaces for Molecular Simulation JOURNAL OF PHYSICAL CHEMISTRY B Albaugh, A., Boateng, H. A., Bradshaw, R. T., Demerdash, O. N., Dziedzic, J., Mao, Y., Margul, D. T., Swails, J., Zeng, Q., Case, D. A., Eastman, P., Wang, L., Essex, J. W., Head-Gordon, M., Pande, V. S., Ponder, J. W., Shao, Y., Skylaris, C., Todorov, I. T., Tuckerman, M. E., Head-Gordon, T. 2016; 120 (37): 9811-9832


    Advanced potential energy surfaces are defined as theoretical models that explicitly include many-body effects that transcend the standard fixed-charge, pairwise-additive paradigm typically used in molecular simulation. However, several factors relating to their software implementation have precluded their widespread use in condensed-phase simulations: the computational cost of the theoretical models, a paucity of approximate models and algorithmic improvements that can ameliorate their cost, underdeveloped interfaces and limited dissemination in computational code bases that are widely used in the computational chemistry community, and software implementations that have not kept pace with modern high-performance computing (HPC) architectures, such as multicore CPUs and modern graphics processing units (GPUs). In this Feature Article we review recent progress made in these areas, including well-defined polarization approximations and new multipole electrostatic formulations, novel methods for solving the mutual polarization equations and increasing the MD time step, combining linear-scaling electronic structure methods with new QM/MM methods that account for mutual polarization between the two regions, and the greatly improved software deployment of these models and methods onto GPU and CPU hardware platforms. We have now approached an era where multipole-based polarizable force fields can be routinely used to obtain computational results comparable to state-of-the-art density functional theory while reaching sampling statistics that are acceptable when compared to that obtained from simpler fixed partial charge force fields.

    View details for DOI 10.1021/acs.jpcb.6b06414

    View details for Web of Science ID 000384034100001

    View details for PubMedID 27513316

  • Bayesian regularization of the length of memory in reversible sequences JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY Bacallado, S., Pande, V., Favaro, S., Trippa, L. 2016; 78 (4): 933-946

    View details for DOI 10.1111/rssb.12140

    View details for Web of Science ID 000380720300011

  • Transition path theory analysis of c-Src kinase activation PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Meng, Y., Shukla, D., Pande, V. S., Roux, B. 2016; 113 (33): 9193-9198


    Nonreceptor tyrosine kinases of the Src family are large multidomain allosteric proteins that are crucial to cellular signaling pathways. In a previous study, we generated a Markov state model (MSM) to simulate the activation of c-Src catalytic domain, used as a prototypical tyrosine kinase. The long-time kinetics of transition predicted by the MSM was in agreement with experimental observations. In the present study, we apply the framework of transition path theory (TPT) to the previously constructed MSM to characterize the main features of the activation pathway. The analysis indicates that the activating transition, in which the activation loop first opens up followed by an inward rotation of the αC-helix, takes place via a dense set of intermediate microstates distributed within a fairly broad "transition tube" in a multidimensional conformational subspace connecting the two end-point conformations. Multiple microstates with negligible equilibrium probabilities carry a large transition flux associated with the activating transition, which explains why extensive conformational sampling is necessary to accurately determine the kinetics of activation. Our results suggest that the combination of MSM with TPT provides an effective framework to represent conformational transitions in complex biomolecular systems.

    View details for DOI 10.1073/pnas.1602790113

    View details for Web of Science ID 000381399200040

    View details for PubMedID 27482115
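
    As a hedged illustration of the transition path theory quantities named in the abstract above, the sketch below computes forward committors and net reactive fluxes for a toy four-state Markov state model. The transition matrix, the choice of source and sink states, and the helper names (`forward_committor`, `net_flux`) are assumptions made for this example; none of them come from the c-Src study.

```python
def forward_committor(T, source, sink, tol=1e-12, max_iter=100_000):
    """Probability of reaching `sink` before `source`, starting from each state.

    Solves q_i = sum_j T[i][j] * q_j on intermediate states by Gauss-Seidel
    sweeps, with boundary conditions q = 0 on source and q = 1 on sink.
    """
    n = len(T)
    q = [1.0 if i in sink else 0.0 for i in range(n)]
    for _ in range(max_iter):
        delta = 0.0
        for i in range(n):
            if i in source or i in sink:
                continue
            # Move the self-transition term to the left-hand side.
            new = sum(T[i][j] * q[j] for j in range(n) if j != i) / (1.0 - T[i][i])
            delta = max(delta, abs(new - q[i]))
            q[i] = new
        if delta < tol:
            break
    return q


def net_flux(T, pi, q):
    """Net reactive flux between states of a reversible chain.

    Assumes detailed balance, so the backward committor is 1 - q and the
    gross flux is f_ij = pi_i * (1 - q_i) * T_ij * q_j for i != j.
    """
    n = len(T)
    f = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                f[i][j] = pi[i] * (1.0 - q[i]) * T[i][j] * q[j]
    return [[max(0.0, f[i][j] - f[j][i]) for j in range(n)] for i in range(n)]


# Toy model (hypothetical): a symmetric linear chain 0 <-> 1 <-> 2 <-> 3.
T = [
    [0.9, 0.1, 0.0, 0.0],
    [0.1, 0.8, 0.1, 0.0],
    [0.0, 0.1, 0.8, 0.1],
    [0.0, 0.0, 0.1, 0.9],
]
pi = [0.25, 0.25, 0.25, 0.25]  # T is symmetric, so the stationary law is uniform

q = forward_committor(T, source={0}, sink={3})
flux = net_flux(T, pi, q)
print("committor:", q)
print("net flux 0->1, 1->2, 2->3:", flux[0][1], flux[1][2], flux[2][3])
```

    In this one-dimensional toy chain all reactive flux follows a single path, so the net flux is the same across every edge. In a model with many parallel microstates the same quantities would spread over many edges, which is the situation the abstract describes as a broad "transition tube".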

  • Finding Our Way in the Dark Proteome. Journal of the American Chemical Society Bhowmick, A., Brookes, D. H., Yost, S. R., Dyson, H. J., Forman-Kay, J. D., Gunter, D., Head-Gordon, M., Hura, G. L., Pande, V. S., Wemmer, D. E., Wright, P. E., Head-Gordon, T. 2016; 138 (31): 9730-9742


    The traditional structure-function paradigm has provided significant insights for well-folded proteins in which structures can be easily and rapidly revealed by X-ray crystallography beamlines. However, approximately one-third of the human proteome is comprised of intrinsically disordered proteins and regions (IDPs/IDRs) that do not adopt a dominant well-folded structure, and therefore remain "unseen" by traditional structural biology methods. This Perspective considers the challenges raised by the "Dark Proteome", in which determining the diverse conformational substates of IDPs in their free states, in encounter complexes of bound states, and in complexes retaining significant disorder requires an unprecedented level of integration of multiple and complementary solution-based experiments that are analyzed with state-of-the-art molecular simulation, Bayesian probabilistic models, and high-throughput computation. We envision how these diverse experimental and computational tools can work together through formation of a "computational beamline" that will allow key functional features to be identified in IDP structural ensembles.

    View details for DOI 10.1021/jacs.6b06543

    View details for PubMedID 27387657

  • Molecular graph convolutions: moving beyond fingerprints. Journal of computer-aided molecular design Kearnes, S., McCloskey, K., Berndl, M., Pande, V., Riley, P. 2016; 30 (8): 595-608


    Molecular "fingerprints" encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph (atoms, bonds, distances, etc.), which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.

    View details for DOI 10.1007/s10822-016-9938-8

    View details for PubMedID 27558503
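
    To make the idea of learning directly from the molecular graph more concrete, here is a minimal sketch of one generic neighbor-aggregation step followed by a sum readout. It is not the architecture described in the paper: the scalar atom features, the weights `w_self` and `w_nbr`, and the function names are all assumptions chosen for illustration. The point it shows is that the graph-level output does not depend on how the atoms are numbered.

```python
def graph_conv_step(features, bonds, w_self, w_nbr):
    """One generic graph-convolution step on scalar atom features.

    Each atom combines its own feature with the sum of its bonded
    neighbors' features, followed by a ReLU nonlinearity.
    """
    n = len(features)
    neighbor_sum = [0.0] * n
    for a, b in bonds:  # bonds are undirected, so update both endpoints
        neighbor_sum[a] += features[b]
        neighbor_sum[b] += features[a]
    return [max(0.0, w_self * features[i] + w_nbr * neighbor_sum[i]) for i in range(n)]


def readout(features):
    """Order-independent molecule-level summary: sum over atoms."""
    return sum(features)


# Heavy atoms of ethanol (C-C-O), using atomic number as a toy feature.
features_a = [6.0, 6.0, 8.0]
bonds_a = [(0, 1), (1, 2)]

# The same molecule with the atoms listed in a different order (O, C, C).
features_b = [8.0, 6.0, 6.0]
bonds_b = [(0, 1), (1, 2)]

out_a = readout(graph_conv_step(features_a, bonds_a, w_self=0.5, w_nbr=0.25))
out_b = readout(graph_conv_step(features_b, bonds_b, w_self=0.5, w_nbr=0.25))
print(out_a, out_b)
```

    A real model would use learned weight matrices, vector-valued atom and bond features, and several stacked layers; the sketch keeps only the aggregation pattern.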

  • ROCS-derived features for virtual screening. Journal of computer-aided molecular design Kearnes, S., Pande, V. 2016; 30 (8): 609-617


    Rapid overlay of chemical structures (ROCS) is a standard tool for the calculation of 3D shape and chemical ("color") similarity. ROCS uses unweighted sums to combine many aspects of similarity, yielding parameter-free models for virtual screening. In this report, we decompose the ROCS color force field into color components and color atom overlaps, novel color similarity features that can be weighted in a system-specific manner by machine learning algorithms. In cross-validation experiments, these additional features significantly improve virtual screening performance relative to standard ROCS.

    View details for DOI 10.1007/s10822-016-9959-3

    View details for PubMedID 27624668

  • Discovery of a regioselectivity switch in nitrating P450s guided by molecular dynamics simulations and Markov models NATURE CHEMISTRY Dodani, S. C., Kiss, G., Cahn, J. K., Su, Y., Pande, V. S., Arnold, F. H. 2016; 8 (5): 419-425


    The dynamic motions of protein structural elements, particularly flexible loops, are intimately linked with diverse aspects of enzyme catalysis. Engineering of these loop regions can alter protein stability, substrate binding and even dramatically impact enzyme function. When these flexible regions are unresolvable structurally, computational reconstruction in combination with large-scale molecular dynamics simulations can be used to guide the engineering strategy. Here we present a collaborative approach that consists of both experiment and computation and led to the discovery of a single mutation in the F/G loop of the nitrating cytochrome P450 TxtE that simultaneously controls loop dynamics and completely shifts the enzyme's regioselectivity from the C4 to the C5 position of L-tryptophan. Furthermore, we find that this loop mutation is naturally present in a subset of homologous nitrating P450s and confirm that these uncharacterized enzymes exclusively produce 5-nitro-L-tryptophan, a previously unknown biosynthetic intermediate.

    View details for DOI 10.1038/NCHEM.2474

    View details for Web of Science ID 000374534100008

    View details for PubMedID 27102675

  • Markov State Models and tICA Reveal a Nonnative Folding Nucleus in Simulations of NuG2 BIOPHYSICAL JOURNAL Schwantes, C. R., Shukla, D., Pande, V. S. 2016; 110 (8): 1716-1719


    After reanalyzing simulations of NuG2 (a designed mutant of protein G) generated by Lindorff-Larsen et al. with time structure-based independent components analysis and Markov state models as well as performing 1.5 ms of additional sampling on Folding@home, we found an intermediate with a register-shift in one of the β-sheets that was visited along a minor folding pathway. The minor folding pathway was initiated by the register-shifted sheet, which is composed of solely nonnative contacts, suggesting that for some peptides, nonnative contacts can lead to productive folding events. To confirm this experimentally, we suggest a mutational strategy for stabilizing the register shift, as well as an infrared experiment that could observe the nonnative folding nucleus.

    View details for DOI 10.1016/j.bpj.2016.03.026

    View details for Web of Science ID 000374859600006

    View details for PubMedID 27119632

  • Conformational heterogeneity of the calmodulin binding interface NATURE COMMUNICATIONS Shukla, D., Peck, A., Pande, V. S. 2016; 7


    Calmodulin (CaM) is a ubiquitous Ca²⁺ sensor and a crucial signalling hub in many pathways aberrantly activated in disease. However, the mechanistic basis of its ability to bind diverse signalling molecules including G-protein-coupled receptors, ion channels and kinases remains poorly understood. Here we harness the high resolution of molecular dynamics simulations and the analytical power of Markov state models to dissect the molecular underpinnings of CaM binding diversity. Our computational model indicates that in the absence of Ca²⁺, sub-states in the folded ensemble of CaM's C-terminal domain present chemically and sterically distinct topologies that may facilitate conformational selection. Furthermore, we find that local unfolding is off-pathway for the exchange process relevant for peptide binding, in contrast to prior hypotheses that unfolding might account for binding diversity. Finally, our model predicts a novel binding interface that is well-populated in the Ca²⁺-bound regime and, thus, a candidate for pharmacological intervention.

    View details for DOI 10.1038/ncomms10910

    View details for Web of Science ID 000373529600001

    View details for PubMedID 27040077

    View details for PubMedCentralID PMC4822001

  • Identification of significantly mutated regions across cancer types highlights a rich landscape of functional molecular alterations. Nature genetics Araya, C. L., Cenik, C., Reuter, J. A., Kiss, G., Pande, V. S., Snyder, M. P., Greenleaf, W. J. 2016; 48 (2): 117-125


    Cancer sequencing studies have primarily identified cancer driver genes by the accumulation of protein-altering mutations. An improved method would be annotation independent, sensitive to unknown distributions of functions within proteins and inclusive of noncoding drivers. We employed density-based clustering methods in 21 tumor types to detect variably sized significantly mutated regions (SMRs). SMRs reveal recurrent alterations across a spectrum of coding and noncoding elements, including transcription factor binding sites and untranslated regions mutated in up to ∼15% of specific tumor types. SMRs demonstrate spatial clustering of alterations in molecular domains and at interfaces, often with associated changes in signaling. Mutation frequencies in SMRs demonstrate that distinct protein regions are differentially mutated across tumor types, as exemplified by a linker region of PIK3CA in which biophysical simulations suggest that mutations affect regulatory interactions. The functional diversity of SMRs underscores both the varied mechanisms of oncogenic misregulation and the advantage of functionally agnostic driver identification.

    View details for DOI 10.1038/ng.3471

    View details for PubMedID 26691984

  • Identification of significantly mutated regions across cancer types highlights a rich landscape of functional molecular alterations NATURE GENETICS Araya, C. L., Cenik, C., Reuter, J. A., Kiss, G., Pande, V. S., Snyder, M. P., Greenleaf, W. J. 2016; 48 (2): 117-125

    View details for DOI 10.1038/ng.3471

    View details for Web of Science ID 000369043900008

  • Automated Discovery and Refinement of Reactive Molecular Dynamics Pathways JOURNAL OF CHEMICAL THEORY AND COMPUTATION Wang, L., McGibbon, R. T., Pande, V. S., Martinez, T. J. 2016; 12 (2): 638-649


    We describe a flexible and broadly applicable energy refinement method, "nebterpolation," for identifying and characterizing the reaction events in a molecular dynamics (MD) simulation. The new method is applicable to ab initio simulations with hundreds of atoms containing complex and multimolecular reaction events. A key aspect of nebterpolation is smoothing of the reactive MD trajectory in internal coordinates to initiate the search for the reaction path on the potential energy surface. We apply nebterpolation to analyze the reaction events in an ab initio nanoreactor simulation that discovers new molecules and mechanisms, including a C-C coupling pathway for glycolaldehyde synthesis. We find that the new method, which incorporates information from the MD trajectory that connects reactants with products, produces a dramatically distinct set of minimum energy paths compared to existing approaches that start from information for the reaction end points alone. The energy refinement method described here represents a key component of an emerging simulation paradigm where molecular dynamics simulations are applied to discover the possible reaction mechanisms.

    View details for DOI 10.1021/acs.jctc.5b00830

    View details for Web of Science ID 000370112900018

    View details for PubMedID 26683346

  • Corrigendum: Conformational heterogeneity of the calmodulin binding interface. Nature communications Shukla, D., Peck, A., Pande, V. S. 2016; 7: 12318

    View details for DOI 10.1038/ncomms12318

    View details for PubMedID 27506931

  • CHARMM-GUI Input Generator for NAMD, GROMACS, AMBER, OpenMM, and CHARMM/OpenMM Simulations Using the CHARMM36 Additive Force Field JOURNAL OF CHEMICAL THEORY AND COMPUTATION Lee, J., Cheng, X., Swails, J. M., Yeom, M. S., Eastman, P. K., Lemkul, J. A., Wei, S., Buckner, J., Jeong, J. C., Qi, Y., Jo, S., Pande, V. S., Case, D. A., Brooks, C. L., MacKerell, A. D., Klauda, J. B., Im, W. 2016; 12 (1): 405-413
  • MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories. Biophysical journal McGibbon, R. T., Beauchamp, K. A., Harrigan, M. P., Klein, C., Swails, J. M., Hernández, C. X., Schwantes, C. R., Wang, L., Lane, T. J., Pande, V. S. 2015; 109 (8): 1528-1532

    View details for DOI 10.1016/j.bpj.2015.08.015

    View details for PubMedID 26488642

  • Corrigendum: Cloud-based simulations on Google Exacycle reveal ligand modulation of GPCR activation pathways. Nature chemistry Kohlhoff, K. J., Shukla, D., Lawrenz, M., Bowman, G. R., Konerding, D. E., Belov, D., Altman, R. B., Pande, V. S. 2015; 7 (9): 759

    View details for DOI 10.1038/nchem.2272

    View details for PubMedID 26291949

  • Heat dissipation guides activation in signaling proteins. Proceedings of the National Academy of Sciences of the United States of America Weber, J. K., Shukla, D., Pande, V. S. 2015; 112 (33): 10377-10382


    Life is fundamentally a nonequilibrium phenomenon. At the expense of dissipated energy, living things perform irreversible processes that allow them to propagate and reproduce. Within cells, evolution has designed nanoscale machines to do meaningful work with energy harnessed from a continuous flux of heat and particles. As dictated by the Second Law of Thermodynamics and its fluctuation theorem corollaries, irreversibility in nonequilibrium processes can be quantified in terms of how much entropy such dynamics produce. In this work, we seek to address a fundamental question linking biology and nonequilibrium physics: can the evolved dissipative pathways that facilitate biomolecular function be identified by their extent of entropy production in general relaxation processes? We here synthesize massive molecular dynamics simulations, Markov state models (MSMs), and nonequilibrium statistical mechanical theory to probe dissipation in two key classes of signaling proteins: kinases and G-protein-coupled receptors (GPCRs). Applying machinery from large deviation theory, we use MSMs constructed from protein simulations to generate dynamics conforming to positive levels of entropy production. We note the emergence of an array of peaks in the dynamical response (transient analogs of phase transitions) that draw the proteins between distinct levels of dissipation, and we see that the binding of ATP and agonist molecules modifies the observed dissipative landscapes. Overall, we find that dissipation is tightly coupled to activation in these signaling systems: dominant entropy-producing trajectories become localized near important barriers along known biological activation pathways. We go on to classify an array of equilibrium and nonequilibrium molecular switches that harmonize to promote functional dynamics.

    View details for DOI 10.1073/pnas.1501804112

    View details for PubMedID 26240354

  • Revised Parameters for the AMOEBA Polarizable Atomic Multipole Water Model JOURNAL OF PHYSICAL CHEMISTRY B Laury, M. L., Wang, L., Pande, V. S., Head-Gordon, T., Ponder, J. W. 2015; 119 (29): 9423-9437

    View details for DOI 10.1021/jp510896n

    View details for Web of Science ID 000358623900064

  • Revised Parameters for the AMOEBA Polarizable Atomic Multipole Water Model. Journal of physical chemistry. B Laury, M. L., Wang, L., Pande, V. S., Head-Gordon, T., Ponder, J. W. 2015; 119 (29): 9423-9437


    A set of improved parameters for the AMOEBA polarizable atomic multipole water model is developed. An automated procedure, ForceBalance, is used to adjust model parameters to enforce agreement with ab initio-derived results for water clusters and experimental data for a variety of liquid phase properties across a broad temperature range. The values reported here for the new AMOEBA14 water model represent a substantial improvement over the previous AMOEBA03 model. The AMOEBA14 model accurately predicts the temperature of maximum density and qualitatively matches the experimental density curve across temperatures from 249 to 373 K. Excellent agreement is observed for the AMOEBA14 model in comparison to experimental properties as a function of temperature, including the second virial coefficient, enthalpy of vaporization, isothermal compressibility, thermal expansion coefficient, and dielectric constant. The viscosity, self-diffusion constant, and surface tension are also well reproduced. In comparison to high-level ab initio results for clusters of 2-20 water molecules, the AMOEBA14 model yields results similar to AMOEBA03 and the direct polarization iAMOEBA models. With advances in computing power, calibration data, and optimization techniques, we recommend the use of the AMOEBA14 water model for future studies employing a polarizable water model.

    View details for DOI 10.1021/jp510896n

    View details for PubMedID 25683601

  • Efficient maximum likelihood parameterization of continuous-time Markov processes. Journal of chemical physics McGibbon, R. T., Pande, V. S. 2015; 143 (3): 034109


    Continuous-time Markov processes over finite state-spaces are widely used to model dynamical processes in many fields of natural and social science. Here, we introduce a maximum likelihood estimator for constructing such models from data observed at a finite time interval. This estimator is dramatically more efficient than prior approaches, enables the calculation of deterministic confidence intervals in all model parameters, and can easily enforce important physical constraints on the models such as detailed balance. We demonstrate and discuss the advantages of these models over existing discrete-time Markov models for the analysis of molecular dynamics simulations.

    View details for DOI 10.1063/1.4926516

    View details for PubMedID 26203016
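
    The abstract above contrasts continuous-time models with discrete-time Markov models observed at a finite time interval. The two are linked by the standard relation T(τ) = exp(Kτ), where K is the rate matrix and T(τ) the transition matrix at lag time τ. The sketch below only illustrates that relation for a two-state system and checks it against the known closed form; it is not the maximum likelihood estimator introduced in the paper. The rates `a` and `b`, the lag time `tau`, and the helper names are assumptions for this example.

```python
import math


def mat_mul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_exp(A, terms=30, squarings=10):
    """Matrix exponential via a Taylor series with scaling and squaring."""
    n = len(A)
    scale = 2 ** squarings
    S = [[x / scale for x in row] for row in A]  # scaled-down matrix
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # running term S^k / k!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, S)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):  # undo the scaling: exp(A) = exp(S)^(2^squarings)
        result = mat_mul(result, result)
    return result


def transition_matrix(K, tau):
    """Discrete-time transition matrix implied by rate matrix K at lag tau."""
    return mat_exp([[k_ij * tau for k_ij in row] for row in K])


# Hypothetical two-state system: rate a for 0 -> 1 and rate b for 1 -> 0.
a, b, tau = 2.0, 1.0, 0.3
K = [[-a, a], [b, -b]]  # rows of a rate matrix sum to zero
T = transition_matrix(K, tau)

# Closed form for the two-state case, used here as a check.
expected_T01 = a / (a + b) * (1.0 - math.exp(-(a + b) * tau))
print(T[0][1], expected_T01)
```

    Because each row of K sums to zero, each row of T(τ) sums to one for any lag time. In this two-state example the stationary distribution (b, a)/(a + b) also satisfies detailed balance with T(τ), the kind of physical constraint the abstract mentions.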

  • Efficient maximum likelihood parameterization of continuous-time Markov processes JOURNAL OF CHEMICAL PHYSICS McGibbon, R. T., Pande, V. S. 2015; 143 (3)

    View details for DOI 10.1063/1.4926516

    View details for Web of Science ID 000358429800009

  • United polarizable multipole water model for molecular mechanics simulation JOURNAL OF CHEMICAL PHYSICS Qi, R., Wang, L., Wang, Q., Pande, V. S., Ren, P. 2015; 143 (1)


    We report the development of a united AMOEBA (uAMOEBA) polarizable water model, which is computationally 3-5 times more efficient than the three-site AMOEBA03 model in molecular dynamics simulations while providing comparable accuracy for gas-phase and liquid properties. In this coarse-grained polarizable water model, both electrostatic (permanent and induced) and van der Waals representations have been reduced to a single site located at the oxygen atom. The permanent charge distribution is described via the molecular dipole and quadrupole moments and the many-body polarization via an isotropic molecular polarizability, all located at the oxygen center. Similarly, a single van der Waals interaction site is used for each water molecule. Hydrogen atoms are retained only for the purpose of defining local frames for the molecular multipole moments and intramolecular vibrational modes. The parameters have been derived based on a combination of ab initio quantum mechanical and experimental data set containing gas-phase cluster structures and energies, and liquid thermodynamic properties. For validation, additional properties including dimer interaction energy, liquid structures, self-diffusion coefficient, and shear viscosity have been evaluated. The results demonstrate good transferability from the gas to the liquid phase over a wide range of temperatures, and from nonpolar to polar environments, due to the presence of molecular polarizability. The water coordination, hydrogen-bonding structure, and dynamic properties given by uAMOEBA are similar to those derived from the all-atom AMOEBA03 model and experiments. Thus, the current model is an accurate and efficient alternative for modeling water.

    View details for DOI 10.1063/1.4923338

    View details for Web of Science ID 000357873900031

    View details for PubMedID 26156485

  • OpenMM: A Hardware Independent Framework for Molecular Simulations. Computing in science & engineering Eastman, P., Pande, V. S. 2015; 12 (4): 34-39


    The wide diversity of computer architectures today requires a new approach to software development. OpenMM is a framework for molecular mechanics simulations, allowing a single program to run efficiently on a variety of hardware platforms.

    View details for PubMedID 26146490

  • Percolation-like phase transitions in network models of protein dynamics JOURNAL OF CHEMICAL PHYSICS Weber, J. K., Pande, V. S. 2015; 142 (21)


    In broad terms, percolation theory describes the conditions under which clusters of nodes are fully connected in a random network. A percolation phase transition occurs when, as edges are added to a network, its largest connected cluster abruptly jumps from insignificance to complete dominance. In this article, we apply percolation theory to meticulously constructed networks of protein folding dynamics called Markov state models. As rare fluctuations are systematically repressed (or reintroduced), we observe percolation-like phase transitions in protein folding networks: whole sets of conformational states switch from nearly complete isolation to complete connectivity in a rapid fashion. We analyze the general and critical properties of these phase transitions in seven protein systems and discuss how closely dynamics on protein folding landscapes relate to percolation on random lattices.

    View details for DOI 10.1063/1.4921989

    View details for Web of Science ID 000355931800113

    View details for PubMedID 26049529
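
    The percolation transition described in the abstract above (the largest connected cluster jumping from insignificance to dominance as edges are added) can be reproduced on a plain random graph. The sketch below does so with a union-find structure. It uses a generic random graph rather than a Markov state model network, and the node count, edge count, seed, and all names are assumptions for illustration.

```python
import random


class DisjointSet:
    """Union-find that tracks the size of the largest connected cluster."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.largest = 1

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return  # already connected (also covers self-loops)
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the smaller tree under the larger
        self.size[ra] += self.size[rb]
        self.largest = max(self.largest, self.size[ra])


def largest_cluster_curve(n_nodes, n_edges, seed=0):
    """Fraction of nodes in the largest cluster after each random edge."""
    rng = random.Random(seed)
    ds = DisjointSet(n_nodes)
    curve = []
    for _ in range(n_edges):
        ds.union(rng.randrange(n_nodes), rng.randrange(n_nodes))
        curve.append(ds.largest / n_nodes)
    return curve


curve = largest_cluster_curve(n_nodes=2000, n_edges=3000, seed=0)
for edges in (500, 1000, 1500, 3000):
    print(edges, "edges -> largest cluster fraction", round(curve[edges - 1], 3))
```

    For a random graph of this kind the giant cluster is expected to emerge when the number of edges is about half the number of nodes, which is why the printed checkpoints bracket 1000 edges.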

  • Potential-Based Dynamical Reweighting for Markov State Models of Protein Dynamics JOURNAL OF CHEMICAL THEORY AND COMPUTATION Weber, J. K., Pande, V. S. 2015; 11 (6): 2412-2420
  • A network of molecular switches controls the activation of the two-component response regulator NtrC NATURE COMMUNICATIONS Vanatta, D. K., Shukla, D., Lawrenz, M., Pande, V. S. 2015; 6


    Recent successes in simulating protein structure and folding dynamics have demonstrated the power of molecular dynamics to predict the long timescale behaviour of proteins. Here, we extend and improve these methods to predict molecular switches that characterize conformational change pathways between the active and inactive state of nitrogen regulatory protein C (NtrC). By employing unbiased Markov state model-based molecular dynamics simulations, we construct a dynamic picture of the activation pathways of this key bacterial signalling protein that is consistent with experimental observations and predicts new mutants that could be used for validation of the mechanism. Moreover, these results suggest a novel mechanistic paradigm for conformational switching.

    View details for DOI 10.1038/ncomms8283

    View details for Web of Science ID 000357170800010

    View details for PubMedID 26073186

  • The Dynamic Conformational Cycle of the Group I Chaperonin C-Termini Revealed via Molecular Dynamics Simulation PLOS ONE Dalton, K. M., Frydman, J., Pande, V. S. 2015; 10 (3)

    View details for DOI 10.1371/journal.pone.0117724

    View details for Web of Science ID 000352134700014

    View details for PubMedID 25822285

  • Variational cross-validation of slow dynamical modes in molecular kinetics. Journal of chemical physics McGibbon, R. T., Pande, V. S. 2015; 142 (12): 124105


    Markov state models are a widely used method for approximating the eigenspectrum of the molecular dynamics propagator, yielding insight into the long-timescale statistical kinetics and slow dynamical modes of biomolecular systems. However, the lack of a unified theoretical framework for choosing between alternative models has hampered progress, especially for non-experts applying these methods to novel biological systems. Here, we consider cross-validation with a new objective function for estimators of these slow dynamical modes, a generalized matrix Rayleigh quotient (GMRQ), which measures the ability of a rank-m projection operator to capture the slow subspace of the system. It is shown that a variational theorem bounds the GMRQ from above by the sum of the first m eigenvalues of the system's propagator, but that this bound can be violated when the requisite matrix elements are estimated subject to statistical uncertainty. This overfitting can be detected and avoided through cross-validation. These results make it possible to construct Markov state models for protein dynamics in a way that appropriately captures the tradeoff between systematic and statistical errors.

    View details for DOI 10.1063/1.4916292

    View details for PubMedID 25833563

  • Variational cross-validation of slow dynamical modes in molecular kinetics JOURNAL OF CHEMICAL PHYSICS McGibbon, R. T., Pande, V. S. 2015; 142 (12)

    View details for DOI 10.1063/1.4916292

    View details for Web of Science ID 000352316700007

    View details for PubMedID 25833563

  • Entropy-production-driven oscillators in simple nonequilibrium networks PHYSICAL REVIEW E Weber, J. K., Pande, V. S. 2015; 91 (3)


    The development of tractable nonequilibrium simulation methods represents a bottleneck for efforts to describe the functional dynamics that occur within living cells. We here employ a nonequilibrium approach called the λ ensemble to characterize the dissipative dynamics of a simple Markovian network driven by an external potential. In the highly dissipative regime brought about by the λ bias, we observe a dynamical structure characteristic of cellular architectures: The entropy production drives a damped oscillator over state populations in the network. We illustrate the properties of such oscillations in weakly and strongly driven regimes, and we discuss how control structures associated with the "dynamical phase transition" in the system can be related to switches and oscillators in cellular dynamics.

    View details for DOI 10.1103/PhysRevE.91.032136

    View details for Web of Science ID 000353915500005

    View details for PubMedID 25871083

  • Conserve Water: A Method for the Analysis of Solvent in Molecular Dynamics JOURNAL OF CHEMICAL THEORY AND COMPUTATION Harrigan, M. P., Shukla, D., Pande, V. S. 2015; 11 (3): 1094-1101

    View details for DOI 10.1021/ct5010017

    View details for Web of Science ID 000350918300027

  • Modeling Molecular Kinetics with tICA and the Kernel Trick JOURNAL OF CHEMICAL THEORY AND COMPUTATION Schwantes, C. R., Pande, V. S. 2015; 11 (2): 600-608

    View details for DOI 10.1021/ct5007357

    View details for Web of Science ID 000349934400022

  • Markov State Models Provide Insights into Dynamic Modulation of Protein Function ACCOUNTS OF CHEMICAL RESEARCH Shukla, D., Hernandez, C. X., Weber, J. K., Pande, V. S. 2015; 48 (2): 414-422


    CONSPECTUS: Protein function is inextricably linked to protein dynamics. As we move from a static structural picture to a dynamic ensemble view of protein structure and function, novel computational paradigms are required for observing and understanding conformational dynamics of proteins and its functional implications. In principle, molecular dynamics simulations can provide the time evolution of atomistic models of proteins, but the long time scales associated with functional dynamics make it difficult to observe rare dynamical transitions. The issue of extracting essential functional components of protein dynamics from noisy simulation data presents another set of challenges in obtaining an unbiased understanding of protein motions. Therefore, a methodology that provides a statistical framework for efficient sampling and a human-readable view of the key aspects of functional dynamics from data analysis is required. The Markov state model (MSM), which has recently become popular worldwide for studying protein dynamics, is an example of such a framework. In this Account, we review the use of Markov state models for efficient sampling of the hierarchy of time scales associated with protein dynamics, automatic identification of key conformational states, and the degrees of freedom associated with slow dynamical processes. Applications of MSMs for studying long time scale phenomena such as activation mechanisms of cellular signaling proteins have yielded novel insights into protein function. In particular, from MSMs built using large-scale simulations of GPCRs and kinases, we have shown that complex conformational changes in proteins can be described in terms of structural changes in key structural motifs or "molecular switches" within the protein, the transitions between functionally active and inactive states of proteins proceed via multiple pathways, and ligand or substrate binding modulates the flux through these pathways.
    Finally, MSMs also provide a theoretical toolbox for studying the effect of nonequilibrium perturbations on conformational dynamics. Considering that protein dynamics in vivo occur under nonequilibrium conditions, MSMs coupled with nonequilibrium statistical mechanics provide a way to connect cellular components to their functional environments. Nonequilibrium perturbations of protein folding MSMs reveal the presence of dynamically frozen glass-like states in their conformational landscape. These frozen states are also observed to be rich in β-sheets, which indicates their possible role in the nucleation of β-sheet rich aggregates such as those observed in amyloid-fibril formation. Finally, we describe how MSMs have been used to understand the dynamical behavior of intrinsically disordered proteins such as amyloid-β, human islet amyloid polypeptide, and p53. While certainly not a panacea for studying functional dynamics, MSMs provide a rigorous theoretical foundation for understanding complex entropically dominated processes and a convenient lens for viewing protein motions.

    View details for DOI 10.1021/ar5002999

    View details for Web of Science ID 000349806300028

    View details for PubMedID 25625937

    View details for PubMedCentralID PMC4333613

  • Cloud computing approaches for prediction of ligand binding poses and pathways SCIENTIFIC REPORTS Lawrenz, M., Shukla, D., Pande, V. S. 2015; 5


    We describe an innovative protocol for ab initio prediction of ligand crystallographic binding poses and highly effective analysis of large datasets generated for protein-ligand dynamics. We include a procedure for setup and performance of distributed molecular dynamics simulations on cloud computing architectures, a model for efficient analysis of simulation data, and a metric for evaluation of model convergence. We give accurate binding pose predictions for five ligands ranging in affinity from 7 nM to > 200 μM for the immunophilin protein FKBP12, for expedited results in cases where experimental structures are difficult to produce. Our approach goes beyond single, low energy ligand poses to give quantitative kinetic information that can inform protein engineering and ligand design.

    View details for DOI 10.1038/srep07918

    View details for Web of Science ID 000348164000001

    View details for PubMedID 25608737

  • The dynamic conformational cycle of the group I chaperonin C-termini revealed via molecular dynamics simulation. PloS one Dalton, K. M., Frydman, J., Pande, V. S. 2015; 10 (3)


    Chaperonins are large ring shaped oligomers that facilitate protein folding by encapsulation within a central cavity. All chaperonins possess flexible C-termini which protrude from the equatorial domain of each subunit into the central cavity. Biochemical evidence suggests that the termini play an important role in the allosteric regulation of the ATPase cycle, in substrate folding and in complex assembly and stability. Despite the tremendous wealth of structural data available for numerous orthologous chaperonins, little structural information is available regarding the residues within the C-terminus. Herein, molecular dynamics simulations are presented which localize the termini throughout the nucleotide cycle of the group I chaperonin, GroE, from Escherichia coli. The simulation results predict that the termini undergo a heretofore unappreciated conformational cycle which is coupled to the nucleotide state of the enzyme. As such, these results have profound implications for the mechanism by which GroE utilizes nucleotide and folds client proteins.

    View details for DOI 10.1371/journal.pone.0117724

    View details for PubMedID 25822285

  • Elucidating Ligand-Modulated Conformational Landscape of GPCRs Using Cloud-Computing Approaches. Methods in enzymology Shukla, D., Lawrenz, M., Pande, V. S. 2015; 557: 551-572


    G-protein-coupled receptors (GPCRs) are a versatile family of membrane-bound signaling proteins. Despite the recent successes in obtaining crystal structures of GPCRs, much needs to be learned about the conformational changes associated with their activation. Furthermore, the mechanism by which ligands modulate the activation of GPCRs has remained elusive. Molecular simulations provide a way of obtaining a detailed atomistic description of GPCR activation dynamics. However, simulating GPCR activation is challenging due to the long timescales involved and the associated challenge of gaining insights from the "Big" simulation datasets. Here, we demonstrate how cloud-computing approaches have been used to tackle these challenges and obtain insights into the activation mechanism of GPCRs. In particular, we review the use of Markov state model (MSM)-based sampling algorithms for sampling milliseconds of dynamics of a major drug target, the G-protein-coupled receptor β2-AR. MSMs of agonist and inverse agonist-bound β2-AR reveal multiple activation pathways and how ligands function via modulation of the ensemble of activation pathways. We target this ensemble of conformations with computer-aided drug design approaches, with the goal of designing drugs that interact more closely with diverse receptor states, for overall increased efficacy and specificity. We conclude by discussing how cloud-based approaches present a powerful and broadly available tool for routinely studying complex biological systems.

    View details for DOI 10.1016/bs.mie.2014.12.007

    View details for PubMedID 25950981

  • Automatic Selection of Order Parameters in the Analysis of Large Scale Molecular Dynamics Simulations JOURNAL OF CHEMICAL THEORY AND COMPUTATION Sultan, M. M., Kiss, G., Shukla, D., Pande, V. S. 2014; 10 (12): 5217-5223

    View details for DOI 10.1021/ct500353m

    View details for Web of Science ID 000346324000001

  • Discovering chemistry with an ab initio nanoreactor NATURE CHEMISTRY Wang, L., Titov, A., McGibbon, R., Liu, F., Pande, V. S., Martinez, T. J. 2014; 6 (12): 1044-1048

    View details for DOI 10.1038/NCHEM.2099

    View details for Web of Science ID 000345429200008

  • Unravelling the distinct strains of Tharu ancestry EUROPEAN JOURNAL OF HUMAN GENETICS Chaubey, G., Singh, M., Crivellaro, F., Tamang, R., Nandan, A., Singh, K., Sharma, V. K., Pathak, A. K., Shah, A. M., Sharma, V., Singh, V. K., Rani, D. S., Rai, N., Kushniarevich, A., Ilumaee, A., Karmin, M., Phillip, A., Verma, A., Prank, E., Singh, V. K., Li, B., Govindaraj, P., Chaubey, A. K., Dubey, P. K., Reddy, A. G., Premkumar, K., Vishnupriya, S., Pande, V., Parik, J., Rootsi, S., Endicott, P., Metspalu, M., Lahr, M. M., van Driem, G., Villems, R., Kivisild, T., Singh, L., Thangaraj, K. 2014; 22 (12): 1404-1412


    The northern region of the Indian subcontinent is a vast landscape interlaced by diverse ecologies, for example, the Gangetic Plain and the Himalayas. A great number of ethnic groups are found there, displaying a multitude of languages and cultures. The Tharu is one of the largest and most linguistically diverse of such groups, scattered across the Tarai region of Nepal and bordering Indian states. Their origins are uncertain. Hypotheses have been advanced postulating shared ancestry with Austroasiatic or Tibeto-Burman-speaking populations as well as aboriginal roots in the Tarai. Several Tharu groups speak a variety of Indo-Aryan languages, but have traditionally been described by ethnographers as representing East Asian phenotype. Their ancestry and intra-population diversity have previously been tested only for haploid (mitochondrial DNA and Y-chromosome) markers in a small portion of the population. This study presents the first systematic genetic survey of the Tharu from both Nepal and two Indian states of Uttarakhand and Uttar Pradesh, using genome-wide SNPs and haploid markers. We show that the Tharu have dual genetic ancestry as up to one-half of their gene pool is of East Asian origin. Within the South Asian proportion of the Tharu genetic ancestry, we see vestiges of their common origin in the north of the South Asian Subcontinent manifested by mitochondrial DNA haplogroup M43.

    View details for DOI 10.1038/ejhg.2014.36

    View details for Web of Science ID 000345130300012

    View details for PubMedID 24667789

  • Perspective: Markov models for long-timescale biomolecular dynamics. journal of chemical physics Schwantes, C. R., McGibbon, R. T., Pande, V. S. 2014; 141 (9): 090901


    Molecular dynamics simulations have the potential to provide atomic-level detail and insight to important questions in chemical physics that cannot be observed in typical experiments. However, simply generating a long trajectory is insufficient, as researchers must be able to transform the data in a simulation trajectory into specific scientific insights. Although this analysis step has often been taken for granted, it deserves further attention as large-scale simulations become increasingly routine. In this perspective, we discuss the application of Markov models to the analysis of large-scale biomolecular simulations. We draw attention to recent improvements in the construction of these models as well as several important open issues. In addition, we highlight recent theoretical advances that pave the way for a new generation of models of molecular kinetics.

    View details for DOI 10.1063/1.4895044

    View details for Web of Science ID 000342207400001

    View details for PubMedID 25194354
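
    As a rough illustration of the kind of Markov model analysis this perspective discusses, the sketch below counts transitions in state-labelled trajectories at a fixed lag, row-normalizes the counts into a transition matrix, and converts its eigenvalues into implied timescales. It is a minimal textbook-style estimator, not the authors' method or software; the function names `estimate_msm` and `implied_timescales` and the toy trajectory are assumptions made for illustration.

```python
import numpy as np

def estimate_msm(dtrajs, n_states, lag=1):
    """Row-normalized transition matrix from discrete (state-labelled) trajectories.

    Hypothetical helper for illustration: a plain count-and-normalize estimate,
    with no reversibility constraint or statistical error model.
    """
    counts = np.zeros((n_states, n_states))
    for traj in dtrajs:
        traj = np.asarray(traj)
        # Count every observed jump i -> j separated by `lag` frames.
        np.add.at(counts, (traj[:-lag], traj[lag:]), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    safe = np.where(row_sums > 0, row_sums, 1.0)
    # States never seen as an origin get a self-loop so each row sums to 1.
    return np.where(row_sums > 0, counts / safe, np.eye(n_states))

def implied_timescales(T, lag=1):
    """Relaxation timescales t_i = -lag / ln(lambda_i) for 0 < lambda_i < 1."""
    eigvals = np.sort(np.real(np.linalg.eigvals(T)))[::-1]
    slow = eigvals[1:]  # skip the stationary eigenvalue (equal to 1)
    slow = slow[(slow > 0) & (slow < 1)]
    return -lag / np.log(slow)

# Toy two-state trajectory (made-up data, not from any simulation).
T = estimate_msm([[0, 0, 1, 1, 0, 0, 1, 1]], n_states=2, lag=1)
timescales = implied_timescales(T, lag=1)
```

    In practice the state labels would come from clustering simulation frames, and the lag time would be chosen by checking that the implied timescales level off as the lag increases.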

  • Complex pathways in folding of protein G explored by simulation and experiment. Biophysical journal Lapidus, L. J., Acharya, S., Schwantes, C. R., Wu, L., Shukla, D., King, M., DeCamp, S. J., Pande, V. S. 2014; 107 (4): 947-955


    The B1 domain of protein G has been a classic model system of folding for decades, the subject of numerous experimental and computational studies. Most of the experimental work has focused on whether the protein folds via an intermediate, but the evidence is mostly limited to relatively slow kinetic observations with a few structural probes. In this work we observe folding on the submillisecond timescale with microfluidic mixers using a variety of probes including tryptophan fluorescence, circular dichroism, and photochemical oxidation. We find that each probe yields different kinetics and compare these observations with a Markov State Model constructed from large-scale molecular dynamics simulations and find a complex network of states that yield different kinetics for different observables. We conclude that there are many folding pathways before the final folding step and that these paths do not have large free energy barriers.

    View details for DOI 10.1016/j.bpj.2014.06.037

    View details for PubMedID 25140430

  • Dynamical Phase Transitions Reveal Amyloid-like States on Protein Folding Landscapes. Biophysical journal Weber, J. K., Jack, R. L., Schwantes, C. R., Pande, V. S. 2014; 107 (4): 974-982


    Developing an understanding of protein misfolding processes presents a crucial challenge for unlocking the mysteries of human disease. In this article, we present our observations of β-sheet-rich misfolded states on a number of protein dynamical landscapes investigated through molecular dynamics simulation and Markov state models. We employ a nonequilibrium statistical mechanical theory to identify the glassy states in a protein's dynamics, and we discuss the nonnative, β-sheet-rich states that play a distinct role in the slowest dynamics within seven protein folding systems. We highlight the fundamental similarity between these states and the amyloid structures responsible for many neurodegenerative diseases, and we discuss potential consequences for mechanisms of protein aggregation and intermolecular amyloid formation.

    View details for DOI 10.1016/j.bpj.2014.06.046

    View details for PubMedID 25140433

  • Observation of correlated X-ray scattering at atomic resolution. Philosophical transactions of the Royal Society of London. Series B, Biological sciences Mendez, D., Lane, T. J., Sung, J., Sellberg, J., Levard, C., Watkins, H., Cohen, A. E., Soltis, M., Sutton, S., Spudich, J., Pande, V., Ratner, D., Doniach, S. 2014; 369 (1647)


    Tools to study disordered systems with local structural order, such as proteins in solution, remain limited. Such understanding is essential for e.g. rational drug design. Correlated X-ray scattering (CXS) has recently attracted new interest as a way to leverage next-generation light sources to study such disordered matter. The CXS experiment measures angular correlations of the intensity caused by the scattering of X-rays from an ensemble of identical particles, with disordered orientation and position. Averaging over 15 496 snapshot images obtained by exposing a sample of silver nanoparticles in solution to a micro-focused synchrotron radiation beam, we report on experimental efforts to obtain CXS signal from an ensemble in three dimensions. A correlation function was measured at wide angles corresponding to atomic resolution that matches theoretical predictions. These preliminary results suggest that other CXS experiments on disordered ensembles, such as proteins in solution, may be feasible in the future.

    View details for DOI 10.1098/rstb.2013.0315

    View details for PubMedID 24914148

  • Statistical Model Selection for Markov Models of Biomolecular Dynamics JOURNAL OF PHYSICAL CHEMISTRY B McGibbon, R. T., Schwantes, C. R., Pande, V. S. 2014; 118 (24): 6475-6481


    Markov state models provide a powerful framework for the analysis of biomolecular conformation dynamics in terms of their metastable states and transition rates. These models provide both a quantitative and comprehensible description of the long-time scale dynamics of large molecular dynamics with a Master equation and have been successfully used to study protein folding, protein conformational change, and protein-ligand binding. However, to achieve satisfactory performance, existing methodologies often require expert intervention when defining the model's discrete state space. While standard model selection methodologies focus on the minimization of systematic bias and disregard statistical error, we show that by consideration of the states' conditional distribution over conformations, both sources of error can be balanced evenhandedly. Application of techniques that consider both systematic bias and statistical error on two 100 μs molecular dynamics trajectories of the Fip35 WW domain shows agreement with existing techniques based on self-consistency of the model's relaxation time scales with more suitable results in regimes in which those time scale-based techniques encourage overfitting. By removing the need for expert tuning, these methods should reduce modeling bias and lower the barriers to entry in Markov state model construction.

    View details for DOI 10.1021/jp411822r

    View details for Web of Science ID 000337784100014

    View details for PubMedID 24738580

  • Building Force Fields: An Automatic, Systematic, and Reproducible Approach JOURNAL OF PHYSICAL CHEMISTRY LETTERS Wang, L., Martinez, T. J., Pande, V. S. 2014; 5 (11): 1885-1891

    View details for DOI 10.1021/jz500737m

    View details for Web of Science ID 000337012500017

  • Bayesian energy landscape tilting: towards concordant models of molecular ensembles. Biophysical journal Beauchamp, K. A., Pande, V. S., Das, R. 2014; 106 (6): 1381-1390


    Predicting biological structure has remained challenging for systems such as disordered proteins that take on myriad conformations. Hybrid simulation/experiment strategies have been undermined by difficulties in evaluating errors from computational model inaccuracies and data uncertainties. Building on recent proposals from maximum entropy theory and nonequilibrium thermodynamics, we address these issues through a Bayesian energy landscape tilting (BELT) scheme for computing Bayesian hyperensembles over conformational ensembles. BELT uses Markov chain Monte Carlo to directly sample maximum-entropy conformational ensembles consistent with a set of input experimental observables. To test this framework, we apply BELT to model trialanine, starting from disagreeing simulations with the force fields ff96, ff99, ff99sbnmr-ildn, CHARMM27, and OPLS-AA. BELT incorporation of limited chemical shift and ³J measurements gives convergent values of the peptide's α, β, and PPII conformational populations in all cases. As a test of predictive power, all five BELT hyperensembles recover set-aside measurements not used in the fitting and report accurate errors, even when starting from highly inaccurate simulations. BELT's principled framework thus enables practical predictions for complex biomolecular systems from discordant simulations and sparse data.

    View details for DOI 10.1016/j.bpj.2014.02.009

    View details for PubMedID 24655513

  • A Molecular Interpretation of 2D IR Protein Folding Experiments with Markov State Models. Biophysical journal Baiz, C. R., Lin, Y., Peng, C. S., Beauchamp, K. A., Voelz, V. A., Pande, V. S., Tokmakoff, A. 2014; 106 (6): 1359-1370


    The folding mechanism of the N-terminal domain of ribosomal protein L9 (NTL9(1-39)) is studied using temperature-jump (T-jump) amide I' two-dimensional infrared (2D IR) spectroscopy in combination with spectral simulations based on a Markov state model (MSM) built from millisecond-long molecular dynamics trajectories. The results provide evidence for a compact well-structured folded state and a heterogeneous fast-exchanging denatured state ensemble exhibiting residual secondary structure. The folding rate of 26.4 μs(-1) (at 80°C), extracted from the T-jump response of NTL9(1-39), compares favorably with the 18 μs(-1) obtained from the MSM. Structural decomposition of the MSM and analysis along the folding coordinate indicates that helix-formation nucleates the global folding. Simulated difference spectra, corresponding to the global folding transition of the MSM, are in qualitative agreement with measured T-jump 2D IR spectra. The experiments demonstrate the use of T-jump 2D IR spectroscopy as a valuable tool for studying protein folding, with direct connections to simulations. The results suggest that in addition to predicting the correct native structure and folding time constant, molecular dynamics simulations carried out with modern force fields provide an accurate description of folding mechanisms in small proteins.

    View details for DOI 10.1016/j.bpj.2014.02.008

    View details for PubMedID 24655511

  • SCISSORS: practical considerations. Journal of chemical information and modeling Kearnes, S. M., Haque, I. S., Pande, V. S. 2014; 54 (1): 5-15


    Molecular similarity has been effectively applied to many problems in cheminformatics and computational drug discovery, but modern methods can be prohibitively expensive for large-scale applications. The SCISSORS method rapidly approximates measures of pairwise molecular similarity such as ROCS and LINGO Tanimotos, acting as a filter to quickly reduce the size of a problem. We report an in-depth analysis of SCISSORS performance, including a mapping of the SCISSORS error distribution, benchmarking, and investigation of several algorithmic modifications. We show that SCISSORS can accurately predict multiconformer similarity and suggest a method for estimating optimal SCISSORS parameters in a data set-specific manner. These results are a useful resource for researchers seeking to incorporate SCISSORS into molecular similarity applications.

    View details for DOI 10.1021/ci400264f

    View details for PubMedID 24289274

  • Cloud-based simulations on Google Exacycle reveal ligand modulation of GPCR activation pathways NATURE CHEMISTRY Kohlhoff, K. J., Shukla, D., Lawrenz, M., Bowman, G. R., Konerding, D. E., Belov, D., Altman, R. B., Pande, V. S. 2014; 6 (1): 15-21


    Simulations can provide tremendous insight into the atomistic details of biological mechanisms, but micro- to millisecond timescales are historically only accessible on dedicated supercomputers. We demonstrate that cloud computing is a viable alternative that brings long-timescale processes within reach of a broader community. We used Google's Exacycle cloud-computing platform to simulate two milliseconds of dynamics of a major drug target, the G-protein-coupled receptor β2AR. Markov state models aggregate independent simulations into a single statistical model that is validated by previous computational and experimental results. Moreover, our models provide an atomistic description of the activation of a G-protein-coupled receptor and reveal multiple activation pathways. Agonists and inverse agonists interact differentially with these pathways, with profound implications for drug design.

    View details for DOI 10.1038/NCHEM.1821

    View details for Web of Science ID 000328951000007

    View details for PubMedID 24345941

  • Activation pathway of Src kinase reveals intermediate states as targets for drug design. Nature communications Shukla, D., Meng, Y., Roux, B., Pande, V. S. 2014; 5: 3397


    Unregulated activation of Src kinases leads to aberrant signalling, uncontrolled growth and differentiation of cancerous cells. Reaching a complete mechanistic understanding of large-scale conformational transformations underlying the activation of kinases could greatly help in the development of therapeutic drugs for the treatment of these pathologies. In principle, the nature of conformational transition could be modelled in silico via atomistic molecular dynamics simulations, although this is very challenging because of the long activation timescales. Here we employ a computational paradigm that couples transition pathway techniques and Markov state model-based massively distributed simulations for mapping the conformational landscape of c-src tyrosine kinase. The computations provide the thermodynamics and kinetics of kinase activation for the first time, and help identify key structural intermediates. Furthermore, the presence of a novel allosteric site in an intermediate state of c-src that could be potentially used for drug design is predicted.

    View details for DOI 10.1038/ncomms4397

    View details for PubMedID 24584478

  • SCISSORS: Practical Considerations JOURNAL OF CHEMICAL INFORMATION AND MODELING Kearnes, S. M., Haque, I. S., Pande, V. S. 2014; 54 (1): 5-15

    View details for DOI 10.1021/ci400264f

    View details for Web of Science ID 000330542800002

  • Understanding Protein Folding Using Markov State Models INTRODUCTION TO MARKOV STATE MODELS AND THEIR APPLICATION TO LONG TIMESCALE MOLECULAR SIMULATION Pande, V. S. 2014; 797: 101-106

    View details for DOI 10.1007/978-94-007-7606-7_8

    View details for Web of Science ID 000333763600008

    View details for PubMedID 24297278

  • Finite domain simulations with adaptive boundaries: Accurate potentials and nonequilibrium movesets JOURNAL OF CHEMICAL PHYSICS Wagoner, J. A., Pande, V. S. 2013; 139 (23)


    We extend the theory of hybrid explicit/implicit solvent models to include an explicit domain that grows and shrinks in response to a solute's evolving configuration. The goal of this model is to provide an appropriate but not excessive amount of solvent detail, and the inclusion of an adjustable boundary provides a significant computational advantage for solutes that explore a range of configurations. In addition to the theoretical development, a successful implementation of this method requires (1) an efficient moveset that propagates the boundary as a new coordinate of the system, and (2) an accurate continuum solvent model with parameters that are transferable to an explicit domain of any size. We address these challenges and develop boundary updates using Monte Carlo moves biased by nonequilibrium paths. We obtain the desired level of accuracy using a "decoupling interface" that we have previously shown to remove boundary artifacts common to hybrid solvent models. Using an uncharged, coarse-grained solvent model, we then study the efficiency of nonequilibrium paths that a simulation takes by quantifying the dissipation. In the spirit of optimization, we study this quantity over a range of simulation parameters.

    View details for DOI 10.1063/1.4848655

    View details for Web of Science ID 000329191300014

    View details for PubMedID 24359359

  • Calculations of the Electric Fields in Liquid Solutions JOURNAL OF PHYSICAL CHEMISTRY B Fried, S. D., Wang, L., Boxer, S. G., Ren, P., Pande, V. S. 2013; 117 (50): 16236-16248

    View details for DOI 10.1021/jp410720y

    View details for Web of Science ID 000328920600034

    View details for PubMedID 24304155

  • Inferring the Rate-Length Law of Protein Folding PLOS ONE Lane, T. J., Pande, V. S. 2013; 8 (12)


    We investigate the rate-length scaling law of protein folding, a key undetermined scaling law in the analytical theory of protein folding. Available data yield statistically significant evidence for the existence of a rate-length law capable of predicting folding times to within about two orders of magnitude (over 9 decades of variation). Unambiguous determination of the functional form of such a law could provide key mechanistic insight into folding. Four proposed laws from literature (power law, exponential, and two stretched exponentials) are tested against one another, and it is found that the power law best explains the data by a modest margin. We conclude that more data is necessary to unequivocally infer the rate-length law. Such data could be obtained through a small number of protein folding experiments on large protein domains.

    View details for DOI 10.1371/journal.pone.0078606

    View details for Web of Science ID 000328566100004

    View details for PubMedID 24339865
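
    To make the power-law candidate from this abstract concrete, the sketch below fits folding time against chain length on log-log axes by ordinary least squares. It is only an illustration of testing one functional form: the helper name `fit_power_law`, the synthetic chain lengths, and the exponent of 4 are all assumptions, and none of the numbers come from the paper.

```python
import numpy as np

def fit_power_law(lengths, folding_times):
    """Fit tau = a * N**b via linear least squares on log(tau) vs log(N).

    Hypothetical helper for illustration; returns the prefactor a and exponent b.
    """
    b, log_a = np.polyfit(np.log(lengths), np.log(folding_times), 1)
    return np.exp(log_a), b

# Synthetic, noise-free data generated from an assumed law tau = 1e-6 * N**4.
lengths = np.array([40.0, 60.0, 80.0, 120.0, 200.0])
folding_times = 1e-6 * lengths**4.0

a, b = fit_power_law(lengths, folding_times)
```

    Comparing this form against exponential or stretched-exponential alternatives would mean fitting each to the same data and comparing the fits with a model-selection criterion, which the abstract notes currently favors the power law only by a modest margin.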

  • Accelerated Molecular Dynamics Simulations with the AMOEBA Polarizable Force Field on Graphics Processing Units JOURNAL OF CHEMICAL THEORY AND COMPUTATION Lindert, S., Bucher, D., Eastman, P., Pande, V., McCammon, J. A. 2013; 9 (11): 4684-4691

    View details for DOI 10.1021/ct400514p

    View details for Web of Science ID 000327044500003

  • SWEETLEAD: an In Silico Database of Approved Drugs, Regulated Chemicals, and Herbal Isolates for Computer-Aided Drug Discovery PLOS ONE Novick, P. A., Ortiz, O. F., Poelman, J., Abdulhay, A. Y., Pande, V. S. 2013; 8 (11)


    In the face of drastically rising drug discovery costs, strategies promising to reduce development timelines and expenditures are being pursued. Computer-aided virtual screening and repurposing approved drugs are two such strategies that have shown recent success. Herein, we report the creation of a highly-curated in silico database of chemical structures representing approved drugs, chemical isolates from traditional medicinal herbs, and regulated chemicals, termed the SWEETLEAD database. The motivation for SWEETLEAD stems from the observance of conflicting information in publicly available chemical databases and the lack of a highly curated database of chemical structures for the globally approved drugs. A consensus building scheme surveying information from several publicly accessible databases was employed to identify the correct structure for each chemical. Resulting structures were filtered for the active pharmaceutical ingredient and standardized, and differing formulations of the same drug were combined in the final database. The publicly available release of SWEETLEAD provides an important tool to enable the successful completion of computer-aided repurposing and drug discovery campaigns.

    View details for DOI 10.1371/journal.pone.0079568

    View details for Web of Science ID 000326499300091

    View details for PubMedID 24223973

  • Probing the origins of two-state folding JOURNAL OF CHEMICAL PHYSICS Lane, T. J., Schwantes, C. R., Beauchamp, K. A., Pande, V. S. 2013; 139 (14)


    Many protein systems fold in a two-state manner. Random models, however, rarely display two-state kinetics and thus such behavior should not be accepted as a default. While theories for the prevalence of two-state kinetics have been presented, none sufficiently explain the breadth of experimental observations. A model, making minimal assumptions, is introduced that suggests two-state behavior is likely for any system with an overwhelmingly populated native state. We show two-state folding is a natural consequence of such two-state thermodynamics, and is strengthened by increasing the population of the native state. Further, the model exhibits hub-like behavior, with slow interconversions between unfolded states. Despite this, the unfolded state equilibrates quickly relative to the folding time. This apparent paradox is readily understood through this model. Finally, our results compare favorably with measurements of folding rates as a function of chain length and Keq, providing new insight into these relations.

    View details for DOI 10.1063/1.4823502

    View details for Web of Science ID 000325780800057

    View details for PubMedID 24116650

  • Inclusion of persistence length-based secondary structure in replica field theoretic models of heteropolymer freezing JOURNAL OF CHEMICAL PHYSICS Weber, J. K., Pande, V. S. 2013; 139 (12)


    The protein folding problem has long represented a "holy grail" in statistical physics due to its physical complexity and its relevance to many human diseases. While past theoretical work has yielded apt descriptions of protein folding landscapes, recent large-scale simulations have provided insights into protein folding that were impractical to obtain from early theories. In particular, the role that non-native contacts play in protein folding, and their relation to the existence of misfolded, β-sheet rich trap states on folding landscapes, has emerged as a topic of interest in the field. In this paper, we present a modified model of heteropolymer freezing that includes explicit secondary structural characteristics which allow observations of "intramolecular amyloid" states to be probed from a theoretical perspective. We introduce a variable persistence length-based energy penalty to a model Hamiltonian, and we illustrate how this modification alters the phase transitions present in the theory. We find, in particular, that inclusion of this variable persistence length increases both generic freezing and folding temperatures in the model, allowing both folding and glass transitions to occur in a more highly optimized fashion. We go on to discuss how these changes might relate to protein evolution, misfolding, and the emergence of intramolecular amyloid states.

    View details for DOI 10.1063/1.4816633

    View details for Web of Science ID 000325392000020

    View details for PubMedID 24089729

  • Systematic improvement of a classical molecular model of water. journal of physical chemistry. B Wang, L., Head-Gordon, T., Ponder, J. W., Ren, P., Chodera, J. D., Eastman, P. K., Martinez, T. J., Pande, V. S. 2013; 117 (34): 9956-9972

    View details for DOI 10.1021/jp403802c

    View details for PubMedID 23750713

  • K-Means for Parallel Architectures Using All-Prefix-Sum Sorting and Updating Steps IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS Kohlhoff, K. J., Pande, V. S., Altman, R. B. 2013; 24 (8): 1602-1612

  • Long Timestep Molecular Dynamics on the Graphical Processing Unit JOURNAL OF CHEMICAL THEORY AND COMPUTATION Sweet, J. C., Nowling, R. J., Cickovski, T., Sweet, C. R., Pande, V. S., Izaguirre, J. A. 2013; 9 (8): 3267-3281

    View details for DOI 10.1021/ct400331r

    View details for Web of Science ID 000323193500002

  • Learning Kinetic Distance Metrics for Markov State Models of Protein Conformational Dynamics JOURNAL OF CHEMICAL THEORY AND COMPUTATION McGibbon, R. T., Pande, V. S. 2013; 9 (7): 2900-2906

    View details for DOI 10.1021/ct400132h

    View details for Web of Science ID 000321793100006

  • Functional understanding of solvent structure in GroEL cavity through dipole field analysis. journal of chemical physics Weber, J. K., Pande, V. S. 2013; 138 (16): 165101


    Solvent plays a ubiquitous role in all biophysical phenomena. Yet, just how the molecular nature of water impacts processes in biology remains an important question. While one can simulate the behavior of water near biomolecules such as proteins, it is challenging to gauge the potential structural role solvent plays in mediating both kinetic and equilibrium processes. Here, we propose an analysis scheme for understanding the nature of solvent structure at a local level. We first calculate coarse-grained dipole vector fields for an explicitly solvated system simulated through molecular dynamics. We then analyze correlations between these vector fields to characterize water structure under biologically relevant conditions. In applying our method to the interior of the wild type chaperonin complex GroEL+ES, along with nine additional mutant GroEL complexes, we find that dipole field correlations are strongly related to chaperonin function.

    View details for DOI 10.1063/1.4801942

    View details for PubMedID 23635172

  • Emergence of glass-like behavior in Markov state models of protein folding dynamics. Journal of the American Chemical Society Weber, J. K., Jack, R. L., Pande, V. S. 2013; 135 (15): 5501-5504


    The extent to which glass-like kinetics govern dynamics in protein folding has been heavily debated. Here, we address the subject with an application of space-time perturbation theory to the dynamics of protein folding Markov state models. Borrowing techniques from the s-ensemble method, we argue that distinct active and inactive phases exist for protein folding dynamics, and that kinetics for specific systems can fall into either dynamical regime. We do not, however, observe a true glass transition in any system studied. We go on to discuss how these inactive and active phases might relate to general protein folding properties.

    View details for DOI 10.1021/ja4002663

    View details for PubMedID 23540906

  • Persistent Topology and Metastable State in Conformational Dynamics PLOS ONE Chang, H., Bacallado, S., Pande, V. S., Carlsson, G. E. 2013; 8 (4)

    View details for DOI 10.1371/journal.pone.0058699

    View details for Web of Science ID 000317717300006

    View details for PubMedID 23565139

  • Improvements in Markov State Model Construction Reveal Many Non-Native Interactions in the Folding of NTL9 JOURNAL OF CHEMICAL THEORY AND COMPUTATION Schwantes, C. R., Pande, V. S. 2013; 9 (4): 2000-2009

    View details for DOI 10.1021/ct300878a

    View details for Web of Science ID 000317438100015

  • MSMExplorer: visualizing Markov state models for biomolecule folding simulations BIOINFORMATICS Cronkite-Ratcliff, B., Pande, V. 2013; 29 (7): 950-952


    Markov state models (MSMs) for the study of biomolecule folding simulations have emerged as a powerful tool for computational study of folding dynamics. MSMExplorer is a visualization application purpose-built to visualize these MSMs with an aim to increase the efficacy and reach of MSM science. MSMExplorer is available for download, and the source code is made available under the GNU Lesser General Public License.

    View details for DOI 10.1093/bioinformatics/btt051

    View details for Web of Science ID 000316695700018

    View details for PubMedID 23365411

  • Derivation and assessment of phase-shifted, disordered vector field models for frustrated solvent interactions JOURNAL OF CHEMICAL PHYSICS Weber, J. K., Pande, V. S. 2013; 138 (8)


    The structure and properties of water at biological interfaces differ drastically from bulk due to effects including confinement and the presence of complicated charge distributions. This non-bulk-like behavior generally arises from water frustration, wherein all favorable interactions among water molecules cannot be simultaneously satisfied. While the frustration of interfacial water is ubiquitous in the cell, the role this frustration plays in mediating biophysical processes like protein folding is not well understood. To investigate the impact of frustration at interfaces, we here derive a general field theoretic model for the interaction of bulk and disordered vector fields at an embedded surface. We calculate thermodynamic and correlation functions for the model in two and three dimensions, and we compare our results to Monte Carlo simulations of lattice system analogs. In our analysis, we see that field-field cross correlations near the interface in the model give rise to a loss in entropy like that seen in glassy systems. We conclude by assessing our theory's utility as a coarse-grained model for water at polar biological interfaces.

    View details for DOI 10.1063/1.4792638

    View details for Web of Science ID 000315667800048

    View details for PubMedID 23464179

  • To milliseconds and beyond: challenges in the simulation of protein folding CURRENT OPINION IN STRUCTURAL BIOLOGY Lane, T. J., Shukla, D., Beauchamp, K. A., Pande, V. S. 2013; 23 (1): 58-65


    Quantitatively accurate all-atom molecular dynamics (MD) simulations of protein folding have long been considered a holy grail of computational biology. Due to the large system sizes and long timescales involved, such a pursuit was for many years computationally intractable. Further, sufficiently accurate forcefields needed to be developed in order to realistically model folding. This decade, however, saw the first reports of folding simulations describing kinetics on the order of milliseconds, placing many proteins firmly within reach of these methods. Progress in sampling and forcefield accuracy, however, presents a new challenge: how to turn huge MD datasets into scientific understanding. Here, we review recent progress in MD simulation techniques and show how the vast datasets generated by such techniques present new challenges for analysis. We critically discuss the state of the art, including reaction coordinate and Markov state model (MSM) methods, and provide a perspective for the future.

    View details for DOI 10.1016/

    View details for Web of Science ID 000315832700008

    View details for PubMedID 23237705

  • Building Markov state models with solvent dynamics 11th Asia Pacific Bioinformatics Conference (APBC) Gu, C., Chang, H., Maibaum, L., Pande, V. S., Carlsson, G. E., Guibas, L. J. BIOMED CENTRAL LTD. 2013


    Markov state models have been widely used to study conformational changes of biological macromolecules. These models are built from short timescale simulations and then propagated to extract long timescale dynamics. However, the solvent information in molecular simulations is often ignored in current methods, because of the large number of solvent molecules in a system and the indistinguishability of solvent molecules upon their exchange. We present a solvent signature that compactly summarizes the solvent distribution in the high-dimensional data, and then define a distance metric between different configurations using this signature. We next incorporate the solvent information into the construction of Markov state models and present a fast geometric clustering algorithm which combines both the solute-based and solvent-based distances. We have tested our method on several different molecular dynamical systems, including alanine dipeptide, carbon nanotube, and benzene rings. With the new solvent-based signatures, we are able to identify different solvent distributions near the solute. Furthermore, when the solute has a concave shape, we can also capture the water number inside the solute structure. Finally, we have compared the performances of different Markov state models. The experimental results show that our approach improves on the existing methods in both computational running time and metastability. In this paper we have initiated a study to build Markov state models for molecular dynamical systems with solvent degrees of freedom. The methods we described should also be broadly applicable to a wide range of biomolecular simulation analyses.

    View details for DOI 10.1186/1471-2105-14-S2-S8

    View details for Web of Science ID 000314468200008

    View details for PubMedID 23368418

  • Molecular dynamics simulations for the ranking, evaluation, and refinement of computationally designed proteins. Methods in enzymology Kiss, G., Pande, V. S., Houk, K. N. 2013; 523: 145-170


    Computational methods have been developed to redesign proteins so that they can perform novel functions such as the catalysis of nonnatural reactions. Active sites are constructed from the inside out by stochastically exploring mutations that favor the binding of transition states, small molecule binders, and protein surfaces, depending on the task at hand. The approach allows the use of many proteins for engineering scaffolds upon which to erect the necessary functionality. Beyond being of practical value for producing proteins with new applications, the approach tests our understanding of protein chemistry. The current success rate, however, is rather modest, and so the designers have become good only at making catalysts with low catalytic efficiencies. Directed evolution can be used to enhance function and stability, while more advanced computational techniques and physics-based simulations are useful at elucidating structural flaws and at guiding the design process. Here, we summarize work that focuses on the dynamic properties of computationally designed enzymes and their directed evolution variants. We utilized in silico methods to address three questions: (1) What are the shortcomings of these designs? (2) Can they be improved? (3) Can we screen out designs that are likely to be inactive?

    View details for DOI 10.1016/B978-0-12-394292-0.00007-2

    View details for PubMedID 23422429

  • Improvements in Markov State Model Construction Reveal Many Non-Native Interactions in the Folding of NTL9. Journal of chemical theory and computation Schwantes, C. R., Pande, V. S. 2013; 9 (4): 2000-2009


    Markov State Models (MSMs) provide an automated framework to investigate the dynamical properties of high-dimensional molecular simulations. These models can provide a human-comprehensible picture of the underlying process, and have been successfully used to study protein folding, protein aggregation, protein ligand binding, and other biophysical systems. The MSM requires the construction of a discrete state-space such that two points are in the same state if they can interconvert rapidly. In the following, we suggest an improved method, which utilizes second order Independent Components Analysis (also known as time-structure based Independent Components Analysis, or tICA), to construct the state-space. We apply this method to simulations of NTL9 (provided by Lindorff-Larsen et al. Science 2011), and show that the MSM is an improvement over previously built models using conventional distance metrics. Additionally, the resulting model provides insight into the role of non-native contacts by revealing many slow timescales associated with compact, non-native states.

    View details for DOI 10.1021/ct300878a

    View details for PubMedID 23750122

  • Persistent topology and metastable state in conformational dynamics. PloS one Chang, H., Bacallado, S., Pande, V. S., Carlsson, G. E. 2013; 8 (4)


    The large amount of molecular dynamics simulation data produced by modern computational models brings big opportunities and challenges to researchers. Clustering algorithms play an important role in understanding biomolecular kinetics from the simulation data, especially under the Markov state model framework. However, the ruggedness of the free energy landscape in a biomolecular system makes common clustering algorithms very sensitive to perturbations of the data. Here, we introduce a data-exploratory tool which provides an overview of the clustering structure under different parameters. The proposed Multi-Persistent Clustering analysis combines insights from recent studies on the dynamics of systems with dominant metastable states with the concept of multi-dimensional persistence in computational topology. We propose to explore the clustering structure of the data based on its persistence on scale and density. The analysis provides a systematic way to discover clusters that are robust to perturbations of the data. The dominant states of the system can be chosen with confidence. For the clusters on the borderline, the user can choose to do more simulation or make a decision based on their structural characteristics. Furthermore, our multi-resolution analysis gives users information about the relative potential of the clusters and their hierarchical relationship. The effectiveness of the proposed method is illustrated in three biomolecules: alanine dipeptide, Villin headpiece, and the FiP35 WW domain.

    View details for DOI 10.1371/journal.pone.0058699

    View details for PubMedID 23565139

  • OpenMM 4: A Reusable, Extensible, Hardware Independent Library for High Performance Molecular Simulation JOURNAL OF CHEMICAL THEORY AND COMPUTATION Eastman, P., Friedrichs, M. S., Chodera, J. D., Radmer, R. J., Bruns, C. M., Ku, J. P., Beauchamp, K. A., Lane, T. J., Wang, L., Shukla, D., Tye, T., Houston, M., Stich, T., Klein, C., Shirts, M. R., Pande, V. S. 2013; 9 (1): 461-469


    OpenMM is a software toolkit for performing molecular simulations on a range of high performance computing architectures. It is based on a layered architecture: the lower layers function as a reusable library that can be invoked by any application, while the upper layers form a complete environment for running molecular simulations. The library API hides all hardware-specific dependencies and optimizations from the users and developers of simulation programs: they can be run without modification on any hardware on which the API has been implemented. The current implementations of OpenMM include support for graphics processing units using the OpenCL and CUDA frameworks. In addition, OpenMM was designed to be extensible, so new hardware architectures can be accommodated and new functionality (e.g., energy terms and integrators) can be easily added.

    View details for DOI 10.1021/ct300857j

    View details for Web of Science ID 000313378700049

  • Effects of Familial Mutations on the Monomer Structure of Aβ42 BIOPHYSICAL JOURNAL Lin, Y., Pande, V. S. 2012; 103 (12): L47-L49


    Amyloid beta (Aβ) peptide plays an important role in Alzheimer's disease. A number of mutations in the Aβ sequence lead to familial Alzheimer's disease, congophilic amyloid angiopathy, or hereditary cerebral hemorrhage with amyloid. Using molecular dynamics simulations of ∼200 μs for each system, we characterize and contrast the consequences of four pathogenic mutations (Italian, Dutch, Arctic, and Iowa) for the structural ensemble of the Aβ monomer. The four familial mutations are found to have distinct consequences for the monomer structure.

    View details for DOI 10.1016/j.bpj.2012.11.009

    View details for Web of Science ID 000312527500001

    View details for PubMedID 23260058

  • Eigenvalues of the homogeneous finite linear one step master equation: Applications to downhill folding JOURNAL OF CHEMICAL PHYSICS Lane, T. J., Pande, V. S. 2012; 137 (21)


    Motivated by the observed time scales in protein systems said to fold "downhill," we have studied the finite, linear master equation, with uniform rates forward and backward as a model of the downhill process. By solving for the system eigenvalues, we prove the claim that in situations where there is no free energy barrier a transition between single- and multi-exponential kinetics occurs at sufficient bias (towards the native state). Consequences for protein folding, especially the downhill folding scenario, are briefly discussed.

    View details for DOI 10.1063/1.4769295

    View details for Web of Science ID 000312252900055

    View details for PubMedID 23231265

  • Reducing the effect of Metropolization on mixing times in molecular dynamics simulations JOURNAL OF CHEMICAL PHYSICS Wagoner, J. A., Pande, V. S. 2012; 137 (21)


    Molecular dynamics algorithms are subject to some amount of error dependent on the size of the time step that is used. This error can be corrected by periodically updating the system with a Metropolis criterion, where the integration step is treated as a selection probability for candidate state generation. Such a method, closely related to generalized hybrid Monte Carlo (GHMC), satisfies the balance condition by imposing a reversal of momenta upon candidate rejection. In the present study, we demonstrate that such momentum reversals can have a significant impact on molecular kinetics and extend the time required for system decorrelation, resulting in an order of magnitude increase in the integrated autocorrelation times of molecular variables for the worst cases. We present a simple method, referred to as reduced-flipping GHMC, that uses the information of the previous, current, and candidate states to reduce the probability of momentum flipping following candidate rejection while rigorously satisfying the balance condition. This method is a simple modification to traditional, automatic-flipping, GHMC methods and significantly mitigates the impact of such algorithms on molecular kinetics and simulation mixing times.

    View details for DOI 10.1063/1.4769301

    View details for Web of Science ID 000312252900005

    View details for PubMedID 23231215

  • Mechanistic and structural insight into the functional dichotomy between IL-2 and IL-15. Nature immunology Ring, A. M., Lin, J., Feng, D., Mitra, S., Rickert, M., Bowman, G. R., Pande, V. S., Li, P., Moraga, I., Spolski, R., Ozkan, E., Leonard, W. J., Garcia, K. C. 2012; 13 (12): 1187-1195


    Interleukin 15 (IL-15) and IL-2 have distinct immunological functions even though both signal through the receptor subunit IL-2Rβ and the common γ-chain (γc). Here we found that in the structure of the IL-15-IL-15Rα-IL-2Rβ-γc quaternary complex, IL-15 binds to IL-2Rβ and γc in a heterodimer nearly indistinguishable from that of the IL-2-IL-2Rα-IL-2Rβ-γc complex, despite their different receptor-binding chemistries. IL-15Rα substantially increased the affinity of IL-15 for IL-2Rβ, and this allostery was required for IL-15 trans signaling. Consistent with their identical IL-2Rβ-γc dimer geometries, IL-2 and IL-15 showed similar signaling properties in lymphocytes, with any differences resulting from disparate receptor affinities. Thus, IL-15 and IL-2 induced similar signals, and the cytokine specificity of IL-2Rα versus IL-15Rα determined cellular responsiveness. Our results provide new insights for the development of specific immunotherapeutics based on IL-15 or IL-2.

    View details for DOI 10.1038/ni.2449

    View details for PubMedID 23104097

    View details for PubMedCentralID PMC3501574

  • Mechanistic and structural insight into the functional dichotomy between IL-2 and IL-15 NATURE IMMUNOLOGY Ring, A. M., Lin, J., Feng, D., Mitra, S., Rickert, M., Bowman, G. R., Pande, V. S., Li, P., Moraga, I., Spolski, R., Oezkan, E., Leonard, W. J., Garcia, K. C. 2012; 13 (12): 1187-1195

    View details for DOI 10.1038/ni.2449

    View details for Web of Science ID 000311217900012

  • Marked difference in saxitoxin and tetrodotoxin affinity for the human nociceptive voltage-gated sodium channel (Nav1.7) [corrected]. Proceedings of the National Academy of Sciences of the United States of America Walker, J. R., Novick, P. A., Parsons, W. H., McGregor, M., Zablocki, J., Pande, V. S., Du Bois, J. 2012; 109 (44): 18102-18107


    Human nociceptive voltage-gated sodium channel (Nav1.7), a target of significant interest for the development of antinociceptive agents, is blocked by low nanomolar concentrations of (-)-tetrodotoxin (TTX) but not (+)-saxitoxin (STX) and (+)-gonyautoxin-III (GTX-III). These findings question the long-accepted view that the 1.7 isoform is both tetrodotoxin- and saxitoxin-sensitive and identify the outer pore region of the channel as a possible target for the design of Nav1.7-selective inhibitors. Single- and double-point amino acid mutagenesis studies along with whole-cell electrophysiology recordings establish two domain III residues (T1398 and I1399), which occur as methionine and aspartate in other Nav isoforms, as critical determinants of STX and gonyautoxin-III binding affinity. An advanced homology model of the Nav pore region is used to provide a structural rationalization for these surprising results.

    View details for DOI 10.1073/pnas.1206952109

    View details for PubMedID 23077250

  • Simple few-state models reveal hidden complexity in protein folding PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Beauchamp, K. A., McGibbon, R., Lin, Y., Pande, V. S. 2012; 109 (44): 17807-17813


    Markov state models constructed from molecular dynamics simulations have recently shown success at modeling protein folding kinetics. Here we introduce two methods, flux PCCA+ (FPCCA+) and sliding constraint rate estimation (SCRE), that allow accurate rate models from protein folding simulations. We apply these techniques to fourteen massive simulation datasets generated by Anton and Folding@home. Our protocol quantitatively identifies the suitability of describing each system using two-state kinetics and predicts experimentally detectable deviations from two-state behavior. An analysis of the villin headpiece and FiP35 WW domain detects multiple native substates that are consistent with experimental data. Applying the same protocol to GTT, NTL9, and protein G suggests that some beta containing proteins can form long-lived native-like states with small register shifts. Even the simplest protein systems show folding and functional dynamics involving three or more states.

    View details for DOI 10.1073/pnas.1201810109

    View details for Web of Science ID 000311149900034

    View details for PubMedID 22778442

  • Slow Unfolded-State Structuring in Acyl-CoA Binding Protein Folding Revealed by Simulation and Experiment JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Voelz, V. A., Jaeger, M., Yao, S., Chen, Y., Zhu, L., Waldauer, S. A., Bowman, G. R., Friedrichs, M., Bakajin, O., Lapidus, L. J., Weiss, S., Pande, V. S. 2012; 134 (30): 12565-12577


    Protein folding is a fundamental process in biology, key to understanding many human diseases. Experimentally, proteins often appear to fold via simple two- or three-state mechanisms involving mainly native-state interactions, yet recent network models built from atomistic simulations of small proteins suggest the existence of many possible metastable states and folding pathways. We reconcile these two pictures in a combined experimental and simulation study of acyl-coenzyme A binding protein (ACBP), a two-state folder (folding time ~10 ms) exhibiting residual unfolded-state structure, and a putative early folding intermediate. Using single-molecule FRET in conjunction with side-chain mutagenesis, we first demonstrate that the denatured state of ACBP at near-zero denaturant is unusually compact and enriched in long-range structure that can be perturbed by discrete hydrophobic core mutations. We then employ ultrafast laminar-flow mixing experiments to study the folding kinetics of ACBP on the microsecond time scale. These studies, along with Trp-Cys quenching measurements of unfolded-state dynamics, suggest that unfolded-state structure forms on a surprisingly slow (~100 μs) time scale, and that sequence mutations strikingly perturb both time-resolved and equilibrium smFRET measurements in a similar way. A Markov state model (MSM) of the ACBP folding reaction, constructed from over 30 ms of molecular dynamics trajectory data, predicts a complex network of metastable states, residual unfolded-state structure, and kinetics consistent with experiment but no well-defined intermediate preceding the main folding barrier. Taken together, these experimental and simulation results suggest that the previously characterized fast kinetic phase is not due to formation of a barrier-limited intermediate but rather to a more heterogeneous and slow acquisition of unfolded-state structure.

    View details for DOI 10.1021/ja302528z

    View details for Web of Science ID 000306942600050

    View details for PubMedID 22747188

  • Precursor Directed Biosynthesis of an Orthogonally Functional Erythromycin Analogue: Selectivity in the Ribosome Macrolide Binding Pocket JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Harvey, C. J., Puglisi, J. D., Pande, V. S., Cane, D. E., Khosla, C. 2012; 134 (29): 12259-12265


    The macrolide antibiotic erythromycin A and its semisynthetic analogues have been among the most useful antibacterial agents for the treatment of infectious diseases. Using a recently developed chemical genetic strategy for precursor-directed biosynthesis and colony bioassay of 6-deoxyerythromycin D analogues, we identified a new class of alkynyl- and alkenyl-substituted macrolides with activities comparable to that of the natural product. Further analysis revealed a marked and unexpected dependence of antibiotic activity on the size and degree of unsaturation of the precursor. Based on these leads, we also report the precursor-directed biosynthesis of 15-propargyl erythromycin A, a novel antibiotic that not only is as potent as erythromycin A with respect to its ability to inhibit bacterial growth and cell-free ribosomal protein biosynthesis but also harbors an orthogonal functional group that is capable of facile chemical modification.

    View details for DOI 10.1021/ja304682q

    View details for Web of Science ID 000306724500075

    View details for PubMedID 22741553

  • A Simple Model Predicts Experimental Folding Rates and a Hub-Like Topology JOURNAL OF PHYSICAL CHEMISTRY B Lane, T. J., Pande, V. S. 2012; 116 (23): 6764-6774


    A simple model is presented that describes general features of protein folding, in good agreement with experimental results and detailed all-atom simulations. Starting from microscopic physics, and with no free parameters, this model predicts that protein folding occurs remarkably quickly because native-like states are kinetic hubs. A hub-like network arises naturally out of microscopic physical concerns, specifically the kinetic longevity of native contacts during a search of globular conformations. The model predicts folding times scaling as τ_f ~ exp(ξN) in the number of residues N, but because the model shows ξ is small, the folding times are much faster than Levinthal's approximation. Importantly, the folding time scale is found to be small due to the topology and structure of the network. We show explicitly how our model agrees with generic experimental features of the folding process, including the scaling of τ_f with N, two-state thermodynamics, a sharp peak in C_V, and native-state fluctuations.

    View details for DOI 10.1021/jp212332c

    View details for Web of Science ID 000305356100020

    View details for PubMedID 22452581

  • Exploiting a natural conformational switch to engineer an interleukin-2 'superkine' NATURE Levin, A. M., Bates, D. L., Ring, A. M., Krieg, C., Lin, J. T., Su, L., Moraga, I., Raeber, M. E., Bowman, G. R., Novick, P., Pande, V. S., Fathman, C. G., Boyman, O., Garcia, K. C. 2012; 484 (7395): 529-U159


    The immunostimulatory cytokine interleukin-2 (IL-2) is a growth factor for a wide range of leukocytes, including T cells and natural killer (NK) cells. Considerable effort has been invested in using IL-2 as a therapeutic agent for a variety of immune disorders ranging from AIDS to cancer. However, adverse effects have limited its use in the clinic. On activated T cells, IL-2 signals through a quaternary 'high affinity' receptor complex consisting of IL-2, IL-2Rα (termed CD25), IL-2Rβ and IL-2Rγ. Naive T cells express only a low density of IL-2Rβ and IL-2Rγ, and are therefore relatively insensitive to IL-2, but acquire sensitivity after CD25 expression, which captures the cytokine and presents it to IL-2Rβ and IL-2Rγ. Here, using in vitro evolution, we eliminated the functional requirement of IL-2 for CD25 expression by engineering an IL-2 'superkine' (also called super-2) with increased binding affinity for IL-2Rβ. Crystal structures of the IL-2 superkine in free and receptor-bound forms showed that the evolved mutations are principally in the core of the cytokine, and molecular dynamics simulations indicated that the evolved mutations stabilized IL-2, reducing the flexibility of a helix in the IL-2Rβ binding site, into an optimized receptor-binding conformation resembling that when bound to CD25. The evolved mutations in the IL-2 superkine recapitulated the functional role of CD25 by eliciting potent phosphorylation of STAT5 and vigorous proliferation of T cells irrespective of CD25 expression. Compared to IL-2, the IL-2 superkine induced superior expansion of cytotoxic T cells, leading to improved antitumour responses in vivo, and elicited proportionally less expansion of T regulatory cells and reduced pulmonary oedema. Collectively, we show that in vitro evolution has mimicked the functional role of CD25 in enhancing IL-2 potency and regulating target cell specificity, which has implications for immunotherapy.

    View details for DOI 10.1038/nature10975

    View details for Web of Science ID 000303200400054

    View details for PubMedID 22446627

    View details for PubMedCentralID PMC3338870

  • Design of beta-Amyloid Aggregation Inhibitors from a Predicted Structural Motif JOURNAL OF MEDICINAL CHEMISTRY Novick, P. A., Lopes, D. H., Branson, K. M., Esteras-Chopo, A., Graef, I. A., Bitan, G., Pande, V. S. 2012; 55 (7): 3002-3010


    Drug design studies targeting one of the primary toxic agents in Alzheimer's disease, soluble oligomers of amyloid β-protein (Aβ), have been complicated by the rapid, heterogeneous aggregation of Aβ and the resulting difficulty of structurally characterizing the peptide. To address this, we have developed [Nle35, D-Pro37]Aβ42, a substituted peptide inspired by molecular dynamics simulations which forms structures stable enough to be analyzed by NMR. We report herein that [Nle35, D-Pro37]Aβ42 stabilizes the trimer and prevents mature fibril and β-sheet formation. Further, [Nle35, D-Pro37]Aβ42 interacts with WT Aβ42 and reduces aggregation levels and fibril formation in mixtures. Using ligand-based drug design based on [Nle35, D-Pro37]Aβ42, a lead compound was identified with effects on inhibition similar to the peptide. The ability of [Nle35, D-Pro37]Aβ42 and the compound to inhibit the aggregation of Aβ42 provides a novel tool to study the structure of Aβ oligomers. More broadly, our data demonstrate how molecular dynamics simulation can guide experiment for further research into AD.

    View details for DOI 10.1021/jm201332p

    View details for Web of Science ID 000302591100010

    View details for PubMedID 22420626

  • Are Protein Force Fields Getting Better? A Systematic Benchmark on 524 Diverse NMR Measurements JOURNAL OF CHEMICAL THEORY AND COMPUTATION Beauchamp, K. A., Lin, Y., Das, R., Pande, V. S. 2012; 8 (4): 1409-1414


    Recent hardware and software advances have enabled simulation studies of protein systems on biophysically-relevant timescales, often revealing the need for improved force fields. Although early force field development was limited by the lack of direct comparisons between simulation and experiment, recent work from several labs has demonstrated direct calculation of NMR observables from protein simulations. Here we quantitatively evaluate recent molecular dynamics force fields against a suite of 524 chemical shift and J coupling (³J(HN,Hα), ³J(HN,Cβ), ³J(Hα,C'), ³J(HN,C'), and ³J(Hα,N)) measurements on dipeptides, tripeptides, tetra-alanine, and ubiquitin. Of the force fields examined (ff96, ff99, ff03, ff03*, ff03w, ff99sb*, ff99sb-ildn, ff99sb-ildn-phi, ff99sb-ildn-nmr, CHARMM27, OPLS-AA), two force fields (ff99sb-ildn-phi, ff99sb-ildn-nmr) combining recent side chain and backbone torsion modifications achieve high accuracy in our benchmark. For the two optimal force fields, the calculation error is comparable to the uncertainty in the experimental comparison. This observation suggests that extracting additional force field improvements from NMR data may require increased accuracy in J coupling and chemical shift prediction. To further investigate the limitations of current force fields, we also consider conformational populations of dipeptides, which were recently estimated using vibrational spectroscopy.

    View details for DOI 10.1021/ct2007814

    View details for Web of Science ID 000302487700026

  • Simbios: an NIH national center for physics-based simulation of biological structures JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION Delp, S. L., Ku, J. P., Pande, V. S., Sherman, M. A., Altman, R. B. 2012; 19 (2): 186-189


    Physics-based simulation provides a powerful framework for understanding biological form and function. Simulations can be used by biologists to study macromolecular assemblies and by clinicians to design treatments for diseases. Simulations help biomedical researchers understand the physical constraints on biological systems as they engineer novel drugs, synthetic tissues, medical devices, and surgical interventions. Although individual biomedical investigators make outstanding contributions to physics-based simulation, the field has been fragmented. Applications are typically limited to a single physical scale, and individual investigators usually must create their own software. These conditions created a major barrier to advancing simulation capabilities. In 2004, we established a National Center for Physics-Based Simulation of Biological Structures (Simbios) to help integrate the field and accelerate biomedical research. In 6 years, Simbios has become a vibrant national center, with collaborators in 16 states and eight countries. Simbios focuses on problems at both the molecular scale and the organismal level, with a long-term goal of uniting these in accurate multiscale simulations.

    View details for DOI 10.1136/amiajnl-2011-000488

    View details for Web of Science ID 000300768100009

    View details for PubMedID 22081222

    View details for PubMedCentralID PMC3277621

  • Protein Folding Is Mechanistically Robust BIOPHYSICAL JOURNAL Weber, J. K., Pande, V. S. 2012; 102 (4): 859-867


    Markov state models (MSMs) have proven to be useful tools in simulating large and slowly-relaxing biological systems like proteins. MSMs model proteins through dynamics on a discrete-state energy landscape, allowing molecules to effectively sample large regions of phase space. In this work, we use aspects of MSMs to ask: is protein folding mechanistically robust? We first provide a definition of mechanism in the context of Markovian models, and we later use perturbation theory and the concept of parametric sloppiness to show that parts of the MSM eigenspectrum are resistant to perturbation. We introduce a new, to our knowledge, Bayesian metric by which eigenspectrum robustness can be evaluated, and we discuss the implications of mechanistic robustness and possible new applications of MSMs to understanding biophysical phenomena.

    View details for DOI 10.1016/j.bpj.2012.01.028

    View details for Web of Science ID 000300921600017

    View details for PubMedID 22385857

  • Calculation of rate spectra from noisy time series data PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS Voelz, V. A., Pande, V. S. 2012; 80 (2): 342-351


    As the resolution of experiments to measure folding kinetics continues to improve, it has become imperative to avoid bias that may come with fitting data to a predetermined mechanistic model. Toward this end, we present a rate spectrum approach to analyze timescales present in kinetic data. Computing rate spectra of noisy time series data via numerical discrete inverse Laplace transform is an ill-conditioned inverse problem, so a regularization procedure must be used to perform the calculation. Here, we show the results of different regularization procedures applied to noisy multiexponential and stretched exponential time series, as well as data from time-resolved folding kinetics experiments. In each case, the rate spectrum method recapitulates the relevant distribution of timescales present in the data, with different priors on the rate amplitudes naturally corresponding to common biases toward simple phenomenological models. These results suggest an attractive alternative to the "Occam's razor" philosophy of simply choosing models with the fewest number of relaxation rates.

    View details for DOI 10.1002/prot.23171

    View details for Web of Science ID 000298955600002

    View details for PubMedID 22095854

  • Sequence Coevolution between RNA and Protein Characterized by Mutual Information between Residue Triplets PLOS ONE Brandman, R., Brandman, Y., Pande, V. S. 2012; 7 (1)


    Coevolving residues in a multiple sequence alignment provide evolutionary clues of biophysical interactions in 3D structure. Despite a rich literature describing amino acid coevolution within or between proteins and nucleic acid coevolution within RNA, to date there has been no direct evidence of coevolution between protein and RNA. The ribosome, a structurally conserved macromolecular machine composed of over 50 interacting protein and RNA chains, provides a natural example of RNA/protein interactions that likely coevolved. We provide the first direct evidence of RNA/protein coevolution by characterizing the mutual information in residue triplets from a multiple sequence alignment of ribosomal protein L22 and neighboring 23S RNA. We define residue triplets as three positions in the multiple sequence alignment, where one position is from the 23S RNA and two positions are from the L22 protein. We show that residue triplets with high mutual information are more likely than residue doublets to be proximal in 3D space. Some high mutual information residue triplets cluster in a connected series across the L22 protein structure, similar to patterns seen in protein coevolution. We also describe RNA nucleotides for which switching from one nucleotide to another (or between purines and pyrimidines) results in a change in amino acid distribution for proximal amino acid positions. Multiple crystal structures for evolutionarily distinct ribosome species can provide structural evidence for these differences. For one residue triplet, a pyrimidine in one species is a purine in another, and RNA/protein hydrogen bonds are present in one species but not the other. The results provide the first direct evidence of RNA/protein coevolution by using higher order mutual information, suggesting that biophysical constraints on interacting RNA and protein chains are indeed a driving force in their evolution.

    View details for DOI 10.1371/journal.pone.0030022

    View details for Web of Science ID 000299771900038

    View details for PubMedID 22279560
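
    The triplet mutual information described in the abstract above can be illustrated with a minimal sketch. It assumes a plain plug-in estimate from column frequencies, with no finite-sample or phylogenetic correction; the function names and the toy columns are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in mutual information (in bits) between two equal-length symbol sequences."""
    if len(xs) != len(ys):
        raise ValueError("columns must contain the same number of sequences")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written in terms of raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


def triplet_mutual_information(rna_col, protein_col_1, protein_col_2):
    """MI between one RNA column and the joint state of two protein columns."""
    joint_protein = list(zip(protein_col_1, protein_col_2))
    return mutual_information(rna_col, joint_protein)


# Toy alignment columns (one character per aligned species), purely illustrative.
rna = "AAGG"
prot_1 = "KKRR"
prot_2 = "DDDD"
triplet_mi = triplet_mutual_information(rna, prot_1, prot_2)
```

    Treating the two protein positions as a single joint symbol is what makes this a "triplet" quantity: the RNA column is compared against pairs of residues rather than against one residue at a time.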

  • Investigating How Peptide Length and a Pathogenic Mutation Modify the Structural Ensemble of Amyloid Beta Monomer BIOPHYSICAL JOURNAL Lin, Y., Bowman, G. R., Beauchamp, K. A., Pande, V. S. 2012; 102 (2): 315-324


    The aggregation of amyloid beta (Aβ) peptides plays an important role in the development of Alzheimer's disease. Despite extensive effort, it has been difficult to characterize the secondary and tertiary structure of the Aβ monomer, the starting point for aggregation, due to its hydrophobicity and high aggregation propensity. Here, we employ extensive molecular dynamics simulations with atomistic protein and water models to determine structural ensembles for Aβ(42), Aβ(40), and Aβ(42)-E22K (the Italian mutant) monomers in solution. Sampling of a total of >700 microseconds in all-atom detail with explicit solvent enables us to observe the effects of peptide length and a pathogenic mutation on the disordered Aβ monomer structural ensemble. Aβ(42) and Aβ(40) have crudely similar characteristics but reducing the peptide length from 42 to 40 residues reduces β-hairpin formation near the C-terminus. The pathogenic Italian E22K mutation induces helix formation in the region of residues 20-24. This structural alteration may increase helix-helix interactions between monomers, resulting in altered mechanism and kinetics of Aβ oligomerization.

    View details for DOI 10.1016/j.bpj.2011.12.002

    View details for Web of Science ID 000299244100017

    View details for PubMedID 22339868

  • A-Site Residues Move Independently from P-Site Residues in all-Atom Molecular Dynamics Simulations of the 70S Bacterial Ribosome PLOS ONE Brandman, R., Brandman, Y., Pande, V. S. 2012; 7 (1)


    The ribosome is a large macromolecular machine, and correlated motion between residues is necessary for coordinating function across multiple protein and RNA chains. We ran two all-atom, explicit solvent molecular dynamics simulations of the bacterial ribosome and calculated correlated motion between residue pairs by using mutual information. Because of the short timescales of our simulation (ns), we expect that dynamics are largely local fluctuations around the crystal structure. We hypothesize that residues that show coupled dynamics are functionally related, even on longer timescales. We validate our model by showing that crystallographic B-factors correlate well with the entropy calculated as part of our mutual information calculations. We reveal that A-site residues move relatively independently from P-site residues, effectively insulating A-site functions from P-site functions during translation.

    View details for DOI 10.1371/journal.pone.0029377

    View details for Web of Science ID 000301123400051

    View details for PubMedID 22235290

  • Markov State Model Reveals Folding and Functional Dynamics in Ultra-Long MD Trajectories JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Lane, T. J., Bowman, G. R., Beauchamp, K., Voelz, V. A., Pande, V. S. 2011; 133 (45): 18413-18419


    Two strategies have been recently employed to push molecular simulation to long, biologically relevant time scales: projection-based analysis of results from specialized hardware producing a small number of ultralong trajectories and the statistical interpretation of massive parallel sampling performed with Markov state models (MSMs). Here, we assess the MSM as an analysis method by constructing a Markov model from ultralong trajectories, specifically two previously reported 100 μs trajectories of the FiP35 WW domain (Shaw, D. E. Science 2010, 330, 341-346). We find that the MSM approach yields novel insights. It discovers new statistically significant folding pathways, in which either beta-hairpin of the WW domain can form first. The rates of this process approach experimental values in a direct quantitative comparison (time scales of 5.0 μs and 100 ns), within a factor of ∼2. Finally, the hub-like topology of the MSM and identification of a holo conformation predicts how WW domains may function through a conformational selection mechanism.

    View details for DOI 10.1021/ja207470h

    View details for Web of Science ID 000297381200067

    View details for PubMedID 21988563

  • Characterization and Rapid Sampling of Protein Folding Markov State Model Topologies JOURNAL OF CHEMICAL THEORY AND COMPUTATION Weber, J. K., Pande, V. S. 2011; 7 (10): 3405-3411


    Markov state models (MSMs) have proven themselves to be effective statistical and quantitative models for understanding protein folding dynamics. As stochastic networks, MSMs allow for descriptions of parallel folding pathways and facilitate quantitative comparison to experiments conducted at the ensemble level. While this complex network structure is advantageous in many respects, a simple topological description of these graphs is elusive. In this paper, we compare a series of protein folding MSMs to the topology of the Cayley tree, a graph structure on which dynamics are intuitive. We go on to introduce and test new sampling schemes that have potential to improve automated model construction, a critical step toward making Markov state modeling more accessible to general users.

    View details for DOI 10.1021/ct2004484

    View details for Web of Science ID 000295655000037

  • MSMBuilder2: Modeling Conformational Dynamics on the Picosecond to Millisecond Scale JOURNAL OF CHEMICAL THEORY AND COMPUTATION Beauchamp, K. A., Bowman, G. R., Lane, T. J., Maibaum, L., Haque, I. S., Pande, V. S. 2011; 7 (10): 3412-3419


    Markov State Models provide a framework for understanding the fundamental states and rates in the conformational dynamics of biomolecules. We describe an improved protocol for constructing Markov State Models from molecular dynamics simulations. The new protocol includes advances in clustering, data preparation, and model estimation; these improvements lead to significant increases in model accuracy, as assessed by the ability to recapitulate equilibrium and kinetic properties of reference systems. A high-performance implementation of this protocol, provided in MSMBuilder2, is validated on dynamics ranging from picoseconds to milliseconds.

    View details for DOI 10.1021/ct200463m

    View details for Web of Science ID 000295655000038
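
    The model-estimation step mentioned in the abstract above can be sketched in its simplest form: count transitions between discrete states at a fixed lag time and row-normalize the counts. This is a generic illustration under stated assumptions, not the MSMBuilder2 API; the function names are hypothetical, the trajectories are assumed to be already clustered into integer state labels, and no reversibility constraint or prior is applied.

```python
def count_transitions(trajectories, n_states, lag=1):
    """Count i -> j transitions separated by `lag` frames in each state trajectory."""
    if lag < 1:
        raise ValueError("lag must be at least 1 frame")
    counts = [[0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        # Pair every frame with the frame `lag` steps later in the same trajectory.
        for i, j in zip(traj[:-lag], traj[lag:]):
            counts[i][j] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize a count matrix into transition probabilities."""
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # State never observed as a starting point: leave its row as zeros.
            matrix.append([0.0] * len(row))
        else:
            matrix.append([c / total for c in row])
    return matrix


# Two short, already-discretized toy trajectories over two states.
trajs = [[0, 0, 1, 1, 0], [1, 0, 0]]
C = count_transitions(trajs, n_states=2, lag=1)
T = transition_matrix(C)
```

    Counting within each trajectory separately avoids creating spurious transitions across trajectory boundaries, which matters when many short simulations are combined.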

  • (Compressed) sensing and sensibility PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Pande, V. S. 2011; 108 (36): 14713-14714

    View details for DOI 10.1073/pnas.1111659108

    View details for Web of Science ID 000294543400009

    View details for PubMedID 21873202

  • Anatomy of High-Performance 2D Similarity Calculations JOURNAL OF CHEMICAL INFORMATION AND MODELING Haque, I. S., Pande, V. S., Walters, W. P. 2011; 51 (9): 2345-2351


    Similarity measures based on the comparison of dense bit vectors of two-dimensional chemical features are a dominant method in chemical informatics. For large-scale problems, including compound selection and machine learning, computing the intersection between two dense bit vectors is the overwhelming bottleneck. We describe efficient implementations of this primitive as well as example applications using features of modern CPUs that allow 20-40× performance increases relative to typical code. Specifically, we describe fast methods for population count on modern x86 processors and cache-efficient matrix traversal and leader clustering algorithms that alleviate memory bandwidth bottlenecks in similarity matrix construction and clustering. The speed of our 2D comparison primitives is within a small factor of that obtained on GPUs and does not require specialized hardware.

    View details for DOI 10.1021/ci200235e

    View details for Web of Science ID 000295114700030

    View details for PubMedID 21854053
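
    The bit-vector intersection primitive described in the abstract above can be illustrated with a short sketch. It assumes the Tanimoto coefficient as the similarity measure (the abstract does not name one) and stores each fingerprint as a Python integer; the function names are hypothetical, and the code shows only the logic, not the hardware population-count or cache-aware techniques the paper reports.

```python
def popcount(bits: int) -> int:
    """Number of set bits in a non-negative integer fingerprint."""
    return bin(bits).count("1")


def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto similarity |A & B| / |A | B| of two bit-vector fingerprints."""
    union = popcount(fp_a | fp_b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints count as identical.
        return 1.0
    return popcount(fp_a & fp_b) / union


# Two toy 4-bit fingerprints: they share 2 set bits out of 3 set in total.
similarity = tanimoto(0b1011, 0b0011)
```

    Every pairwise comparison reduces to a bitwise AND followed by a population count, which is why the speed of that one primitive dominates large similarity-matrix calculations.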

  • Error Bounds on the SCISSORS Approximation Method JOURNAL OF CHEMICAL INFORMATION AND MODELING Haque, I. S., Pande, V. S. 2011; 51 (9): 2248-2253


    The SCISSORS method for approximating chemical similarities has shown excellent empirical performance on a number of real-world chemical data sets but lacks theoretically proven bounds on its worst-case error performance. This paper first proves reductions showing SCISSORS to be equivalent to two previous kernel methods: kernel principal components analysis and the rank-k Nyström approximation of a Gram matrix. These reductions allow the use of generalization bounds on these techniques to show that the expected error in SCISSORS approximations of molecular similarity kernels is bounded in expected pairwise inner product error, in matrix 2-norm and Frobenius norm for full kernel matrix approximations and in root-mean-square deviation for approximated matrices. Finally, we show that the actual performance of SCISSORS is significantly better than these worst-case bounds, indicating that chemical space is well-structured for chemical sampling algorithms.

    View details for DOI 10.1021/ci200251a

    View details for Web of Science ID 000295114700022

    View details for PubMedID 21851122

  • Splitting Probabilities as a Test of Reaction Coordinate Choice in Single-Molecule Experiments PHYSICAL REVIEW LETTERS Chodera, J. D., Pande, V. S. 2011; 107 (9)


    To explain the observed dynamics in equilibrium single-molecule measurements of biomolecules, the experimental observable is often chosen as a putative reaction coordinate along which kinetic behavior is presumed to be governed by diffusive dynamics. Here, we invoke the splitting probability as a test of the suitability of such a proposed reaction coordinate. Comparison of the observed splitting probability with that computed from the kinetic model provides a simple test to reject poor reaction coordinates. We demonstrate this test for a force spectroscopy measurement of a DNA hairpin.

    View details for DOI 10.1103/PhysRevLett.107.098102

    View details for Web of Science ID 000294268000021

    View details for PubMedID 21929272
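
    The observed splitting probability referred to in the abstract above can be estimated directly from a one-dimensional time series. The sketch below is an illustration under stated assumptions, not the authors' procedure: the boundaries `a` and `b`, the uniform binning, and the function name are hypothetical, and the comparison against a diffusive kinetic model is not shown.

```python
def empirical_splitting_probability(traj, a, b, n_bins):
    """For each bin between a and b, the fraction of visits after which the
    trajectory reaches x >= b before it reaches x <= a.

    Bins that are never visited (or only visited after the last boundary
    crossing) are reported as None.
    """
    width = (b - a) / n_bins
    hits_b = [0] * n_bins
    visits = [0] * n_bins
    next_boundary = None  # unknown until a boundary is met while scanning backward
    for x in reversed(traj):
        if x <= a:
            next_boundary = "A"
        elif x >= b:
            next_boundary = "B"
        elif next_boundary is not None:
            k = min(int((x - a) / width), n_bins - 1)
            visits[k] += 1
            if next_boundary == "B":
                hits_b[k] += 1
    return [h / v if v else None for h, v in zip(hits_b, visits)]


# Toy trajectory that leaves a = 0, reaches b = 3, and returns to a.
toy_traj = [0, 1, 2, 3, 2, 1, 0, 1]
p_b = empirical_splitting_probability(toy_traj, a=0, b=3, n_bins=3)
```

    Scanning the series backward means each frame already knows which boundary is reached next, so the estimate needs only a single pass over the data.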

  • CAMPAIGN: an open-source library of GPU-accelerated data clustering algorithms BIOINFORMATICS Kohlhoff, K. J., Sosnick, M. H., Hsu, W. T., Pande, V. S., Altman, R. B. 2011; 27 (16): 2322-2323


    Data clustering techniques are an essential component of a good data analysis toolbox. Many current bioinformatics applications are inherently compute-intense and work with very large datasets. Sequential algorithms are inadequate for providing the necessary performance. For this reason, we have created Clustering Algorithms for Massively Parallel Architectures, Including GPU Nodes (CAMPAIGN), a central resource for data clustering algorithms and tools that are implemented specifically for execution on massively parallel processing architectures. CAMPAIGN is a library of data clustering algorithms and tools, written in 'C for CUDA' for Nvidia GPUs. The library provides up to two orders of magnitude speed-up over respective CPU-based clustering algorithms and is intended as an open-source resource. New modules from the community will be accepted into the library and the layout of it is such that it can easily be extended to promising future platforms such as OpenCL. Releases of the CAMPAIGN library are freely available for download under the LGPL. Source code can also be obtained through anonymous Subversion access.

    View details for DOI 10.1093/bioinformatics/btr386

    View details for Web of Science ID 000293620800028

    View details for PubMedID 21712246

    View details for PubMedCentralID PMC3150041

  • The social network (of protein conformations) PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Chodera, J. D., Pande, V. S. 2011; 108 (32): 12969-12970

    View details for DOI 10.1073/pnas.1109571108

    View details for Web of Science ID 000293691400010

    View details for PubMedID 21804033

  • Quantitative comparison of villin headpiece subdomain simulations and triplet-triplet energy transfer experiments PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Beauchamp, K. A., Ensign, D. L., Das, R., Pande, V. S. 2011; 108 (31): 12734-12739


    As the fastest folding protein, the villin headpiece (HP35) serves as an important bridge between simulation and experimental studies of protein folding. Despite the simplicity of this system, experiments continue to reveal a number of surprises, including structure in the unfolded state and complex equilibrium dynamics near the native state. Using 2.5 ms of molecular dynamics and Markov state models, we connect to current experimental results in three ways. First, we present and validate a novel method for the quantitative prediction of triplet-triplet energy transfer experiments. Second, we construct a many-state model for HP35 that is consistent with previous experiments. Finally, we predict contact-formation time traces for all 1,225 possible triplet-triplet energy transfer experiments on HP35.

    View details for DOI 10.1073/pnas.1010880108

    View details for Web of Science ID 000293385700043

    View details for PubMedID 21768345

  • Rationally Designed Turn Promoting Mutation in the Amyloid-beta Peptide Sequence Stabilizes Oligomers in Solution PLOS ONE Rajadas, J., Liu, C. W., Novick, P., Kelley, N. W., Inayathullah, M., LeMieux, M. C., Pande, V. S. 2011; 6 (7)


    Enhanced production of a 42-residue beta amyloid peptide (Aβ(42)) in affected parts of the brain has been suggested to be the main causative factor for the development of Alzheimer's Disease (AD). The severity of the disease depends not only on the amount of the peptide but also its conformational transition leading to the formation of oligomeric amyloid-derived diffusible ligands (ADDLs) in the brain of AD patients. Despite being significant to the understanding of AD mechanism, no atomic-resolution structures are available for these species due to the evanescent nature of ADDLs that hinders most structural biophysical investigations. Based on our molecular modeling and computational studies, we have designed Met35Nle and G37p mutations in the Aβ(42) peptide (Aβ(42)Nle35p37) that appear to organize Aβ(42) into stable oligomers. 2D NMR on the Aβ(42)Nle35p37 peptide revealed the occurrence of two β-turns in the V24-N27 and V36-V39 stretches that could be the possible cause for the oligomer stability. We did not observe corresponding NOEs for the V24-N27 turn in the Aβ(21-43)Nle35p37 fragment, suggesting the need for the longer length amyloid peptide to form the stable oligomer-promoting conformation. Because of the presence of two turns in the mutant peptide which were absent in solid state NMR structures for the fibrils, we propose that fibril formation might be hindered. The biophysical information obtained in this work could aid in the development of structural models for toxic oligomer formation that could facilitate the development of therapeutic approaches to AD.

    View details for DOI 10.1371/journal.pone.0021776

    View details for Web of Science ID 000293097300006

    View details for PubMedID 21799748

    View details for PubMedCentralID PMC3142112

  • Dynamical reweighting: Improved estimates of dynamical properties from simulations at multiple temperatures JOURNAL OF CHEMICAL PHYSICS Chodera, J. D., Swope, W. C., Noe, F., Prinz, J., Shirts, M. R., Pande, V. S. 2011; 134 (24)


    Dynamical averages based on functionals of dynamical trajectories, such as time-correlation functions, play an important role in determining kinetic or transport properties of matter. At temperatures of interest, the expectations of these quantities are often dominated by contributions from rare events, making the precise calculation of these quantities by molecular dynamics simulation difficult. Here, we present a reweighting method for combining simulations from multiple temperatures (or from simulated or parallel tempering simulations) to compute an optimal estimate of the dynamical properties at the temperature of interest without the need to invoke an approximate kinetic model (such as the Arrhenius law). Continuous and differentiable estimates of these expectations at any temperature in the sampled range can also be computed, along with an assessment of the associated statistical uncertainty. For rare events, aggregating data from multiple temperatures can produce an estimate with the desired precision at greatly reduced computational cost compared with simulations conducted at a single temperature. Here, we describe use of the method for the canonical (NVT) ensemble using four common models of dynamics (canonical distribution of Hamiltonian trajectories, Andersen thermostatting, Langevin, and overdamped Langevin or Brownian dynamics), but it can be applied to any thermodynamic ensemble provided the ratio of path probabilities at different temperatures can be computed. To illustrate the method, we compute a time-correlation function for solvated terminally-blocked alanine peptide across a range of temperatures using trajectories harvested using a modified parallel tempering protocol.

    View details for DOI 10.1063/1.3592152

    View details for Web of Science ID 000292331900010

    View details for PubMedID 21721612

  • Optimal use of data in parallel tempering simulations for the construction of discrete-state Markov models of biomolecular dynamics JOURNAL OF CHEMICAL PHYSICS Prinz, J., Chodera, J. D., Pande, V. S., Swope, W. C., Smith, J. C., Noe, F. 2011; 134 (24)


    Parallel tempering (PT) molecular dynamics simulations have been extensively investigated as a means of efficient sampling of the configurations of biomolecular systems. Recent work has demonstrated how the short physical trajectories generated in PT simulations of biomolecules can be used to construct the Markov models describing biomolecular dynamics at each simulated temperature. While this approach describes the temperature-dependent kinetics, it does not make optimal use of all available PT data, instead estimating the rates at a given temperature using only data from that temperature. This can be problematic, as some relevant transitions or states may not be sufficiently sampled at the temperature of interest, but might be readily sampled at nearby temperatures. Further, the comparison of temperature-dependent properties can suffer from the false assumption that data collected from different temperatures are uncorrelated. We propose here a strategy in which, by a simple modification of the PT protocol, the harvested trajectories can be reweighted, permitting data from all temperatures to contribute to the estimated kinetic model. The method reduces the statistical uncertainty in the kinetic model relative to the single temperature approach and provides estimates of transition probabilities even for transitions not observed at the temperature of interest. Further, the method allows the kinetics to be estimated at temperatures other than those at which simulations were run. We illustrate this method by applying it to the generation of a Markov model of the conformational dynamics of the solvated terminally blocked alanine peptide.

    View details for DOI 10.1063/1.3592153

    View details for Web of Science ID 000292331900011

    View details for PubMedID 21721613

  • A smoothly decoupled particle interface: New methods for coupling explicit and implicit solvent JOURNAL OF CHEMICAL PHYSICS Wagoner, J. A., Pande, V. S. 2011; 134 (21)


    A common theme of studies using molecular simulation is a necessary compromise between computational efficiency and resolution of the forcefield that is used. Significant efforts have been directed at combining multiple levels of granularity within a single simulation in order to maintain the efficiency of coarse-grained models, while using finer resolution in regions where such details are expected to play an important role. A specific example of this paradigm is the development of hybrid solvent models, which explicitly sample the solvent degrees of freedom within a specified domain while utilizing a continuum description elsewhere. Unfortunately, these models are complicated by the presence of structural artifacts at or near the explicit/implicit boundary. The presence of these artifacts significantly complicates the use of such models, both undermining the accuracy obtained and necessitating the parameterization of effective potentials to counteract the artificial interactions. In this work, we introduce a novel hybrid solvent model that employs a smoothly decoupled particle interface (SDPI), a switching region that gradually transitions from fully interacting particles to a continuum solvent. The resulting SDPI model allows for the use of an implicit solvent model based on a simple theory that needs to only reproduce the behavior of bulk solvent rather than the more complex features of local interactions. In this study, the SDPI model is tested on spherical hybrid domains using a coarse-grained representation of water that includes only Lennard-Jones interactions. The results demonstrate that this model is capable of reproducing solvent configurations absent of boundary artifacts, as if they were taken from full explicit simulations.

    View details for DOI 10.1063/1.3595262

    View details for Web of Science ID 000291402700004

    View details for PubMedID 21663340

  • Reintroducing Electrostatics into Macromolecular Crystallographic Refinement: Application to Neutron Crystallography and DNA Hydration STRUCTURE Fenn, T. D., Schnieders, M. J., Mustyakimov, M., Wu, C., Langan, P., Pande, V. S., Brunger, A. T. 2011; 19 (4): 523-533


    Most current crystallographic structure refinements augment the diffraction data with a priori information consisting of bond, angle, dihedral, planarity restraints, and atomic repulsion based on the Pauli exclusion principle. Yet, electrostatics and van der Waals attraction are physical forces that provide additional a priori information. Here, we assess the inclusion of electrostatics for the force field used for all-atom (including hydrogen) joint neutron/X-ray refinement. Two DNA and a protein crystal structure were refined against joint neutron/X-ray diffraction data sets using force fields without electrostatics or with electrostatics. Hydrogen-bond orientation/geometry favors the inclusion of electrostatics. Refinement of Z-DNA with electrostatics leads to a hypothesis for the entropic stabilization of Z-DNA that may partly explain the thermodynamics of converting the B form of DNA to its Z form. Thus, inclusion of electrostatics assists joint neutron/X-ray refinements, especially for placing and orienting hydrogen atoms.

    View details for DOI 10.1016/j.str.2011.01.015

    View details for Web of Science ID 000289592600011

    View details for PubMedID 21481775

    View details for PubMedCentralID PMC3083928

  • Alchemical free energy methods for drug discovery: progress and challenges CURRENT OPINION IN STRUCTURAL BIOLOGY Chodera, J. D., Mobley, D. L., Shirts, M. R., Dixon, R. W., Branson, K., Pande, V. S. 2011; 21 (2): 150-160


    Improved rational drug design methods are needed to lower the cost and increase the success rate of drug discovery and development. Alchemical binding free energy calculations, one potential tool for rational design, have progressed rapidly over the past decade, but still fall short of providing robust tools for pharmaceutical engineering. Recent studies, especially on model receptor systems, have clarified many of the challenges that must be overcome for robust predictions of binding affinity to be useful in rational design. In this review, inspired by a recent joint academic/industry meeting organized by the authors, we discuss these challenges and suggest a number of promising approaches for overcoming them.

    View details for DOI 10.1016/

    View details for Web of Science ID 000290009600002

    View details for PubMedID 21349700

  • Polarizable Atomic Multipole X-Ray Refinement: Particle Mesh Ewald Electrostatics for Macromolecular Crystals JOURNAL OF CHEMICAL THEORY AND COMPUTATION Schnieders, M. J., Fenn, T. D., Pande, V. S. 2011; 7 (4): 1141-1156

    View details for DOI 10.1021/ct100506d

    View details for Web of Science ID 000289315700036

  • Water Ordering at Membrane Interfaces Controls Fusion Dynamics JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Kasson, P. M., Lindahl, E., Pande, V. S. 2011; 133 (11): 3812-3815


    Membrane interfaces are critical to many cellular functions, yet the vast array of molecular components involved make the fundamental physics of interaction difficult to define. Water has been shown to play an important role in the dynamics of small biological systems, for example when trapped in hydrophobic regions, but the molecular details of water have generally been thought dispensable when considering large membrane interfaces. Nevertheless, spectroscopic data indicate that water has distinct, ordered behavior near membrane surfaces. While coarse-grained simulations have achieved success recently in aiding understanding the dynamics of membrane assemblies, it is natural to ask, does the missing chemical nature of water play an important role? We have therefore performed atomic-resolution simulations of vesicle fusion to understand the role of chemical detail, particularly the molecular structure of water, in membrane fusion and at membrane interfaces more generally. These membrane interfaces present a form of hydrophilic confinement, yielding surprising, non-bulk-like water behavior.

    View details for DOI 10.1021/ja200310d

    View details for Web of Science ID 000288889900033

    View details for PubMedID 21351772

  • A bundling of viral fusion mechanisms PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Kasson, P. M., Pande, V. S. 2011; 108 (10): 3827-3828

    View details for DOI 10.1073/pnas.1101072108

    View details for Web of Science ID 000288120400006

    View details for PubMedID 21368165

  • Atomistic Folding Simulations of the Five-Helix Bundle Protein lambda(6-85) JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Bowman, G. R., Voelz, V. A., Pande, V. S. 2011; 133 (4): 664-667


    Protein folding is a classic grand challenge that is relevant to numerous human diseases, such as protein misfolding diseases like Alzheimer’s disease. Solving the folding problem will ultimately require a combination of theory, simulation, and experiment, with theory and simulation providing an atomically detailed picture of both the thermodynamics and kinetics of folding and experimental tests grounding these models in reality. However, theory and simulation generally fall orders of magnitude short of biologically relevant time scales. Here we report significant progress toward closing this gap: an atomistic model of the folding of an 80-residue fragment of the λ repressor protein with explicit solvent that captures dynamics on a 10 milliseconds time scale. In addition, we provide a number of predictions that warrant further experimental investigation. For example, our model’s native state is a kinetic hub, and biexponential kinetics arises from the presence of many free-energy basins separated by barriers of different heights rather than a single low barrier along one reaction coordinate (the previously proposed incipient downhill folding scenario).

    View details for DOI 10.1021/ja106936n

    View details for Web of Science ID 000287295300007

    View details for PubMedID 21174461

  • Taming the complexity of protein folding CURRENT OPINION IN STRUCTURAL BIOLOGY Bowman, G. R., Voelz, V. A., Pande, V. S. 2011; 21 (1): 4-11


    Protein folding is an important problem in structural biology with significant medical implications, particularly for misfolding disorders like Alzheimer's disease. Solving the folding problem will ultimately require a combination of theory and experiment, with theoretical models providing a comprehensive view of folding and experiments grounding these models in reality. Here we review progress towards this goal over the past decade, with an emphasis on recent theoretical advances that are empowering chemically detailed models of folding and the new results these technologies are providing. In particular, we discuss new insights made possible by Markov state models (MSMs), including the role of non-native contacts and the hub-like character of protein folded states.

    View details for DOI 10.1016/

    View details for Web of Science ID 000287901600002

    View details for PubMedID 21081274

  • Calculation of Local Water Densities in Biological Systems: A Comparison of Molecular Dynamics Simulations and the 3D-RISM-KH Molecular Theory of Solvation JOURNAL OF PHYSICAL CHEMISTRY B Stumpe, M. C., Blinov, N., Wishart, D., Kovalenko, A., Pande, V. S. 2011; 115 (2): 319-328


    Water plays a unique role in all living organisms. Not only is it nature's ubiquitous solvent, but it also actively takes part in many cellular processes. In particular, the structure and properties of interfacial water near biomolecules such as proteins are often related to the function of the respective molecule. It can therefore be highly instructive to study the local water density around solutes in cellular systems, particularly when solvent-mediated forces such as the hydrophobic effect are relevant. Computational methods such as molecular dynamics (MD) simulations seem well suited to study these systems at the atomic level. However, due to sampling requirements, it is not clear that MD simulations are, indeed, the method of choice to obtain converged densities at a given level of precision. We here compare the calculation of local water densities with two different methods: MD simulations and the three-dimensional reference interaction site model with the Kovalenko-Hirata closure (3D-RISM-KH). In particular, we investigate the convergence of the local water density to assess the required simulation times for different levels of resolution. Moreover, we provide a quantitative comparison of the densities calculated with MD and with 3D-RISM-KH and investigate the effect of the choice of the water model for both methods. Our results show that 3D-RISM-KH yields density distributions that are very similar to those from MD up to a 0.5 Å resolution, but for significantly reduced computational cost. The combined use of MD and 3D-RISM-KH emerges as an auspicious perspective for efficient solvent sampling in dynamical systems.

    View details for DOI 10.1021/jp102587q

    View details for Web of Science ID 000286090200012

    View details for PubMedID 21174421

  • Simple Theory of Protein Folding Kinetics PHYSICAL REVIEW LETTERS Pande, V. S. 2010; 105 (19)


    We present a simple model of protein folding dynamics that captures key qualitative elements recently seen in all-atom simulations. The goals of this theory are to serve as a simple formalism for gaining deeper insight into the physical properties seen in detailed simulations as well as to serve as a model to easily compare why these simulations suggest a different kinetic mechanism than previous simple models. Specifically, we find that non-native contacts play a key role in determining the mechanism, which can shift dramatically as the energetic strength of non-native interactions is changed. For proteinlike non-native interactions, our model finds that the native state is a kinetic hub, connecting the strength of relevant interactions directly to the nature of folding kinetics.

    View details for DOI 10.1103/PhysRevLett.105.198101

    View details for Web of Science ID 000283849300018

    View details for PubMedID 21231198

  • Non-Bulk-Like Solvent Behavior in the Ribosome Exit Tunnel PLOS COMPUTATIONAL BIOLOGY Lucent, D., Snow, C. D., Aitken, C. E., Pande, V. S. 2010; 6 (10)


    As nascent proteins are synthesized by the ribosome, they depart via an exit tunnel running through the center of the large subunit. The exit tunnel likely plays an important part in various aspects of translation. Although water plays a key role in many biomolecular processes, the nature of water confined to the exit tunnel has remained unknown. Furthermore, solvent in biological cavities has traditionally been characterized as either a continuous dielectric fluid or a discrete, tightly bound molecule. Using atomistic molecular dynamics simulations, we predict that the thermodynamic and kinetic properties of water confined within the ribosome exit tunnel are quite different from this simple two-state model. We find that the tunnel creates a complex microenvironment for the solvent, resulting in perturbed rotational dynamics and heterogeneous dielectric behavior. This gives rise to a very rugged solvation landscape and significantly retarded solvent diffusion. We discuss how this non-bulk-like solvent is likely to affect important biophysical processes such as sequence-dependent stalling, co-translational folding, and antibiotic binding. We conclude with a discussion of the general applicability of these results to other biological cavities.

    View details for DOI 10.1371/journal.pcbi.1000963

    View details for Web of Science ID 000283651900010

    View details for PubMedID 20975935

  • Everything you wanted to know about Markov State Models but were afraid to ask METHODS Pande, V. S., Beauchamp, K., Bowman, G. R. 2010; 52 (1): 99-105


    Simulating protein folding has been a challenging problem for decades due to the long timescales involved (compared with what is possible to simulate) and the challenges of gaining insight from the complex nature of the resulting simulation data. Markov State Models (MSMs) present a means to tackle both of these challenges, yielding simulations on experimentally relevant timescales, statistical significance, and coarse grained representations that are readily humanly understandable. Here, we review this method with the intended audience of non-experts, in order to introduce the method to a broader audience. We review the motivations, methods, and caveats of MSMs, as well as some recent highlights of applications of the method. We conclude by discussing how this approach is part of a paradigm shift in how one uses simulations, away from anecdotal single-trajectory approaches to a more comprehensive statistical approach.
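    The core MSM bookkeeping described above can be sketched in a few lines. This is a minimal illustration, not the authors' software: it assumes the trajectory has already been assigned to discrete states, counts transitions at a chosen lag time, row-normalizes the counts, and, for a two-state model only, converts the second eigenvalue into an implied timescale via t = -lag / ln(λ2). The function names and the toy trajectory are hypothetical.

```python
import math


def count_transitions(dtraj, n_states, lag):
    """Count observed transitions i -> j separated by `lag` steps."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(dtraj[:-lag], dtraj[lag:]):
        counts[a][b] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize a count matrix into transition probabilities."""
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def two_state_implied_timescale(T, lag):
    """Implied timescale of a 2x2 row-stochastic matrix.

    For two states the eigenvalues are 1 and trace(T) - 1.
    """
    lam2 = T[0][0] + T[1][1] - 1.0
    if lam2 <= 0.0:
        raise ValueError("second eigenvalue is not positive; timescale undefined")
    if lam2 >= 1.0:
        return math.inf
    return -lag / math.log(lam2)


# Toy discrete trajectory (hypothetical data, two metastable states).
dtraj = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]
counts = count_transitions(dtraj, n_states=2, lag=1)
T = transition_matrix(counts)
timescale = two_state_implied_timescale(T, lag=1)
```

    In practice one would repeat this over several lag times and look for a plateau in the implied timescales before trusting the model; real MSM construction also involves clustering and a reversible estimator, which this sketch omits.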

    View details for DOI 10.1016/j.ymeth.2010.06.002

    View details for Web of Science ID 000281941300011

    View details for PubMedID 20570730

  • Bayesian inference for Brownian dynamics PHYSICAL REVIEW E Ensign, D. L., Pande, V. S. 2010; 82 (1)


    We present a Bayesian method for inferring the potential energy experienced by a particle subject to Brownian dynamics. Assuming polynomial potentials, the best polynomial order can be determined by analytical computation of a series of Bayes factors. The coefficients can be estimated from marginal posterior distributions. The method is applicable not only for the motion of an actual Brownian particle but to many kinds of single degree-of-freedom trajectories with Gaussian noise and short, nonzero correlation times.

    View details for DOI 10.1103/PhysRevE.82.016705

    View details for Web of Science ID 000280068000007

    View details for PubMedID 20866759

  • OpenMM: A Hardware-Independent Framework for Molecular Simulations COMPUTING IN SCIENCE & ENGINEERING Eastman, P., Pande, V. S. 2010; 12 (4): 34-39

  • Polarizable Atomic Multipole X-Ray Refinement: Hydration Geometry and Application to Macromolecules BIOPHYSICAL JOURNAL Fenn, T. D., Schnieders, M. J., Brunger, A. T., Pande, V. S. 2010; 98 (12): 2984-2992


    We recently developed a polarizable atomic multipole refinement method assisted by the AMOEBA force field for macromolecular crystallography. Compared to standard refinement procedures, the method uses a more rigorous treatment of X-ray scattering and electrostatics that can significantly improve the resultant information contained in an atomic model. We applied this method to high-resolution lysozyme and trypsin data sets, and validated its utility for precisely describing biomolecular electron density, as indicated by a 0.4-0.6% decrease in the R- and R(free)-values, and a corresponding decrease in the relative energy of 0.4-0.8 kcal/mol/residue. The re-refinements illustrate the ability of force-field electrostatics to orient water networks and catalytically relevant hydrogens, which can be used to make predictions regarding active site function, activity, and protein-ligand interaction energies. Re-refinement of a DNA crystal structure generates the zigzag spine pattern of hydrogen bonding in the minor groove without manual intervention. The polarizable atomic multipole electrostatics model implemented in the AMOEBA force field is applicable and informative for crystal structures solved at any resolution.

    View details for DOI 10.1016/j.bpj.2010.02.057

    View details for Web of Science ID 000278913500027

    View details for PubMedID 20550911

  • Protein folded states are kinetic hubs PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Bowman, G. R., Pande, V. S. 2010; 107 (24): 10890-10895


    Understanding molecular kinetics, and particularly protein folding, is a classic grand challenge in molecular biophysics. Network models, such as Markov state models (MSMs), are one potential solution to this problem. MSMs have recently yielded quantitative agreement with experimentally derived structures and folding rates for specific systems, leaving them positioned to potentially provide a deeper understanding of molecular kinetics that can lead to experimentally testable hypotheses. Here we use existing MSMs for the villin headpiece and NTL9, which were constructed from atomistic simulations, to accomplish this goal. In addition, we provide simpler, humanly comprehensible networks that capture the essence of molecular kinetics and reproduce qualitative phenomena like the apparent two-state folding often seen in experiments. Together, these models show that protein dynamics are dominated by stochastic jumps between numerous metastable states and that proteins have heterogeneous unfolded states (many unfolded basins that interconvert more rapidly with the native state than with one another) yet often still appear two-state. Most importantly, we find that protein native states are hubs that can be reached quickly from any other state. However, metastability and a web of nonnative states slow the average folding rate. Experimental tests for these findings and their implications for other fields, like protein design, are also discussed.

    View details for DOI 10.1073/pnas.1003962107

    View details for Web of Science ID 000278807400023

    View details for PubMedID 20534497

  • Network models for molecular kinetics and their initial applications to human health CELL RESEARCH Bowman, G. R., Huang, X., Pande, V. S. 2010; 20 (6): 622-630


    Molecular kinetics underlies all biological phenomena and, like many other biological processes, may best be understood in terms of networks. These networks, called Markov state models (MSMs), are typically built from physical simulations. Thus, they are capable of quantitative prediction of experiments and can also provide an intuition for complex conformational changes. Their primary application has been to protein folding; however, these technologies and the insights they yield are transferable. For example, MSMs have already proved useful in understanding human diseases, such as protein misfolding and aggregation in Alzheimer's disease.

    View details for DOI 10.1038/cr.2010.57

    View details for Web of Science ID 000278726600006

    View details for PubMedID 20421891

  • Atomic-Resolution Simulations Predict a Transition State for Vesicle Fusion Defined by Contact of a Few Lipid Tails PLOS COMPUTATIONAL BIOLOGY Kasson, P. M., Lindahl, E., Pande, V. S. 2010; 6 (6)


    Membrane fusion is essential to both cellular vesicle trafficking and infection by enveloped viruses. While the fusion protein assemblies that catalyze fusion are readily identifiable, the specific activities of the proteins involved and the nature of the membrane changes they induce remain unknown. Here, we use many atomic-resolution simulations of vesicle fusion to examine the molecular mechanisms for fusion in detail. We employ committor analysis for these million-atom vesicle fusion simulations to identify a transition state for fusion stalk formation. In our simulations, this transition state occurs when the bulk properties of each lipid bilayer remain in a lamellar state but a few hydrophobic tails bulge into the hydrophilic interface layer and make contact to nucleate a stalk. Additional simulations of influenza fusion peptides in lipid bilayers show that the peptides promote similar local protrusion of lipid tails. Comparing these two sets of simulations, we obtain a common set of structural changes between the transition state for stalk formation and the local environment of peptides known to catalyze fusion. Our results thus suggest that the specific molecular properties of individual lipids are highly important to vesicle fusion and yield an explicit structural model that could help explain the mechanism of catalysis by fusion proteins.

    View details for DOI 10.1371/journal.pcbi.1000829

    View details for Web of Science ID 000279341000035

    View details for PubMedID 20585620

  • SCISSORS: A Linear-Algebraical Technique to Rapidly Approximate Chemical Similarities JOURNAL OF CHEMICAL INFORMATION AND MODELING Haque, I. S., Pande, V. S. 2010; 50 (6): 1075-1088


    Algorithms for several emerging large-scale problems in cheminformatics have as their rate-limiting step the evaluation of relatively slow chemical similarity measures, such as structural similarity or three-dimensional (3-D) shape comparison. In this article we present SCISSORS, a linear-algebraical technique (related to multidimensional scaling and kernel principal components analysis) to rapidly estimate chemical similarities for several popular measures. We demonstrate that SCISSORS faithfully reflects its source similarity measures for both Tanimoto calculation and rank ordering. After an efficient precalculation step on a database, SCISSORS affords several orders of magnitude of speedup in database screening. SCISSORS furthermore provides an asymptotic speedup for large similarity matrix construction problems, reducing the number of conventional slow similarity evaluations required from quadratic to linear scaling.

    View details for DOI 10.1021/ci1000136

    View details for Web of Science ID 000279069800011

    View details for PubMedID 20509629

  • Efficient Nonbonded Interactions for Molecular Dynamics on a Graphics Processing Unit JOURNAL OF COMPUTATIONAL CHEMISTRY Eastman, P., Pande, V. S. 2010; 31 (6): 1268-1272


    We describe an algorithm for computing nonbonded interactions with cutoffs on a graphics processing unit. We have incorporated it into OpenMM, a library for performing molecular simulations on high-performance computer architectures. We benchmark it on a variety of systems including boxes of water molecules, proteins in explicit solvent, a lipid bilayer, and proteins with implicit solvent. The results demonstrate that its performance scales linearly with the number of atoms over a wide range of system sizes, while being significantly faster than other published algorithms.

    View details for DOI 10.1002/jcc.21413

    View details for Web of Science ID 000276653600017

    View details for PubMedID 19847780

  • Unfolded-State Dynamics and Structure of Protein L Characterized by Simulation and Experiment JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Voelz, V. A., Singh, V. R., Wedemeyer, W. J., Lapidus, L. J., Pande, V. S. 2010; 132 (13): 4702-4709


    While several experimental techniques now exist for characterizing protein unfolded states, all-atom simulation of unfolded states has been challenging due to the long time scales and conformational sampling required. We address this problem by using a combination of accelerated calculations on graphics processor units and distributed computing to simulate tens of thousands of molecular dynamics trajectories each up to approximately 10 μs (for a total aggregate simulation time of 127 ms). We used this approach in conjunction with Trp-Cys contact quenching experiments to characterize the unfolded structure and dynamics of protein L. We employed a polymer theory method to make quantitative comparisons between high-temperature simulated and chemically denatured experimental ensembles and find that reaction-limited quenching rates calculated from simulation agree remarkably well with experiment. In both experiment and simulation, we find that unfolded-state intramolecular diffusion rates are very slow compared to highly denatured chains and that a single-residue mutation can significantly alter unfolded-state dynamics and structure. This work suggests a view of the unfolded state in which surprisingly low diffusion rates could limit folding and opens the door for all-atom molecular simulation to be a useful predictive tool for characterizing protein unfolded states along with experiments that directly measure intramolecular diffusion.

    View details for DOI 10.1021/ja908369h

    View details for Web of Science ID 000276553600053

    View details for PubMedID 20218718

  • Charge, hydrophobicity, and confined water: putting past simulations into a simple theoretical framework 52nd Annual Meeting of the Canadian Society of Biochemistry, Molecular and Cellular Biology England, J. L., Pande, V. S. CANADIAN SCIENCE PUBLISHING, NRC RESEARCH PRESS. 2010: 359-369


    Water permeates all life, and mediates forces that are essential to the process of macromolecular self-assembly. Predicting these forces in a given biological context is challenging, since water organizes itself differently next to charged and hydrophobic surfaces, both of which are typically at play on the nanoscale in vivo. In this work, we present a simple statistical mechanical model for the forces water mediates between different confining surfaces, and demonstrate that the model qualitatively unifies a wide range of phenomena known in the simulation literature, including several cases of protein folding under confinement.

    View details for DOI 10.1139/O09-187

    View details for Web of Science ID 000278015800022

    View details for PubMedID 20453936

  • SIML: A Fast SIMD Algorithm for Calculating LINGO Chemical Similarities on GPUs and CPUs JOURNAL OF CHEMICAL INFORMATION AND MODELING Haque, I. S., Pande, V. S., Walters, W. P. 2010; 50 (4): 560-564


    LINGOs are a holographic measure of chemical similarity based on text comparison of SMILES strings. We present a new algorithm for calculating LINGO similarities amenable to parallelization on SIMD architectures (such as GPUs and vector units of modern CPUs). We show that it is nearly 3x as fast as existing algorithms on a CPU, and over 80x faster than existing methods when run on a GPU.
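    For readers unfamiliar with the measure, a plain scalar version of a LINGO comparison can be sketched as follows. This is not the SIMD algorithm from the paper; it is a reference-style illustration that assumes LINGOs are the overlapping length-4 substrings of a SMILES string and that similarity is a multiset Tanimoto coefficient. The function names are hypothetical, and SMILES canonicalization and ring-closure normalization are deliberately left out.

```python
from collections import Counter


def lingos(smiles, q=4):
    """Return the multiset of overlapping length-q substrings of a SMILES string."""
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def lingo_tanimoto(smiles_a, smiles_b, q=4):
    """Multiset Tanimoto: |A intersect B| / |A union B| over LINGO counts."""
    a, b = lingos(smiles_a, q), lingos(smiles_b, q)
    intersection = sum((a & b).values())  # element-wise minimum of counts
    union = sum((a | b).values())         # element-wise maximum of counts
    return intersection / union if union else 0.0


same = lingo_tanimoto("CCCCO", "CCCCO")
different = lingo_tanimoto("CCCCO", "CCCCN")
```

    Here "CCCCO" and "CCCCN" share one of three distinct LINGOs ("CCCC"), so the second comparison gives 1/3. A fast implementation would replace the dictionary-based counters with packed integer representations suitable for vector units.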

    View details for DOI 10.1021/ci100011z

    View details for Web of Science ID 000276915200010

    View details for PubMedID 20218693

  • Current Status of the AMOEBA Polarizable Force Field JOURNAL OF PHYSICAL CHEMISTRY B Ponder, J. W., Wu, C., Ren, P., Pande, V. S., Chodera, J. D., Schnieders, M. J., Haque, I., Mobley, D. L., Lambrecht, D. S., Distasio, R. A., Head-Gordon, M., Clark, G. N., Johnson, M. E., Head-Gordon, T. 2010; 114 (8): 2549-2564


    Molecular force fields have been approaching a generational transition over the past several years, moving away from well-established and well-tuned, but intrinsically limited, fixed point charge models toward more intricate and expensive polarizable models that should allow more accurate description of molecular properties. The recently introduced AMOEBA force field is a leading publicly available example of this next generation of theoretical model, but to date, it has only received relatively limited validation, which we address here. We show that the AMOEBA force field is in fact a significant improvement over fixed charge models for small molecule structural and thermodynamic observables in particular, although further fine-tuning is necessary to describe solvation free energies of drug-like small molecules, dynamical properties away from ambient conditions, and possible improvements in aromatic interactions. State of the art electronic structure calculations reveal generally very good agreement with AMOEBA for demanding problems such as relative conformational energies of the alanine tetrapeptide and isomers of water sulfate complexes. AMOEBA is shown to be especially successful on protein-ligand binding and computational X-ray crystallography where polarization and accurate electrostatics are critical.

    View details for DOI 10.1021/jp910674d

    View details for Web of Science ID 000274842600001

    View details for PubMedID 20136072

  • Enhanced Modeling via Network Theory: Adaptive Sampling of Markov State Models JOURNAL OF CHEMICAL THEORY AND COMPUTATION Bowman, G. R., Ensign, D. L., Pande, V. S. 2010; 6 (3): 787-794

    View details for DOI 10.1021/ct900620b

    View details for Web of Science ID 000275189400018

  • Molecular Simulation of ab Initio Protein Folding for a Millisecond Folder NTL9(1-39) JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Voelz, V. A., Bowman, G. R., Beauchamp, K., Pande, V. S. 2010; 132 (5): 1526-?


    To date, the slowest-folding proteins folded ab initio by all-atom molecular dynamics simulations have had folding times in the range of nanoseconds to microseconds. We report simulations of several folding trajectories of NTL9(1-39), a protein which has a folding time of approximately 1.5 ms. Distributed molecular dynamics simulations in implicit solvent on GPU processors were used to generate ensembles of trajectories out to approximately 40 μs for several temperatures and starting states. At a temperature less than the melting point of the force field, we observe a small number of productive folding events, consistent with predictions from a model of parallel uncoupled two-state simulations. The posterior distribution of the folding rate predicted from the data agrees well with the experimental folding rate (approximately 640/s). Markov State Models (MSMs) built from the data show a gap in the implied time scales indicative of two-state folding and heterogeneous pathways connecting diffuse mesoscopic substates. Structural analysis of the 14 out of 2000 macrostates transited by the top 10 folding pathways reveals that native-like pairing between strands 1 and 2 only occurs for macrostates with p(fold) > 0.5, suggesting β12 hairpin formation may be rate-limiting. We believe that using simulation data such as these to seed adaptive resampling simulations will be a promising new method for achieving statistically converged descriptions of folding landscapes at longer time scales than ever before.

    View details for DOI 10.1021/ja9090353

    View details for Web of Science ID 000275084900039

    View details for PubMedID 20070076

  • CCMA: A Robust, Parallelizable Constraint Method for Molecular Simulations JOURNAL OF CHEMICAL THEORY AND COMPUTATION Eastman, P., Pande, V. S. 2010; 6 (2): 434-437


    We introduce a new algorithm, the Constant Constraint Matrix Approximation (CCMA), for constraining distances in molecular simulations. It combines the best features of many existing algorithms while avoiding their defects: it is fast, stable, can be applied to arbitrary constraint topologies, and can be efficiently implemented on modern parallel architectures. We test it on a protein with bond length and limited angle constraints, and find that it requires less than one sixth as many iterations as SHAKE to converge.

    View details for PubMedID 20563234

  • PAPER-Accelerating Parallel Evaluations of ROCS JOURNAL OF COMPUTATIONAL CHEMISTRY Haque, I. S., Pande, V. S. 2010; 31 (1): 117-132


    Modern graphics processing units (GPUs) are flexibly programmable and have peak computational throughput significantly higher than that of conventional CPUs. Herein, we describe the design and implementation of PAPER, an open-source implementation of Gaussian molecular shape overlay for NVIDIA GPUs. We demonstrate speedups of one to two orders of magnitude on high-end commodity GPU hardware relative to a reference CPU implementation of the shape overlay algorithm and speedups of over one order of magnitude relative to the commercial OpenEye ROCS package. In addition, we describe errors incurred by approximations used in common implementations of the algorithm.

    View details for DOI 10.1002/jcc.21307

    View details for Web of Science ID 000273186800010

    View details for PubMedID 19421991

  • Bayesian Detection of Intensity Changes in Single Molecule and Molecular Dynamics Trajectories JOURNAL OF PHYSICAL CHEMISTRY B Ensign, D. L., Pande, V. S. 2010; 114 (1): 280-292


    Single molecule spectroscopy experiments and molecular dynamics simulations have several profound features in common, chief among which is that both follow the dynamics of some degrees of freedom of a single molecule over time. The analysis is essentially the same: one investigates the changes in the degrees of freedom followed. For instance, in a single molecule fluorescence experiment, the degree of freedom is often the number of photons detected in some time period. In this article, we introduce a straightforward Bayesian method for detecting if and when changes occurred. In contrast to methods based upon maximum likelihood estimates, a Bayesian approach provides a more systematic means not only of detecting change points but also of clustering the data into states. Most importantly, the Bayesian method supplies a simpler hypothesis testing framework. Although we focus on Poisson-distributed data, the Bayesian methods outlined here can in principle be applied to data sampled from any distribution.
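    The kind of test described above can be illustrated with a minimal sketch for Poisson counts. It is not the paper's exact formulation: it assumes a conjugate Gamma(a, b) prior on each segment's rate (shape a, rate b, both set to 1 here as an arbitrary choice), a uniform prior over the change location, and at most one change point. The function names and the toy photon counts are hypothetical.

```python
import math


def log_marginal(total, n, a=1.0, b=1.0):
    """Log marginal likelihood of n Poisson counts summing to `total`
    under a Gamma(a, b) prior on the rate.

    The sum of log(x_i!) terms is omitted because it cancels in the
    Bayes factor between the two hypotheses.
    """
    return (a * math.log(b) - math.lgamma(a)
            + math.lgamma(a + total) - (a + total) * math.log(b + n))


def detect_change(counts, a=1.0, b=1.0):
    """Return (best_k, log_bayes_factor) for one change point versus none.

    A change at k means counts[:k] and counts[k:] have different rates.
    """
    n = len(counts)
    log_no_change = log_marginal(sum(counts), n, a, b)
    log_terms = []
    left_total = 0
    for k in range(1, n):
        left_total += counts[k - 1]
        right_total = sum(counts) - left_total
        log_terms.append(log_marginal(left_total, k, a, b)
                         + log_marginal(right_total, n - k, a, b))
    best_k = 1 + max(range(len(log_terms)), key=log_terms.__getitem__)
    # Average over change locations (uniform prior) with log-sum-exp.
    peak = max(log_terms)
    log_change = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    log_change -= math.log(n - 1)
    return best_k, log_change - log_no_change


# Toy photon counts with an intensity jump halfway through (hypothetical data).
data = [1, 2] * 10 + [9, 10] * 10
best_k, log_bf = detect_change(data)
```

    A positive log Bayes factor favors the change-point hypothesis; applying the test recursively to each resulting segment is one common way to extend it to multiple change points.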

    View details for DOI 10.1021/jp906786b

    View details for Web of Science ID 000273404500033

    View details for PubMedID 20000829

  • "Cross-graining": efficient multi-scale simulation via Markov state models. Pacific Symposium on Biocomputing Kasson, P. M., Pande, V. S. 2010: 260-268


    Accurate and efficient methods to simulate biomolecular systems at multiple levels of detail simultaneously are an ongoing challenge for the simulation community. Here we present a new method for multi-scale simulation where a complex system can be partitioned into two loosely-coupled sub-systems, one coarse-grained and one atomistic. If the coupling between the coarse-grained and atomistic systems can be encoded into discrete states that interconvert slowly, we can construct a Markov state model where we approximate any given transition $P[(s_i, t_j) \to (s_k, t_l)]$ in the joint space of the coarse-grained and atomistic systems as the product of two orthogonal transitions $P(s_i \to s_k \mid t_j)$ and $P(t_j \to t_l \mid s_i)$. We provide a formalism for constructing such models and describe how they may be applied to multi-scale simulation of membrane proteins. This "cross-graining" methodology may provide a general means to efficiently simulate mixed-scale systems.

    View details for PubMedID 19908378

  • Constructing multi-resolution Markov State Models (MSMs) to elucidate RNA hairpin folding mechanisms. Pacific Symposium on Biocomputing Huang, X., Yao, Y., Bowman, G. R., Sun, J., Guibas, L. J., Carlsson, G., Pande, V. S. 2010: 228-239


    Simulating biologically relevant timescales at atomic resolution is a challenging task since typical atomistic simulations are at least two orders of magnitude shorter. Markov State Models (MSMs) provide one means of overcoming this gap without sacrificing atomic resolution by extracting long time dynamics from short simulations. MSMs coarse grain space by dividing conformational space into long-lived, or metastable, states. This is equivalent to coarse graining time by integrating out fast motions within metastable states. By varying the degree of coarse graining one can vary the resolution of an MSM; therefore, MSMs are inherently multi-resolution. Here we introduce a new algorithm, Super-level-set Hierarchical Clustering (SHC), which is, to our knowledge, the first algorithm focused on constructing MSMs at multiple resolutions. The key insight of this algorithm is to generate a set of super levels covering different density regions of phase space, then cluster each super level separately, and finally recombine this information into a single MSM. SHC is able to produce MSMs at different resolutions using different super density level sets. To demonstrate the power of this algorithm we apply it to a small RNA hairpin, generating MSMs at four different resolutions. We validate these MSMs by showing that they are able to reproduce the original simulation data. Furthermore, long time folding dynamics are extracted from these models. The results show that there are no metastable on-pathway intermediate states. Instead, the folded state serves as a hub directly connected to multiple unfolded/misfolded states which are separated from each other by large free energy barriers.

    View details for PubMedID 19908375

  • Multiscale dynamics of macromolecules using normal mode Langevin. Pacific Symposium on Biocomputing Izaguirre, J. A., Sweet, C. R., Pande, V. S. 2010: 240-251


    Proteins and other macromolecules have coupled dynamics over multiple time scales (from femtosecond to millisecond and beyond) that make resolving molecular dynamics challenging. We present an approach based on periodically decomposing the dynamics of a macromolecule into slow and fast modes based on a scalable coarse-grained normal mode analysis. A Langevin equation is used to propagate the slowest degrees of freedom while minimizing the nearly instantaneous degrees of freedom. We present numerical results showing that time steps of up to 1000 fs can be used, with real speedups of up to 200 times over plain molecular dynamics. We present results of successfully folding the Fip35 mutant of WW domain.

    View details for PubMedID 19908376

  • Do conformational biases of simple helical junctions influence RNA folding stability and specificity? RNA-A PUBLICATION OF THE RNA SOCIETY Chu, V. B., Lipfert, J., Bai, Y., Pande, V. S., Doniach, S., Herschlag, D. 2009; 15 (12): 2195-2205


    Structured RNAs must fold into their native structures and discriminate against a large number of alternative ones, an especially difficult task given the limited information content of RNA's nucleotide alphabet. The simplest motifs within structured RNAs are two helices joined by nonhelical junctions. To uncover the fundamental behavior of these motifs and to elucidate the underlying physical forces and challenges faced by structured RNAs, we computationally and experimentally studied a tethered duplex model system composed of two helices joined by flexible single- or double-stranded polyethylene glycol tethers, whose lengths correspond to those typically observed in junctions from structured RNAs. To dissect the thermodynamic properties of these simple motifs, we computationally probed how junction topology, electrostatics, and tertiary contact location influenced folding stability. Small-angle X-ray scattering was used to assess our predictions. Single- or double-stranded junctions, independent of sequence, greatly reduce the space of allowed helical conformations and influence the preferred location and orientation of their adjoining helices. A double-stranded junction guides the helices along a hinge-like pathway. In contrast, a single-stranded junction samples a broader set of conformations and has different preferences than the double-stranded junction. In turn, these preferences determine the stability and distinct specificities of tertiary structure formation. These sequence-independent effects suggest that properties as simple as a junction's topology can generally define the accessible conformational space, thereby stabilizing desired structures and assisting in discriminating against misfolded structures. Thus, junction topology provides a fundamental strategy for transcending the limitations imposed by the low information content of RNA primary sequence.

    View details for DOI 10.1261/rna.1747509

    View details for Web of Science ID 000272169000011

    View details for PubMedID 19850914

  • Rapid equilibrium sampling initiated from nonequilibrium data PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Huang, X., Bowman, G. R., Bacallado, S., Pande, V. S. 2009; 106 (47): 19765-19769


    Simulating the conformational dynamics of biomolecules is extremely difficult due to the rugged nature of their free energy landscapes and multiple long-lived, or metastable, states. Generalized ensemble (GE) algorithms, which have become popular in recent years, attempt to facilitate crossing between states at low temperatures by inducing a random walk in temperature space. Enthalpic barriers may be crossed more easily at high temperatures; however, entropic barriers will become more significant. This poses a problem because the dominant barriers to conformational change are entropic for many biological systems, such as the short RNA hairpin studied here. We present a new efficient algorithm for conformational sampling, called the adaptive seeding method (ASM), which uses nonequilibrium GE simulations to identify the metastable states, and seeds short simulations at constant temperature from each of them to quantitatively determine their equilibrium populations. Thus, the ASM takes advantage of the broad sampling possible with GE algorithms but generally crosses entropic barriers more efficiently during the seeding simulations at low temperature. We show that only local equilibrium is necessary for ASM, so very short seeding simulations may be used. Moreover, the ASM may be used to recover equilibrium properties from existing datasets that failed to converge, and is well suited to running on modern computer clusters.

    View details for DOI 10.1073/pnas.0909088106

    View details for Web of Science ID 000272180900007

    View details for PubMedID 19805023

  • Using generalized ensemble simulations and Markov state models to identify conformational states METHODS Bowman, G. R., Huang, X., Pande, V. S. 2009; 49 (2): 197-201


    Part of understanding a molecule's conformational dynamics is mapping out the dominant metastable, or long lived, states that it occupies. Once identified, the rates for transitioning between these states may then be determined in order to create a complete model of the system's conformational dynamics. Here we describe the use of the MSMBuilder package to build Markov State Models (MSMs) to identify the metastable states from Generalized Ensemble (GE) simulations, as well as other simulation datasets. Besides building MSMs, the code also includes tools for model evaluation and visualization.

    View details for DOI 10.1016/j.ymeth.2009.04.013

    View details for Web of Science ID 000270443600014

    View details for PubMedID 19410002
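
As a rough illustration of the Markov state model idea described in this abstract, the sketch below estimates a transition matrix from a trajectory that has already been assigned to discrete states. It is a minimal, assumption-laden example and not MSMBuilder's API: the function name `estimate_transition_matrix`, the integer state labels, and the plain row-normalized count estimator (with no detailed-balance constraint) are all chosen here for illustration.

```python
def estimate_transition_matrix(dtraj, n_states, lag=1):
    """Row-normalize transition counts observed at a fixed lag time.

    dtraj    : list of integer state labels, one per saved frame (assumed input)
    n_states : total number of discrete states
    lag      : lag time in frames (must be >= 1)
    """
    counts = [[0.0] * n_states for _ in range(n_states)]
    # Count every observed jump from frame t to frame t + lag.
    for a, b in zip(dtraj[:-lag], dtraj[lag:]):
        counts[a][b] += 1.0

    matrix = []
    for row in counts:
        total = sum(row)
        # A state with no outgoing observations is left as an all-zero row.
        matrix.append([c / total if total > 0 else 0.0 for c in row])
    return matrix


if __name__ == "__main__":
    # Hypothetical two-state trajectory.
    print(estimate_transition_matrix([0, 0, 1, 1, 0], n_states=2, lag=1))
```

Each row of the result gives the estimated probability of moving from one state to every other state after one lag time; real analyses would also need to choose the state decomposition and test the lag time.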

  • Progress and challenges in the automated construction of Markov state models for full protein systems JOURNAL OF CHEMICAL PHYSICS Bowman, G. R., Beauchamp, K. A., Boxer, G., Pande, V. S. 2009; 131 (12)


    Markov state models (MSMs) are a powerful tool for modeling both the thermodynamics and kinetics of molecular systems. In addition, they provide a rigorous means to combine information from multiple sources into a single model and to direct future simulations/experiments to minimize uncertainties in the model. However, constructing MSMs is challenging because doing so requires decomposing the extremely high dimensional and rugged free energy landscape of a molecular system into long-lived states, also called metastable states. Thus, their application has generally required significant chemical intuition and hand-tuning. To address this limitation we have developed a toolkit for automating the construction of MSMs called MSMBUILDER. In this work we demonstrate the application of MSMBUILDER to the villin headpiece (HP-35 NleNle), one of the smallest and fastest folding proteins. We show that the resulting MSM captures both the thermodynamics and kinetics of the original molecular dynamics of the system. As a first step toward experimental validation of our methodology we show that our model provides accurate structure prediction and that the longest timescale events correspond to folding.

    View details for DOI 10.1063/1.3216567

    View details for Web of Science ID 000270380300005

    View details for PubMedID 19791846

  • Bayesian Single-Exponential Kinetics in Single-Molecule Experiments and Simulations JOURNAL OF PHYSICAL CHEMISTRY B Ensign, D. L., Pande, V. S. 2009; 113 (36): 12410-12423


    In this work, we develop a fully Bayesian method for the calculation of probability distributions of single-exponential rates for any single-molecule process. These distributions can even be derived when no transitions from one state to another have been observed, since in that case the data can be used to estimate a lower bound on the rate. Using a Bayesian hypothesis test, one can easily test whether a transition occurs at the same rate or at different rates in two data sets. We illustrate these methods with molecular dynamics simulations of the folding of a beta-sheet protein. However, the theory presented here can be used on any data from simulation or experiment for which a two-state description is appropriate.

    View details for DOI 10.1021/jp903107c

    View details for Web of Science ID 000269655300032

    View details for PubMedID 19681587
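
The kind of calculation this abstract describes can be sketched for the simplest case. Assuming a single-exponential process observed for a total time `total_time` with `n_events` transitions, and assuming a flat prior on the rate (an assumption made here, not necessarily the prior used in the paper), the posterior over the rate is a Gamma distribution with shape `n_events + 1` and rate parameter `total_time`. The function names below are hypothetical.

```python
import math


def posterior_mean_rate(n_events, total_time):
    """Mean of the Gamma(n_events + 1, total_time) posterior (flat prior assumed)."""
    return (n_events + 1) / total_time


def posterior_cdf(k, n_events, total_time):
    """Posterior probability that the rate is below k (closed form for integer shape)."""
    x = k * total_time
    term, partial_sum = 1.0, 1.0  # i = 0 term of sum_i x**i / i!
    for i in range(1, n_events + 1):
        term *= x / i
        partial_sum += term
    return 1.0 - math.exp(-x) * partial_sum


def posterior_quantile(q, n_events, total_time, tol=1e-12):
    """Rate k such that posterior_cdf(k) == q, found by bisection."""
    lo, hi = 0.0, 1.0 / total_time
    while posterior_cdf(hi, n_events, total_time) < q:
        hi *= 2.0  # expand the bracket until it contains the quantile
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if posterior_cdf(mid, n_events, total_time) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical data: 3 transitions seen in 10 time units of simulation.
    print(posterior_mean_rate(3, 10.0))
    print(posterior_quantile(0.025, 3, 10.0), posterior_quantile(0.975, 3, 10.0))
```

Under these assumptions the posterior remains well defined when `n_events` is zero, so a one-sided credible bound on the rate can still be reported even when no transition was observed.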

  • Polarizable atomic multipole X-ray refinement: application to peptide crystals ACTA CRYSTALLOGRAPHICA SECTION D-BIOLOGICAL CRYSTALLOGRAPHY Schnieders, M. J., Fenn, T. D., Pande, V. S., Brunger, A. T. 2009; 65: 952-965


    Recent advances in computational chemistry have produced force fields based on a polarizable atomic multipole description of biomolecular electrostatics. In this work, the Atomic Multipole Optimized Energetics for Biomolecular Applications (AMOEBA) force field is applied to restrained refinement of molecular models against X-ray diffraction data from peptide crystals. A new formalism is also developed to compute anisotropic and aspherical structure factors using fast Fourier transformation (FFT) of Cartesian Gaussian multipoles. Relative to direct summation, the FFT approach can give a speedup of more than an order of magnitude for aspherical refinement of ultrahigh-resolution data sets. Use of a sublattice formalism makes the method highly parallelizable. Application of the Cartesian Gaussian multipole scattering model to a series of four peptide crystals using multipole coefficients from the AMOEBA force field demonstrates that AMOEBA systematically underestimates electron density at bond centers. For the trigonal and tetrahedral bonding geometries common in organic chemistry, an atomic multipole expansion through hexadecapole order is required to explain bond electron density. Alternatively, the addition of interatomic scattering (IAS) sites to the AMOEBA-based density captured bonding effects with fewer parameters. For a series of four peptide crystals, the AMOEBA-IAS model lowered R(free) by 20-40% relative to the original spherically symmetric scattering model.

    View details for DOI 10.1107/S0907444909022707

    View details for Web of Science ID 000269350000009

    View details for PubMedID 19690373

  • Combining Molecular Dynamics with Bayesian Analysis To Predict and Evaluate Ligand-Binding Mutations in Influenza Hemagglutinin JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Kasson, P. M., Ensign, D. L., Pande, V. S. 2009; 131 (32): 11338-?


    Influenza virus attaches to and infects target cells via binding of cell-surface glycans by the viral hemagglutinin. This binding specificity is considered a major reason why avian influenza is typically poorly transmitted between humans, while swine influenza is better transmitted due to glycan similarity between the human and swine upper respiratory tract. Predicting mutations that control glycan binding is thus important to continued surveillance against new pandemic influenza strains. We have designed a molecular-dynamics approach for scoring potential mutants with predictive power for both receptor-binding-domain and allosteric mutations similar to those identified from clinical isolates of avian influenza. We have performed thousands of simulations of 17 different hemagglutinin mutants totaling >1 ms in length and employ a Bayesian model to rank mutations that disrupt the stability of the hemagglutinin-ligand complex. Based on our simulations, we predict a significantly increased k(off) for seven of these mutants. This means of using molecular dynamics analysis to make experimentally verifiable predictions offers a potentially general method to identify ligand-binding mutants, particularly allosteric ones. Our analysis of ligand dissociation provides a means to evaluate mutants prior to experimental mutagenesis and testing and constitutes an important step toward understanding the determinants of ligand binding by H5N1 influenza.

    View details for DOI 10.1021/ja904557w

    View details for Web of Science ID 000269379200031

    View details for PubMedID 19637916

  • Bayesian comparison of Markov models of molecular dynamics with detailed balance constraint JOURNAL OF CHEMICAL PHYSICS Bacallado, S., Chodera, J. D., Pande, V. 2009; 131 (4)


    Discrete-space Markov models are a convenient way of describing the kinetics of biomolecules. The most common strategies used to validate these models employ statistics from simulation data, such as the eigenvalue spectrum of the inferred rate matrix, which are often associated with large uncertainties. Here, we propose a Bayesian approach, which makes it possible to differentiate between models at a fixed lag time making use of short trajectories. The hierarchical definition of the models allows one to compare instances with any number of states. We apply a conjugate prior for reversible Markov chains, which was recently introduced in the statistics literature. The method is tested in two different systems, a Monte Carlo dynamics simulation of a two-dimensional model system and molecular dynamics simulations of the terminally blocked alanine dipeptide.

    View details for DOI 10.1063/1.3192309

    View details for Web of Science ID 000268613700098

    View details for PubMedID 19655927

  • The Roles of Entropy and Kinetics in Structure Prediction PLOS ONE Bowman, G. R., Pande, V. S. 2009; 4 (6)


    Here we continue our efforts to use methods developed in the folding mechanism community to both better understand and improve structure prediction. Our previous work demonstrated that Rosetta's coarse-grained potentials may actually impede accurate structure prediction at full-atom resolution. Based on this work we postulated that it may be time to work completely at full-atom resolution but that doing so may require more careful attention to the kinetics of convergence. To explore the possibility of working entirely at full-atom resolution, we apply enhanced sampling algorithms and the free energy theory developed in the folding mechanism community to full-atom protein structure prediction with the prominent Rosetta package. We find that Rosetta's full-atom scoring function is indeed able to recognize diverse protein native states and that there is a strong correlation between score and Calpha RMSD to the native state. However, we also show that there is a huge entropic barrier to folding under this potential and the kinetics of folding are extremely slow. We then exploit this new understanding to suggest ways to improve structure prediction. Based on this work we hypothesize that structure prediction may be improved by taking a more physical approach, i.e. considering the nature of the model thermodynamics and kinetics which result from structure prediction simulations.

    View details for DOI 10.1371/journal.pone.0005840

    View details for Web of Science ID 000267228800006

    View details for PubMedID 19513117

  • Comparison of computational approaches for predicting the effects of missense mutations on p53 function JOURNAL OF MOLECULAR GRAPHICS & MODELLING Chong, L. T., Pitera, J. W., Swope, W. C., Pande, V. S. 2009; 27 (8): 978-982


    We applied our recently developed kinetic computational mutagenesis (KCM) approach [L.T. Chong, W.C. Swope, J.W. Pitera, V.S. Pande, Kinetic computational alanine scanning: application to p53 oligomerization, J. Mol. Biol. 357 (3) (2006) 1039-1049] along with the MM-GBSA approach [J. Srinivasan, T.E. Cheatham 3rd, P. Cieplak, P.A. Kollman, D.A. Case, Continuum solvent studies of the stability of DNA, RNA, and phosphoramidate-DNA helices, J. Am. Chem. Soc. 120 (37) (1998) 9401-9409; P.A. Kollman, I. Massova, C.M. Reyes, B. Kuhn, S. Huo, L.T. Chong, M. Lee, T. Lee, Y. Duan, W. Wang, O. Donini, P. Cieplak, J. Srinivasan, D.A. Case, T.E. Cheatham 3rd., Calculating structures and free energies of complex molecules: combining molecular mechanics and continuum models, Acc. Chem. Res. 33 (12) (2000) 889-897] to evaluate the effects of all possible missense mutations on dimerization of the oligomerization domain (residues 326-355) of tumor suppressor p53. The true positive and true negative rates for KCM are comparable (within 5%) to those of MM-GBSA, although MM-GBSA is much less computationally intensive when it is applied to a single energy-minimized configuration per mutant dimer. The potential advantage of KCM is that it can be used to directly examine the kinetic effects of mutations.

    View details for DOI 10.1016/j.jmgm.2008.12.006

    View details for Web of Science ID 000267455400013

    View details for PubMedID 19168381

  • The Predicted Structure of the Headpiece of the Huntingtin Protein and Its Implications on Huntingtin Aggregation JOURNAL OF MOLECULAR BIOLOGY Kelley, N. W., Huang, X., Tam, S., Spiess, C., Frydman, J., Pande, V. S. 2009; 388 (5): 919-927


    We have performed simulated tempering molecular dynamics simulations to study the thermodynamics of the headpiece of the Huntingtin (Htt) protein (N17(Htt)). With converged sampling, we found this peptide is highly helical, as previously proposed. Interestingly, this peptide is also found to adopt two different and seemingly stable states. The region from residue 4 (L) to residue 9 (K) has a strong helicity from our simulations, which is supported by experimental studies. However, contrary to what was initially proposed, we have found that simulations predict the most populated state as a two-helix bundle rather than a single straight helix, although a significant percentage of structures do still adopt a single linear helix. The fact that Htt aggregation is nucleation dependent implies the importance of a critical transition. It has been shown that N17(Htt) is involved in this rate-limiting step. In this study, we propose two possible mechanisms for this nucleating event stemming from the transition between the two-helix bundle state and the single-helix state for N17(Htt) and the experimentally observed interactions between the N17(Htt) and polyQ domains. More strikingly, an extensive hydrophobic surface area is found to be exposed to solvent in the dominant monomeric state of N17(Htt). We propose the most fundamental role played by N17(Htt) would be initializing the dimerization and pulling the polyQ chains into adequate spatial proximity for the nucleation event to proceed.

    View details for DOI 10.1016/j.jmb.2009.01.032

    View details for Web of Science ID 000266121300001

    View details for PubMedID 19361448

  • Accelerating Molecular Dynamic Simulation on Graphics Processing Units JOURNAL OF COMPUTATIONAL CHEMISTRY Friedrichs, M. S., Eastman, P., Vaidyanathan, V., Houston, M., Legrand, S., Beberg, A. L., Ensign, D. L., Bruns, C. M., Pande, V. S. 2009; 30 (6): 864-872


    We describe a complete implementation of all-atom protein molecular dynamics running entirely on a graphics processing unit (GPU), including all standard force field terms, integration, constraints, and implicit solvent. We discuss the design of our algorithms and important optimizations needed to fully take advantage of a GPU. We evaluate its performance, and show that it can be more than 700 times faster than a conventional implementation running on a single CPU core.

    View details for DOI 10.1002/jcc.21209

    View details for Web of Science ID 000264651200003

    View details for PubMedID 19191337

  • The Fip35 WW Domain Folds with Structural and Mechanistic Heterogeneity in Molecular Dynamics Simulations BIOPHYSICAL JOURNAL Ensign, D. L., Pande, V. S. 2009; 96 (8): L53-L55


    We describe molecular dynamics simulations resulting in the folding of the Fip35 Hpin1 WW domain. The simulations were run on a distributed set of graphics processors, which are capable of providing up to two orders of magnitude faster computation than conventional processors. Using the Folding@home distributed computing system, we generated thousands of independent trajectories in an implicit solvent model, totaling over 2.73 ms of simulations. A small number of these trajectories folded; the folding proceeded along several distinct routes and the system folded into two distinct three-stranded beta-sheet conformations, showing that the folding mechanism of this system is distinctly heterogeneous.

    View details for DOI 10.1016/j.bpj.2009.01.024

    View details for Web of Science ID 000266377100003

    View details for PubMedID 19383445

  • Topological methods for exploring low-density states in biomolecular folding pathways JOURNAL OF CHEMICAL PHYSICS Yao, Y., Sun, J., Huang, X., Bowman, G. R., Singh, G., Lesnick, M., Guibas, L. J., Pande, V. S., Carlsson, G. 2009; 130 (14)


    Characterization of transient intermediate or transition states is crucial for the description of biomolecular folding pathways, which is, however, difficult in both experiments and computer simulations. Such transient states are typically of low population in simulation samples. Even for simple systems such as RNA hairpins, there has recently been mounting debate over the existence of multiple intermediate states. In this paper, we develop a computational approach to explore the relatively low populated transition or intermediate states in biomolecular folding pathways, based on a topological data analysis tool, MAPPER, with simulation data from large-scale distributed computing. The method is inspired by the classical Morse theory in mathematics which characterizes the topology of high-dimensional shapes via some functional level sets. In this paper we exploit a conditional density filter which enables us to focus on the structures on pathways, followed by clustering analysis on its level sets, which helps separate low populated intermediates from high populated folded/unfolded structures. A successful application of this method is given on a motivating example, an RNA hairpin with a GCAA tetraloop, where we are able to provide structural evidence from computer simulations on the multiple intermediate states and exhibit different pictures about unfolding and refolding pathways. The method is effective in dealing with a high degree of heterogeneity in distribution, capturing structural features in multiple pathways, and being less sensitive to the distance metric than nonlinear dimensionality reduction or geometric embedding methods. The methodology described in this paper admits various implementations or extensions to incorporate more information and adapt to different settings, which thus provides a systematic tool to explore the low-density intermediate states in complex biomolecular folding systems.

    View details for DOI 10.1063/1.3103496

    View details for Web of Science ID 000265617200017

    View details for PubMedID 19368437

  • Probing the Nanosecond Dynamics of a Designed Three-Stranded Beta-Sheet with a Massively Parallel Molecular Dynamics Simulation INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES Voelz, V. A., Luttmann, E., Bowman, G. R., Pande, V. S. 2009; 10 (3): 1013-1030


    Recently a temperature-jump FTIR study of a designed three-stranded sheet showing a fast relaxation time of approximately 140 +/- 20 ns was published. We performed massively parallel molecular dynamics simulations in explicit solvent to probe the structural events involved in this relaxation. While our simulations produce similar relaxation rates, the structural ensemble is broad. We observe the formation of turn structure, but only very weak interaction in the strand regions, which is consistent with the lack of strong backbone-backbone NOEs in previous structural NMR studies. These results suggest that either (D)P(D)P-II folds at time scales longer than 240 ns, or that (D)P(D)P-II is not a well-defined three-stranded beta-sheet. This work also provides an opportunity to compare the performance of several popular forcefield models against one another.

    View details for DOI 10.3390/ijms10031013

    View details for Web of Science ID 000264565300018

    View details for PubMedID 19399235

  • Inside the chaperonin toolbox: theoretical and computational models for chaperonin mechanism PHYSICAL BIOLOGY Lucent, D., England, J., Pande, V. 2009; 6 (1)


    Despite their immense importance to cellular function, the precise mechanism by which chaperonins aid in the folding of other proteins remains unknown. Experimental evidence seems to imply that there is some diversity in how chaperonins interact with their substrates and this has led to a number of different models for chaperonin mechanism. Computational methods have the advantage of accessing temporal and spatial resolutions that are difficult for experimental techniques; therefore, these methods have been applied to this problem for some time. Here we review the relevant computational models for chaperonin function. We propose that these models need not be mutually exclusive and in fact can be thought of as a set of tools the chaperonin may use to aid in the folding of a diverse array of substrate proteins. We conclude with a discussion of the role of water in the chaperonin mechanism, a factor that until recently has been largely neglected by most computational studies of chaperonin function.

    View details for DOI 10.1088/1478-3975/6/1/015003

    View details for Web of Science ID 000266148700005

    View details for PubMedID 19208937

  • Simulated tempering yields insight into the low-resolution Rosetta scoring functions PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS Bowman, G. R., Pande, V. S. 2009; 74 (3): 777-788


    Rosetta is a structure prediction package that has been employed successfully in numerous protein design and other applications [1]. Previous reports have attributed the current limitations of the Rosetta de novo structure prediction algorithm to inadequate sampling, particularly during the low-resolution phase [2-5]. Here, we implement the Simulated Tempering (ST) sampling algorithm [6,7] in Rosetta to address this issue. ST is intended to yield canonical sampling by inducing a random walk in temperature space such that broad sampling is achieved at high temperatures and detailed exploration of local free energy minima is achieved at low temperatures. ST should therefore visit basins in accordance with their free energies rather than their energies and achieve more global sampling than the localized scheme currently implemented in Rosetta. However, we find that ST does not improve structure prediction with Rosetta. To understand why, we carried out a detailed analysis of the low-resolution scoring functions and find that they do not provide a strong bias towards the native state. In addition, we find that both ST and standard Rosetta runs started from the native state are biased away from the native state. Although the low-resolution scoring functions could be improved, we propose that working entirely at full-atom resolution is now possible and may be a better option due to superior native-state discrimination at full-atom resolution. Such an approach will require more attention to the kinetics of convergence, however, as functions capable of native state discrimination are not necessarily capable of rapidly guiding non-native conformations to the native state.

    View details for DOI 10.1002/prot.22210

    View details for Web of Science ID 000262566700018

    View details for PubMedID 18767152
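
The random walk in temperature space mentioned in this abstract is commonly driven by a Metropolis test on proposed temperature changes. The sketch below shows that textbook acceptance rule, assuming the joint distribution over configurations and temperature indices is proportional to exp(-beta_i * E + g_i), where the g_i are per-temperature weights. It is not taken from Rosetta or from the paper's implementation, and the function and parameter names are hypothetical.

```python
import math
import random


def st_accept_probability(energy, beta_old, beta_new, weight_old, weight_new):
    """Metropolis acceptance probability for a simulated tempering temperature move.

    energy      : potential energy of the current configuration
    beta_old/new: inverse temperatures before and after the proposed move
    weight_*    : per-temperature weights g_i (assumed to be known in advance)
    """
    log_ratio = -(beta_new - beta_old) * energy + (weight_new - weight_old)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


def attempt_temperature_move(energy, index, betas, weights, rng=random):
    """Propose a neighboring temperature index and accept or reject it."""
    proposal = index + rng.choice((-1, 1))
    if proposal < 0 or proposal >= len(betas):
        return index  # proposals off the temperature ladder are rejected
    p = st_accept_probability(
        energy, betas[index], betas[proposal], weights[index], weights[proposal]
    )
    return proposal if rng.random() < p else index


if __name__ == "__main__":
    betas = [1.0, 0.8, 0.6]    # hypothetical inverse-temperature ladder
    weights = [0.0, 0.0, 0.0]  # hypothetical (untuned) weights
    print(attempt_temperature_move(-5.0, 1, betas, weights))
```

In practice the weights are usually tuned so that all temperatures are visited roughly equally; with untuned weights the walk tends to get stuck at one end of the ladder.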

  • Accelerating Molecular Dynamic Simulation on the Cell Processor and PlayStation 3 JOURNAL OF COMPUTATIONAL CHEMISTRY Luttmann, E., Ensign, D. L., Vaidyanathan, V., Houston, M., Rimon, N., Oland, J., Jayachandran, G., Friedrichs, M., Pande, V. S. 2009; 30 (2): 268-274


    Implementation of molecular dynamics (MD) calculations on novel architectures will vastly increase its power to calculate the physical properties of complex systems. Herein, we detail algorithmic advances developed to accelerate MD simulations on the Cell processor, a commodity processor found in PlayStation 3 (PS3). In particular, we discuss issues regarding memory access versus computation and the types of calculations which are best suited for streaming processors such as the Cell, focusing on implicit solvation models. We conclude with a comparison of improved performance on the PS3's Cell processor over more traditional processors.

    View details for DOI 10.1002/jcc.21054

    View details for Web of Science ID 000262198400010

    View details for PubMedID 18615421

  • Assessment of the protein-structure refinement category in CASP8 PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS MacCallum, J. L., Hua, L., Schnieders, M. J., Pande, V. S., Jacobson, M. P., Dill, K. A. 2009; 77: 66-80


    Here, we summarize the assessment of protein structure refinement in CASP8. Twenty-four groups refined a total of 12 target proteins. Averaging over all groups and all proteins, there was no net improvement over the original starting models. However, there are now some individual research groups who consistently do improve protein structures relative to a starting model. We compare various measures of quality assessment, including (i) standard backbone-based methods, (ii) new methods from the Richardson group, and (iii) ensemble-based methods for comparing experimental structures, such as NMR NOE violations and the suitability of the predicted models to serve as templates for molecular replacement. On the whole, there is a general correlation among various measures. However, there are interesting differences. Sometimes a structure that is in better agreement with the experimental data is judged to be slightly worse by GDT-TS. This suggests that for comparing protein structures that are already quite close to the native, it may be preferable to use ensemble-based experimentally derived measures of quality, in addition to single-structure-based methods such as GDT-TS.

    View details for DOI 10.1002/prot.22538

    View details for Web of Science ID 000272244700007

    View details for PubMedID 19714776

  • Thalweg: A Framework For Programming 1,000 Machines With 1,000 Cores 23rd IEEE International Parallel and Distributed Processing Symposium Beberg, A. L., Pande, V. S. IEEE. 2009: 2384–2390

  • Folding@home: Lessons From Eight Years of Volunteer Distributed Computing 23rd IEEE International Parallel and Distributed Processing Symposium Beberg, A. L., Ensign, D. L., Jayachandran, G., Khaliq, S., Pande, V. S. IEEE. 2009: 1624–1631


    We present a new multiscale method that combines all-atom molecular dynamics with coarse-grained sampling, towards the aim of bridging two levels of physiology: the atomic scale of protein side chains and small molecules, and the huge scale of macromolecular complexes like the ribosome. Our approach uses all-atom simulations of peptide (or other ligand) fragments to calculate local 3D spatial potentials of mean force (PMF). The individual fragment PMFs are then used as a potential for a coarse-grained chain representation of the entire molecule. Conformational space and sequence space are sampled efficiently using generalized ensemble Monte Carlo. Here, we apply this method to the study of nascent polypeptides inside the cavity of the ribosome exit tunnel. We show how the method can be used to explore the accessible conformational and sequence space of nascent polypeptide chains near the ribosome peptidyl transfer center (PTC), with the eventual aim of understanding the basis of specificity for co-translational regulation. The method has many potential applications to predicting binding specificity and design, and is sufficiently general to allow even greater separation of scales in future work.

    View details for Web of Science ID 000263639700032

    View details for PubMedID 19209713



    Influenza hemagglutinin mediates both cell-surface binding and cell entry by the virus. Mutations to hemagglutinin are thus critical in determining host species specificity and viral infectivity. Previous approaches have primarily considered point mutations and sequence conservation; here we develop a complementary approach using mutual information to examine concerted mutations. For hemagglutinin, several overlapping selective pressures can cause such concerted mutations, including the host immune response, ligand recognition and host specificity, and functional requirements for pH-induced activation and membrane fusion. Using sequence mutual information as a metric, we extracted clusters of concerted mutation sites and analyzed them in the context of crystallographic data. Comparison of influenza isolates from two subtypes--human H3N2 strains and human and avian H5N1 strains--yielded substantial differences in spatial localization of the clustered residues. We hypothesize that the clusters on the globular head of H3N2 hemagglutinin may relate to antibody recognition (as many protective antibodies are known to bind in that region), while the clusters in common to H3N2 and H5N1 hemagglutinin may indicate shared functional roles. We propose that these shared sites may be particularly fruitful for mutagenesis studies in understanding the infectivity of this common human pathogen. The combination of sequence mutual information and structural analysis thus helps generate novel functional hypotheses that would not be apparent via either method alone.

    View details for Web of Science ID 000263639700047

    View details for PubMedID 19209725
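
The sequence mutual information metric mentioned in this abstract can be illustrated with a small sketch. It computes the mutual information, in bits, between two columns of a multiple sequence alignment from raw residue frequencies. This is a generic plug-in estimator written for illustration; the function name and the toy columns are hypothetical, and the published analysis may apply corrections (for example for finite sample size or phylogenetic bias) that are not shown here.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (bits) between two equal-length alignment columns."""
    if len(col_a) != len(col_b) or not col_a:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))

    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with raw counts.
        mi += p_ab * math.log2(p_ab * n * n / (count_a[a] * count_b[b]))
    return mi


if __name__ == "__main__":
    # Hypothetical columns: residues observed at two sites across four sequences.
    print(mutual_information("AABB", "CCDD"))  # perfectly co-varying sites
    print(mutual_information("AABB", "CDCD"))  # statistically independent sites
```

Pairs of sites with high mutual information change together across isolates, which is how clusters of concerted mutation sites could be picked out before mapping them onto a structure.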

  • Comparison of efficiency and bias of free energies computed by exponential averaging, the Bennett acceptance ratio, and thermodynamic integration (vol 122, art no 144107, 2005) JOURNAL OF CHEMICAL PHYSICS Shirts, M. R., Pande, V. S. 2008; 129 (22)

    View details for DOI 10.1063/1.3033406

    View details for Web of Science ID 000261698300051

  • Simulating oligomerization at experimental concentrations and long timescales: A Markov state model approach JOURNAL OF CHEMICAL PHYSICS Kelley, N. W., Vishal, V., Krafft, G. A., Pande, V. S. 2008; 129 (21)


    Here, we present a novel computational approach for describing the formation of oligomeric assemblies at experimental concentrations and timescales. We propose an extension to the Markovian state model approach, where one includes low concentration oligomeric states analytically. This allows simulation on long timescales (seconds timescale) and at arbitrarily low concentrations (e.g., the micromolar concentrations found in experiments), while still using an all-atom model for protein and solvent. As a proof of concept, we apply this methodology to the oligomerization of an Abeta peptide fragment (Abeta(21-43)). Abeta oligomers are now widely recognized as the primary neurotoxic structures leading to Alzheimer's disease. Our computational methods predict that Abeta trimers form at micromolar concentrations in 10 ms, while tetramers form 1000 times more slowly. Moreover, the simulation results predict specific intermonomer contacts present in the oligomer ensemble as well as putative structures for small molecular weight oligomers. Based on our simulations and statistical models, we propose a novel mutation to stabilize the trimeric form of Abeta in an experimentally verifiable manner.

    View details for DOI 10.1063/1.3010881

    View details for Web of Science ID 000261430900039

    View details for PubMedID 19063575

  • Side-chain recognition and gating in the ribosome exit tunnel PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Petrone, P. M., Snow, C. D., Lucent, D., Pande, V. S. 2008; 105 (43): 16549-16554


    The ribosome is a large complex catalyst responsible for the synthesis of new proteins, an essential function for life. New proteins emerge from the ribosome through an exit tunnel as nascent polypeptide chains. Recent findings indicate that tunnel interactions with the nascent polypeptide chain might be relevant for the regulation of translation. However, the specific ribosomal structural features that mediate this process are unknown. Performing molecular dynamics simulations, we are studying the interactions between components of the ribosome exit tunnel and different chemical probes (specifically different amino acid side chains or monovalent inorganic ions). Our free-energy maps describe the physicochemical environment of the tunnel, revealing binding crevices and free-energy barriers for single amino acids and ions. Our simulations indicate that transport out of the tunnel could be different for diverse amino acid species. In addition, our results predict a notable protein-RNA interaction between a flexible 23S rRNA tetraloop (gate) and ribosomal protein L39 (latch) that could potentially obstruct the tunnel's exit. By relating our simulation data to earlier biochemical studies, we propose that ribosomal features at the exit of the tunnel can play a role in the regulation of nascent chain exit and ion flux. Moreover, our free-energy maps may provide a context for interpreting sequence-dependent nascent chain phenomenology.

    View details for DOI 10.1073/pnas.0801795105

    View details for Web of Science ID 000260913500028

    View details for PubMedID 18946046

  • Structural basis for influence of viral glycans on ligand binding by influenza hemagglutinin BIOPHYSICAL JOURNAL Kasson, P. M., Pande, V. S. 2008; 95 (7): L48-L50


    Binding of cell surface glycans by influenza hemagglutinin controls viral attachment and infection of host cells. This binding is a three-way interaction between viral proteins, host glycans, and viral glycans; many structural details of this interaction have been difficult to resolve. Here, we use a series of 100-ns molecular dynamics simulations to further analyze available crystallographic data on hemagglutinin-ligand interactions. Based on our simulations, we predict that the viral glycans contact the host glycans within 1-2 residues of the ligand-binding site. We also predict that the glycan-glycan interactions contain both stabilizing and destabilizing components. These predictions suggest a structural means to explain why changes to viral glycosylation alter the efficiency and selectivity of ligand binding. We also predict that the proximity of these interactions to the ligand-binding pocket will impact the binding affinity of small glycomimetic ligands analogous to the influenza neuraminidase inhibitors currently in clinical use.

    View details for DOI 10.1529/biophysj.108.141507

    View details for Web of Science ID 000259393200002

    View details for PubMedID 18641068

  • Potential for modulation of the hydrophobic effect inside chaperonins BIOPHYSICAL JOURNAL England, J. L., Pande, V. S. 2008; 95 (7): 3391-3399


    Despite the spontaneity of some in vitro protein-folding reactions, native folding in vivo often requires the participation of barrel-shaped multimeric complexes known as chaperonins. Although it has long been known that chaperonin substrates fold upon sequestration inside the chaperonin barrel, the precise mechanism by which confinement within this space facilitates folding remains unknown. We examine the possibility that the chaperonin mediates a favorable reorganization of the solvent for the folding reaction. We discuss the effect of electrostatic charge on solvent-mediated hydrophobic forces in an aqueous environment. Based on these physical arguments, we construct a simple, phenomenological theory for the thermodynamics of density and hydrogen-bond order fluctuations in liquid water. Within the framework of this model, we investigate the effect of confinement inside a chaperonin-like cavity on the configurational free energy of water by calculating solvent free energies for cavities corresponding to the different conformational states in the ATP-driven catalytic cycle of the prokaryotic chaperonin GroEL. Our findings suggest that one function of chaperonins may involve trapping unfolded proteins and subsequently exposing them to a microenvironment in which the hydrophobic effect, a crucial thermodynamic driving force for folding, is enhanced.

    View details for DOI 10.1529/biophysj.108.131037

    View details for Web of Science ID 000259393200031

    View details for PubMedID 18599630

  • Critical assessment of nucleic acid electrostatics via experimental and computational investigation of an unfolded state ensemble JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Bai, Y., Chu, V. B., Lipfert, J., Pande, V. S., Herschlag, D., Doniach, S. 2008; 130 (37): 12334-12341


    Electrostatic forces, acting between helices and modulated by the presence of the ion atmosphere, are key determinants in the energetic balance that governs RNA folding. Previous studies have employed Poisson-Boltzmann (PB) theory to compute the energetic contribution of these forces in RNA folding. However, the complex interaction of these electrostatic forces with RNA features such as tertiary contact formation, specific ion-binding, and complex interhelical junctions present in prior studies precluded a rigorous evaluation of PB theory, especially in physiologically important Mg(2+) solutions. To critically assess PB theory, we developed a model system that isolates these electrostatic forces. The model system, composed of two DNA duplexes tethered by a polyethylene glycol junction, is an analog for the unfolded state of canonical helix-junction-helix motifs found in virtually all structured RNAs. This model system lacks the complicating features that have precluded a critical assessment of PB in prior studies, ensuring that interhelical electrostatic forces dominate the behavior of the system. The system's simplicity allows PB predictions to be directly compared with small-angle X-ray scattering experiments over a range of monovalent and divalent ion concentrations. These comparisons indicate that PB is a reasonable description of the underlying electrostatic energies for monovalent ions, but large deviations are observed for divalent ions. The validation of PB for monovalent solutions allows analysis of the change in the conformational ensemble of this simple motif as salt concentration is changed. Addition of ions allows the motif to sample more compact microstates, increasing its conformational entropy. The increase of conformational entropy presents an additional barrier to folding by stabilizing the unfolded state. Neglecting this effect will adversely impact the accuracy of folding analyses and models.

    View details for DOI 10.1021/ja800854u

    View details for Web of Science ID 000259139900046

    View details for PubMedID 18722445

  • A role for confined water in chaperonin function JOURNAL OF THE AMERICAN CHEMICAL SOCIETY England, J. L., Lucent, D., Pande, V. S. 2008; 130 (36): 11838-11839


    Chaperonins engulf other proteins and accelerate their folding by an unknown mechanism. Here, we combine all-atom molecular dynamics simulations with data from experimental assays of the activity of the bacterial chaperonin GroEL to demonstrate that a chaperonin's ability to facilitate folding is correlated with the affinity of its interior surface for water. Our results suggest a novel view of the behavior of confined water for models of in vivo protein folding scenarios.

    View details for DOI 10.1021/ja802248m

    View details for Web of Science ID 000258950500002

    View details for PubMedID 18710231

  • Chemical denaturants inhibit the onset of dewetting JOURNAL OF THE AMERICAN CHEMICAL SOCIETY England, J. L., Pande, V. S., Haran, G. 2008; 130 (36): 11854-11855


    The mechanism by which the aqueous cosolvents guanidinium chloride and urea denature proteins is a matter of controversy. Here, we use all-atom molecular dynamics simulations to study the effect of both denaturants on the dewetting of water confined between nanoseparated hydrophobic plates. It is found that the denaturants inhibit the onset of dewetting, so that it occurs at shorter interplate distances than in pure water. Our results support a role for urea and guanidinium in assisting in the solvation of nonpolar surfaces, thereby weakening hydrophobic effects known to be important for protein stability.

    View details for DOI 10.1021/ja803972g

    View details for Web of Science ID 000258950500010

    View details for PubMedID 18707183

  • The Simbios National Center: Systems biology in motion PROCEEDINGS OF THE IEEE Schmidt, J. P., Delp, S. L., Sherman, M. A., Taylor, C. A., Pande, V. S., Altman, R. B. 2008; 96 (8): 1266-1280
  • Molecular simulation of multistate peptide dynamics: A comparison between microsecond timescale sampling and multiple shorter trajectories JOURNAL OF COMPUTATIONAL CHEMISTRY Monticelli, L., Sorin, E. J., Tieleman, D. P., Pande, V. S., Colombo, G. 2008; 29 (11): 1740-1752


    Molecular dynamics simulations of the RN24 peptide, which includes a diverse set of structurally heterogeneous states, are carried out in explicit solvent. Two approaches are employed and compared directly under identical simulation conditions. Specifically, we examine sampling by two individual long trajectories (microsecond timescale) and many shorter (MS) uncoupled trajectories. Statistical analysis of the structural properties indicates a qualitative agreement between these approaches. Microsecond timescale sampling gives large uncertainties on most structural metrics, while the shorter timescale of MS simulations results in slight structural memory for beta-structure starting states. Additionally, MS sampling detects numerous transitions on a relatively short timescale that are not observed in microsecond sampling, while long simulations allow for detection of a few transitions on significantly longer timescales. A correlation between the complex free energy landscape and the kinetics of the equilibrium is highlighted by principal component analysis on both simulation sets. This report highlights the increased precision of the MS approach when studying the kinetics of complex conformational change, while revealing the complementary insight and qualitative agreement offered by far fewer individual simulations on significantly longer timescales.

    View details for DOI 10.1002/jcc.20935

    View details for Web of Science ID 000258358100005

    View details for PubMedID 18307167

  • Structural insight into RNA hairpin folding intermediates JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Bowman, G. R., Huang, X., Yao, Y., Sun, J., Carlsson, G., Guibas, L. J., Pande, V. S. 2008; 130 (30): 9676-?


    Hairpins are a ubiquitous secondary structure motif in RNA molecules. Despite their simple structure, there is some debate over whether they fold in a two-state or multi-state manner. We have studied the folding of a small tetraloop hairpin using a serial version of replica exchange molecular dynamics on a distributed computing environment. On the basis of these simulations, we have identified a number of intermediates that are consistent with experimental results. We also find that folding is not simply the reverse of high-temperature unfolding and suggest that this may be a general feature of biomolecular folding.

    View details for DOI 10.1021/ja8032857

    View details for Web of Science ID 000257902500027

    View details for PubMedID 18593120

  • Convergence of folding free energy landscapes via application of enhanced sampling methods in a distributed computing environment JOURNAL OF CHEMICAL PHYSICS Huang, X., Bowman, G. R., Pande, V. S. 2008; 128 (20)


    We have implemented the serial replica exchange method (SREM) and simulated tempering (ST) enhanced sampling algorithms in a global distributed computing environment. Here we examine the helix-coil transition of a 21 residue alpha-helical peptide in explicit solvent. For ST, we demonstrate the efficacy of a new method for determining initial weights allowing the system to perform a random walk in temperature space based on short trial simulations. These weights are updated throughout the production simulation by an adaptive weighting method. We give a detailed comparison of SREM, ST, as well as standard MD and find that SREM and ST give equivalent results in reasonable agreement with experimental data. In addition, we find that both enhanced sampling methods are much more efficient than standard MD simulations. The melting temperature of the Fs peptide with the AMBER99phi potential was calculated to be about 310 K, which is in reasonable agreement with the experimental value of 334 K. We also discuss other temperature dependent properties of the helix-coil transition. Although ST has certain advantages over SREM, both SREM and ST are shown to be powerful methods via distributed computing and will be applied extensively in future studies of complex biomolecular systems.

    View details for DOI 10.1063/1.2908251

    View details for Web of Science ID 000256304200050

    View details for PubMedID 18513049

  • Solvent viscosity dependence of the protein folding dynamics JOURNAL OF PHYSICAL CHEMISTRY B Rhee, Y. M., Pande, V. S. 2008; 112 (19): 6221-6227


    Solvent viscosity has been frequently adopted as an adjustable parameter in various computational studies (e.g., protein folding simulations) with implicit solvent models. A common approach is to use low viscosities to expedite simulations. While using viscosities lower than that of aqueous solution is unphysical, such treatment is based on observations that the viscosity affects the kinetics (rates) in a well-defined manner as described by Kramers' theory. Here, we investigate the effect of viscosity on the detailed dynamics (mechanism) of protein folding. On the basis of a simple mathematical model, we first show that viscosity may indeed affect the dynamics in a complex way. By applying the model to the folding of a small protein, we demonstrate that the detailed dynamics is affected rather pronouncedly especially at unphysically low viscosities, cautioning against using such viscosities. In this regard, our model may also serve as a diagnostic tool for validating low-viscosity simulations. It is also suggested that the viscosity dependence can be further exploited to gain information about the protein folding mechanism.

    View details for DOI 10.1021/jp076301d

    View details for Web of Science ID 000255649600042

    View details for PubMedID 18229911

  • Effects of long-range electrostatic forces on simulated protein folding kinetics JOURNAL OF COMPUTATIONAL CHEMISTRY Robertson, A., Luttmann, E., Pande, V. S. 2008; 29 (5): 694-700


    Molecular dynamics simulations are a useful tool for characterizing protein folding pathways. There are several methods of treating electrostatic forces in these simulations with varying degrees of physical fidelity and computational efficiency. In this article, we compare the reaction field (RF) algorithm, particle-mesh Ewald (PME), and tapered cutoffs with increasing cutoff radii to address the impact of the electrostatics method employed on the folding kinetics. We quantitatively compare different methods by a correlation of quantitative measures of protein folding kinetics. The results of these comparisons show that for protein folding kinetics, the RF algorithm can quantitatively reproduce the kinetics of the more costly PME algorithm. These results not only assist the selection of appropriate algorithms for future simulations, but also give insight on the role that long-range electrostatic forces have in protein folding.

    View details for DOI 10.1002/jcc.20828

    View details for Web of Science ID 000254063300003

    View details for PubMedID 17849394

  • Normal mode partitioning of Langevin dynamics for biomolecules JOURNAL OF CHEMICAL PHYSICS Sweet, C. R., Petrone, P., Pande, V. S., Izaguirre, J. A. 2008; 128 (14)


    We propose a novel normal mode multiple time stepping Langevin dynamics integrator called NML. The aim is to approximate the kinetics or thermodynamics of a biomolecule by a reduced model based on a normal mode decomposition of the dynamical space. Our basis set uses the eigenvectors of a mass reweighted Hessian matrix calculated with a biomolecular force field. This particular choice has the advantage of an ordering according to the eigenvalues, which have a physical meaning of being the square of the mode frequency. Low frequency eigenvalues correspond to more collective motions, whereas the highest frequency eigenvalues are the limiting factor for the stability of the integrator. In NML, the higher frequency modes are overdamped and relaxed near their energy minimum while respecting the subspace of low frequency dynamical modes. Our numerical results confirm that both sampling and rates are conserved for an implicitly solvated alanine dipeptide model, with only 30% of the modes propagated, when compared to the full model. For implicitly solvated systems, NML gives a twofold improvement in efficiency over plain Langevin dynamics for sampling a small 22 atom (alanine dipeptide) model and in excess of an order of magnitude for sampling an 882 atom (bovine pancreatic trypsin inhibitor) model, with good scaling with system size subject to the number of modes propagated. NML has been implemented in the open source software PROTOMOL.

    View details for DOI 10.1063/1.2883966

    View details for Web of Science ID 000255470300063

    View details for PubMedID 18412479

  • Rattling the cage: computational models of chaperonin-mediated protein folding CURRENT OPINION IN STRUCTURAL BIOLOGY England, J., Lucent, D., Pande, V. 2008; 18 (2): 163-169


    Chaperonins are known to maintain the stability of the proteome by facilitating the productive folding of numerous misfolded or aggregation-prone proteins and are thus essential for cell viability. Despite their established importance, the mechanism by which chaperonins facilitate protein folding remains unknown. Computer simulation techniques are now being employed to complement experimental ones in order to shed light on this mystery. Here we review previous computational models of chaperonin-mediated protein folding in the context of the two main hypotheses for chaperonin function: iterative annealing and landscape modulation. We then discuss new results pointing to the importance of solvent (a previously neglected factor) in chaperonin activity. We conclude with our views on the future role of simulation in studying chaperonin activity as well as protein folding in other biologically relevant confined contexts.

    View details for DOI 10.1016/

    View details for Web of Science ID 000255800200006

    View details for PubMedID 18291636

  • Predicting small-molecule solvation free energies: An informal blind test for computational chemistry JOURNAL OF MEDICINAL CHEMISTRY Nicholls, A., Mobley, D. L., Guthrie, J. P., Chodera, J. D., Bayly, C. I., Cooper, M. D., Pande, V. S. 2008; 51 (4): 769-779


    Experimental data on the transfer of small molecules between vacuum and water are relatively sparse. This makes it difficult to assess whether computational methods are truly predictive of this important quantity or merely good at explaining what has been seen. To explore this, a prospective test was performed of two different methods for estimating solvation free energies: an implicit solvent approach based on the Poisson-Boltzmann equation and an explicit solvent approach using alchemical free energy calculations. For a set of 17 small molecules, root mean square errors from experiment were between 1.3 and 2.6 kcal/mol, with the explicit solvent free energy approach yielding somewhat greater accuracy but at greater computational expense. Insights from outliers and suggestions for future prospective challenges of this kind are presented.

    View details for DOI 10.1021/jm070549+

    View details for Web of Science ID 000253353800009

    View details for PubMedID 18215013

  • Theory for an order-driven disruption of the liquid state in water JOURNAL OF CHEMICAL PHYSICS England, J. L., Park, S., Pande, V. S. 2008; 128 (4)


    Water is known to exhibit a number of peculiar physical properties because of the strong orientational dependence of the intermolecular hydrogen bonding interactions that dominate its liquid state. Recent full-atom simulations of water in a nanolayer between graphite plates submersed in an aqueous medium have raised the possibility of a new addition to this list of peculiarities: they show that application of a strong, uniform electric field normal to and between the plates can cause a pronounced decrease in particle density, rather than the increase expected from electrostriction theory for polarizable fluids [Vaitheeswaran et al., J. Phys. Chem. B 70, 6629 (2005)]. However, in seeming contradiction to this result, another study that simulated a range of similar systems has reported a less surprising electrostrictive increase in particle density upon application of the field [Bratko et al., J. Am. Chem. Soc. 129, 2504 (2007)]. In this work, we attempt to reconcile these conflicting simulation phenomena using a statistical mechanical lattice liquid model of water in an applied field. By solving the model using mean-field theory, we show that a field-induced transition to a markedly lower-density phase such as that observed in recent simulations is possible within a certain parameter regime, but that outside of this regime, the more conventional electrostrictive result should be obtained. Upon modifying the model to treat the case of bulk water under constant pressure in an applied field, we predict a density drop with rising field, and subsequently observe the predicted behavior in our own molecular dynamics simulations of liquid water. Our findings lead us to propose that the model considered here may be useful in a variety of contexts for describing the trade-off between orientational ordering of water molecules and their participation in the liquid phase.

    View details for DOI 10.1063/1.2823129

    View details for Web of Science ID 000252821200046

    View details for PubMedID 18247965

  • The Simbios National Center: Systems Biology in Motion. Proceedings of the IEEE. Institute of Electrical and Electronics Engineers Schmidt, J. P., Delp, S. L., Sherman, M. A., Taylor, C. A., Pande, V. S., Altman, R. B. 2008; 96 (8): 1266


    Physics-based simulation is needed to understand the function of biological structures and can be applied across a wide range of scales, from molecules to organisms. Simbios (the National Center for Physics-Based Simulation of Biological Structures) is one of seven NIH-supported National Centers for Biomedical Computation. This article provides an overview of the mission and achievements of Simbios, and describes its place within systems biology. Understanding the interactions between various parts of a biological system and integrating this information to understand how biological systems function is the goal of systems biology. Many important biological systems comprise complex structural systems whose components interact through the exchange of physical forces, and whose movement and function is dictated by those forces. In particular, systems that are made of multiple identifiable components that move relative to one another in a constrained manner are multibody systems. Simbios' focus is creating methods for their simulation. Simbios is also investigating the biomechanical forces that govern fluid flow through deformable vessels, a central problem in cardiovascular dynamics. In this application, the system is governed by the interplay of classical forces, but the motion is distributed smoothly through the materials and fluids, requiring the use of continuum methods. In addition to the research aims, Simbios is working to disseminate information, software and other resources relevant to biological systems in motion.

    View details for DOI 10.1109/JPROC.2008.925454

    View details for PubMedID 20107615

    View details for PubMedCentralID PMC2811325

  • Statistical characterization of protein ensembles IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS Rother, D., Sapiro, G., Pande, V. 2008; 5 (1): 42-55


    When accounting for structural fluctuations or measurement errors, a single rigid structure may not be sufficient to represent a protein. One approach to solve this problem is to represent the possible conformations as a discrete set of observed conformations, an ensemble. In this work, we follow a different, richer approach, and introduce a framework for estimating probability density functions in very high dimensions, and then apply it to represent ensembles of folded proteins. This proposed approach combines techniques such as kernel density estimation, maximum likelihood, cross-validation, and bootstrapping. We present the underlying theoretical and computational framework and apply it to artificial data and protein ensembles obtained from molecular dynamics simulations. We compare the results with those obtained experimentally, illustrating the potential and advantages of this representation.

    View details for DOI 10.1109/TCBB.2007.1061

    View details for Web of Science ID 000253417100004

    View details for PubMedID 18245874

  • Folding and misfolding of the collagen triple helix: Markov analysis of molecular dynamics simulations BIOPHYSICAL JOURNAL Park, S., Klein, T. E., Pande, V. S. 2007; 93 (12): 4108-4115


    Folding and misfolding of the collagen triple helix are studied through molecular dynamics simulations of two collagenlike peptides, [(POG)(10)](3) and [(POG)(4)POA(POG)(5)](3), which are models for wild-type and mutant collagen, respectively. To extract long time dynamics from short trajectories, we employ Markov state models. By analyzing thermodynamic and kinetic quantities calculated from the Markov state models, we examine folding mechanisms of the collagen triple helix and consequences of glycine mutations. We find that the C-to-N zipping of the collagen triple helix must be initiated by a nucleation event consisting of formation of three stable hydrogen bonds, and that zipping through a glycine mutation site requires a renucleation event which also consists of formation of three stable hydrogen bonds. Our results also suggest that slow kinetics, rather than free energy differences, is mainly responsible for the stability of the collagen triple helix.

    View details for DOI 10.1529/biophysj.107.108100

    View details for Web of Science ID 000251298100006

    View details for PubMedID 17766343

    View details for PubMedCentralID PMC2098736
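
    The idea of extracting long-time dynamics from short trajectories with a Markov state model, as described in the abstract above, can be sketched as follows. This is a toy illustration, not the authors' procedure: the function names and the two-state trajectories are assumptions, and a real analysis would also need state definition (clustering), lag-time selection, and validation of Markovian behavior.

```python
def estimate_transition_matrix(trajectories, n_states):
    """Row-normalized transition counts at a lag of one saved frame."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        for i, j in zip(traj[:-1], traj[1:]):
            counts[i][j] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a state has no outgoing transitions")
        matrix.append([c / total for c in row])
    return matrix


def propagate(distribution, matrix, n_steps):
    """Apply the transition matrix n_steps times to a state distribution."""
    p = list(distribution)
    n = len(p)
    for _ in range(n_steps):
        p = [sum(p[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return p


# Two short, made-up trajectories over states 0 (unfolded) and 1 (folded).
short_trajectories = [[0, 0, 1, 1], [1, 1, 0, 0]]
T = estimate_transition_matrix(short_trajectories, n_states=2)

# Start fully unfolded and look far beyond the length of any one trajectory.
long_time = propagate([1.0, 0.0], T, n_steps=50)
```

    Each trajectory here is only four frames long, yet the estimated matrix can be iterated for as many steps as desired; the populations relax toward the stationary distribution of the model.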

  • Heterogeneity even at the speed limit of folding: Large-scale molecular dynamics study of a fast-folding variant of the villin headpiece JOURNAL OF MOLECULAR BIOLOGY Ensign, D. L., Kasson, P. M., Pande, V. S. 2007; 374 (3): 806-816


    We have performed molecular dynamics simulations on a set of nine unfolded conformations of the fastest-folding protein yet discovered, a variant of the villin headpiece subdomain (HP-35 NleNle). The simulations were generated using a new distributed computing method, yielding hundreds of trajectories each on a time scale comparable to the experimental folding time, despite the large (10,000 atom) size of the simulation system. This strategy eliminates the need to assume a two-state kinetic model or to build a Markov state model. The relaxation to the folded state at 300 K from the unfolded configurations (generated by simulation at 373 K) was monitored by a method intended to reflect the experimental observable (quenching of tryptophan by histidine). We also monitored the relaxation to the native state by directly comparing structural snapshots with the native state. The rate of relaxation to the native state and the number of resolvable kinetic time scales both depend upon starting structure. Moreover, starting structures with folding rates most similar to experiment show some native-like structure in the N-terminal helix (helix 1) and the phenylalanine residues constituting the hydrophobic core, suggesting that these elements may exist in the experimentally relevant unfolded state. Our large-scale simulation data reveal kinetic complexity not resolved in the experimental data. Based on these findings, we propose additional experiments to further probe the kinetics of villin folding.

    View details for DOI 10.1016/j.jmb.2007.09.069

    View details for Web of Science ID 000251403800017

    View details for PubMedID 17950314

  • Accurate and efficient corrections for missing dispersion interactions in molecular simulations JOURNAL OF PHYSICAL CHEMISTRY B Shirts, M. R., Mobley, D. L., Chodera, J. D., Pande, V. S. 2007; 111 (45): 13052-13063


    In simulations, molecular dispersion interactions are frequently neglected beyond a cutoff of around 1 nm. In some cases, analytical corrections appropriate for isotropic systems are applied to the pressure and/or the potential energy. Here, we show that in systems containing macromolecules, either of these approaches introduces statistically significant errors in some observed properties; for example, the choice of cutoff can affect computed free energies of ligand binding to proteins by 1 to 2 kcal/mol. We review current methods for eliminating this cutoff-dependent behavior of the dispersion energy and identify some situations where they fail. We introduce two new formalisms, appropriate for binding free energy calculations, which overcome these failings, requiring minimal computational effort beyond the time required to run the original simulation. When these cutoff approximations are applied, which can be done after all simulations are completed, results are consistent across simulations run with different cutoffs. In many situations, simulations can be run with even shorter cutoffs than typically used, resulting in increased computational efficiency.

    View details for DOI 10.1021/jp0735987

    View details for Web of Science ID 000250809600017

    View details for PubMedID 17949030

  • Control of membrane fusion mechanism by lipid composition: Predictions from ensemble molecular dynamics PLOS COMPUTATIONAL BIOLOGY Kasson, P. M., Pande, V. S. 2007; 3 (11): 2228-2238


    Membrane fusion is critical to biological processes such as viral infection, endocrine hormone secretion, and neurotransmission, yet the precise mechanistic details of the fusion process remain unknown. Current experimental and computational model systems approximate the complex physiological membrane environment for fusion using one or a few protein and lipid species. Here, we report results of a computational model system for fusion in which the ratio of lipid components was systematically varied, using thousands of simulations of up to a microsecond in length to predict the effects of lipid composition on both fusion kinetics and mechanism. In our simulations, increased phosphatidylcholine content in vesicles causes increased activation energies for formation of the initial stalk-like intermediate for fusion and of hemifusion intermediates, in accordance with previous continuum-mechanics theoretical treatments. We also use our large simulation dataset to quantitatively compare the mechanism by which vesicles fuse at different lipid compositions, showing a significant difference in fusion kinetics and mechanism at different compositions simulated. As physiological membranes have different compositions in the inner and outer leaflets, we examine the effect of such asymmetry, as well as the effect of membrane curvature on fusion. These predicted effects of lipid composition on fusion mechanism both underscore the way in which experimental model system construction may affect the observed mechanism of fusion and illustrate a potential mechanism for cellular regulation of the fusion process by altering membrane composition.

    View details for DOI 10.1371/journal.pcbi.0030220

    View details for Web of Science ID 000251310000017

    View details for PubMedID 18020701

  • Persistent voids: a new structural metric for membrane fusion BIOINFORMATICS Kasson, P. M., Zomorodian, A., Park, S., Singhal, N., Guibas, L. J., Pande, V. S. 2007; 23 (14): 1753-1759


    Membrane fusion constitutes a key stage in cellular processes such as synaptic neurotransmission and infection by enveloped viruses. Current experimental assays for fusion have thus far been unable to resolve early fusion events in fine structural detail. We have previously used molecular dynamics simulations to develop mechanistic models of fusion by small lipid vesicles. Here, we introduce a novel structural measurement of vesicle topology and fusion geometry: persistent voids. Persistent voids calculations enable systematic measurement of structural changes in vesicle fusion by assessing fusion stalk widths. They also constitute a generally applicable technique for assessing lipid topological change. We use persistent voids to compute dynamic relationships between hemifusion neck widening and formation of a full fusion pore in our simulation data. We predict that a tightly coordinated process of hemifusion neck expansion and pore formation is responsible for the rapid vesicle fusion mechanism, while isolated enlargement of the hemifusion diaphragm leads to the formation of a metastable hemifused intermediate. These findings suggest that rapid fusion between small vesicles proceeds via a small hemifusion diaphragm rather than a fully expanded one. Software available upon request pending public release. Supplementary data are available on Bioinformatics online.

    View details for DOI 10.1093/bioinformatics/btm250

    View details for Web of Science ID 000249248300006

    View details for PubMedID 17488753

  • Choosing weights for simulated tempering PHYSICAL REVIEW E Park, S., Pande, V. S. 2007; 76 (1)


    Simulated tempering is a method to enhance simulations of complex systems by periodically raising and lowering the temperature. Despite its advantages, simulated tempering has been overshadowed by its parallel counterpart, replica exchange (also known as parallel tempering), due to the difficulty of weight determination in simulated tempering. Here we propose a simple and fast method to obtain near-optimal weights for simulated tempering, and demonstrate its effectiveness in a molecular dynamics simulation of Ala(10) polypeptide in explicit solvent. We believe simulated tempering now deserves another look.

    View details for DOI 10.1103/PhysRevE.76.016703

    View details for Web of Science ID 000248552600069

    View details for PubMedID 17677590
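
    To make the role of the weights concrete, the sketch below shows the standard Metropolis test for a temperature move in simulated tempering, where the joint distribution over configurations and temperatures is proportional to exp(-beta_n * E + g_n). It does not reproduce the weight-estimation method proposed in the paper; the weights g_n are assumed to be given, and the function and parameter names are hypothetical.

```python
import math
import random


def accept_temperature_move(energy, beta_old, beta_new,
                            weight_old, weight_new, rng=random):
    """Metropolis test for changing temperature at a fixed configuration.

    The move is accepted with probability
    min(1, exp(-(beta_new - beta_old) * energy + (weight_new - weight_old))).
    """
    log_ratio = -(beta_new - beta_old) * energy + (weight_new - weight_old)
    return log_ratio >= 0 or rng.random() < math.exp(log_ratio)
```

    Poorly chosen weights make `log_ratio` strongly negative for some temperatures, so the walk rarely visits them; this is the weight-determination difficulty the abstract refers to.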

  • Calculation of the distribution of eigenvalues and eigenvectors in Markovian state models for molecular dynamics JOURNAL OF CHEMICAL PHYSICS Hinrichs, N. S., Pande, V. S. 2007; 126 (24)


    Markovian state models (MSMs) are a convenient and efficient means to compactly describe the kinetics of a molecular system as well as a formalism for using many short simulations to predict long time scale behavior. Building a MSM consists of grouping the conformations into states and estimating the transition probabilities between these states. In a previous paper, we described an efficient method for calculating the uncertainty due to finite sampling in the mean first passage time between two states. In this paper, we extend the uncertainty analysis to derive similar closed-form solutions for the distributions of the eigenvalues and eigenvectors of the transition matrix, quantities that have numerous applications when using the model. We demonstrate the accuracy of the distributions on a six-state model of the terminally blocked alanine peptide. We also show how to significantly reduce the total number of simulations necessary to build a model with a given precision using these uncertainty estimates for the blocked alanine system and for a 2454-state MSM for the dynamics of the villin headpiece.

    View details for DOI 10.1063/1.2740261

    View details for Web of Science ID 000247625800007

    View details for PubMedID 17614531
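
    The two model-building steps named in the abstract above (grouping conformations into states, then estimating transition probabilities) can be illustrated with a minimal maximum-likelihood estimate from already-discretized trajectories. This is a sketch under stated assumptions: the clustering step is taken as done, and the function name and `lag` parameter are hypothetical rather than taken from the paper.

```python
def estimate_transition_matrix(trajectories, n_states, lag=1):
    """Row-normalized transition counts from discrete state trajectories.

    trajectories: list of lists of state indices in range(n_states).
    lag: number of frames between the start and end of a counted transition.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        # Pair each frame with the frame `lag` steps later.
        for start, end in zip(traj[:-lag], traj[lag:]):
            counts[start][end] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # States that were never left keep an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Two short, made-up trajectories over two states.
print(estimate_transition_matrix([[0, 0, 1, 1, 0], [1, 0, 0]], n_states=2))
```

    Each row is estimated from a finite number of counts, which is the source of the sampling uncertainty that the paper propagates to eigenvalues and eigenvectors.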

  • Protein folding under confinement: A role for solvent PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Lucent, D., Vishal, V., Pande, V. S. 2007; 104 (25): 10430-10434


    Although most experimental and theoretical studies of protein folding involve proteins in vitro, the effects of spatial confinement may complicate protein folding in vivo. In this study, we examine the folding dynamics of villin (a small fast folding protein) with explicit solvent confined to an inert nanopore. We have calculated the probability of folding before unfolding (P(fold)) under various confinement regimes. Using P(fold) correlation techniques, we observed two competing effects. Confining protein alone promotes folding by destabilizing the unfolded state. In contrast, confining both protein and solvent gives rise to a solvent-mediated effect that destabilizes the native state. When both protein and solvent are confined we see unfolding to a compact unfolded state different from the unfolded state seen in bulk. Thus, we demonstrate that the confinement of solvent has a significant impact on protein kinetics and thermodynamics. We conclude with a discussion of the implications of these results for folding in confined environments such as the chaperonin cavity in vivo.

    View details for DOI 10.1073/pnas.0608256104

    View details for Web of Science ID 000247500000025

    View details for PubMedID 17563390

  • Automatic discovery of metastable states for the construction of Markov models of macromolecular conformational dynamics JOURNAL OF CHEMICAL PHYSICS Chodera, J. D., Singhal, N., Pande, V. S., Dill, K. A., Swope, W. C. 2007; 126 (15)


    To meet the challenge of modeling the conformational dynamics of biological macromolecules over long time scales, much recent effort has been devoted to constructing stochastic kinetic models, often in the form of discrete-state Markov models, from short molecular dynamics simulations. To construct useful models that faithfully represent dynamics at the time scales of interest, it is necessary to decompose configuration space into a set of kinetically metastable states. Previous attempts to define these states have relied upon either prior knowledge of the slow degrees of freedom or on the application of conformational clustering techniques which assume that conformationally distinct clusters are also kinetically distinct. Here, we present a first version of an automatic algorithm for the discovery of kinetically metastable states that is generally applicable to solvated macromolecules. Given molecular dynamics trajectories initiated from a well-defined starting distribution, the algorithm discovers long-lived, kinetically metastable states through successive iterations of partitioning and aggregating conformation space into kinetically related regions. The authors apply this method to three peptides in explicit solvent (terminally blocked alanine, the 21-residue helical F(s) peptide, and the engineered 12-residue beta-hairpin trpzip2) to assess its ability to generate physically meaningful states and faithful kinetic models.

    View details for DOI 10.1063/1.2714538

    View details for Web of Science ID 000245870900059

    View details for PubMedID 17461665

  • Local structure formation in simulations of two small proteins JOURNAL OF STRUCTURAL BIOLOGY Jayachandran, G., Vishal, V., Garcia, A. E., Pande, V. S. 2007; 157 (3): 491-499


    Massively parallel all-atom, explicit solvent molecular dynamics simulations were used to explore the formation and existence of local structure in two small alpha-helical proteins, the villin headpiece and the helical fragment B of protein A. We report on the existence of transient helices and combinations of helices in the unfolded ensemble, and on the order of formation of helices, which appears to largely agree with previous experimental results. Transient local structure is observed even in the absence of overall native structure. We also calculate sets of residue-residue pairs that are statistically predictive of the formation of given local structures in our simulations.

    View details for DOI 10.1016/j.jsb.2006.10.001

    View details for Web of Science ID 000244800400006

    View details for PubMedID 17098444

  • Predicting structure and dynamics of loosely-ordered protein complexes: Influenza hemagglutinin fusion peptide 13th Pacific Symposium on Biocomputing (PSB) Kasson, P. M., Pande, V. S. WORLD SCIENTIFIC PUBL CO PTE LTD. 2007: 40–50


    Transient and low-affinity protein complexes pose a challenge to existing experimental methods and traditional computational techniques for structural determination. One example of such a disordered complex is that formed by trimers of influenza virus fusion peptide inserted into a host cell membrane. This fusion peptide is responsible for mediating viral infection, and spectroscopic data suggest that the peptide forms loose multimeric associations that are important for viral infectivity. We have developed an ensemble simulation technique that harnesses >1000 molecular dynamics trajectories to build a structural model for the arrangement of fusion peptide trimers. We predict a trimer structure in which the fusion peptides are packed into proximity while maintaining their monomeric structure. Our model helps to explain the effects of several mutations to the fusion peptide that destroy viral infectivity but do not measurably alter peptide monomer structure. This approach also serves as a general model for addressing the challenging problem of higher-order protein organization in cell membranes.

    View details for Web of Science ID 000245296300005

    View details for PubMedID 17992744

  • Bayesian update method for adaptive weighted sampling PHYSICAL REVIEW E Park, S., Ensign, D. L., Pande, V. S. 2006; 74 (6)


    Exploring conformational spaces is still a challenging task for simulations of complex systems. One way to enhance such a task is weighted sampling, e.g., by assigning high weights to regions that are rarely sampled. It is, however, difficult to estimate adequate weights beforehand, and therefore adaptive methods are desired. Here we present a method for adaptive weighted sampling based on Bayesian inference. Within the framework of Bayesian inference, we develop an update scheme in which the information from previous data is stored in a prior distribution which is then updated to a posterior distribution according to new data. The method proposed here is particularly well suited for distributed computing, in which one must deal with rapid influxes of large amounts of data.

    View details for DOI 10.1103/PhysRevE.74.066703

    View details for Web of Science ID 000243165900064

    View details for PubMedID 17280173

  • Simulated unfolded-state ensemble and the experimental NMR structures of villin headpiece yield similar wide-angle solution X-ray scattering profiles JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Zagrovic, B., Pande, V. S. 2006; 128 (36): 11742-11743


    With the advent of powerful synchrotron sources, solution X-ray scattering is being increasingly used to get basic information about the structure of polypeptides. The solution scattering technique essentially provides one-dimensional data, which are then interpreted in terms of a three-dimensional structure through model building. Here we calculate wide-angle solution scattering patterns for an ensemble of simulated unfolded structures of villin headpiece, which differ from the native structure by rmsd = 8.8 ± 1.0 Å and have only negligible amounts of native secondary structure. We show that the wide-angle solution scattering pattern of such an ensemble shares significant similarity with the one based on the experimental NMR structures of the molecule. Our results suggest that solution scattering in the wide-angle limit, by itself, provides very little information about the secondary structure content of a polypeptide or its side-chain packing.

    View details for DOI 10.1021/ja0640694

    View details for Web of Science ID 000240291900007

    View details for PubMedID 16953598

  • Parallelized-over-parts computation of absolute binding free energy with docking and molecular dynamics JOURNAL OF CHEMICAL PHYSICS Jayachandran, G., Shirts, M. R., Park, S., Pande, V. S. 2006; 125 (8)


    We present a technique for biomolecular free energy calculations that exploits highly parallelized sampling to significantly reduce the time to results. The technique combines free energies for multiple, nonoverlapping configurational macrostates and is naturally suited to distributed computing. We describe a methodology that uses this technique with docking, molecular dynamics, and free energy perturbation to compute absolute free energies of binding quickly compared to previous methods. The method does not require a priori knowledge of the binding pose as long as the docking technique used can generate reasonable binding modes. We demonstrate the method on the protein FKBP12 and eight of its inhibitors.

    View details for DOI 10.1063/1.2221680

    View details for Web of Science ID 000240237000061

    View details for PubMedID 16965051

  • Ensemble molecular dynamics yields submillisecond kinetics and intermediates of membrane fusion PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Kasson, P. M., Kelley, N. W., Singhal, N., Vrljic, M., Brunger, A. T., Pande, V. S. 2006; 103 (32): 11916-11921


    Lipid membrane fusion is critical to cellular transport and signaling processes such as constitutive secretion, neurotransmitter release, and infection by enveloped viruses. Here, we introduce a powerful computational methodology for simulating membrane fusion from a starting configuration designed to approximate activated prefusion assemblies from neuronal and viral fusion, producing results on a time scale and degree of mechanistic detail not previously possible to our knowledge. We use an approach to the long time scale simulation of fusion by constructing a Markovian state model with large-scale distributed computing, yielding an understanding of fusion mechanisms on time scales previously impossible to simulate to our knowledge. Our simulation data suggest a branched pathway for fusion, in which a common stalk-like intermediate can either rapidly form a fusion pore or remain in a metastable hemifused state that slowly forms fully fused vesicles. This branched reaction pathway provides a mechanistic explanation both for the biphasic fusion kinetics and the stable hemifused intermediates previously observed experimentally. Our distributed computing and Markovian state model approaches provide sufficient sampling to detect rare transitions, a systematic process for analyzing reaction pathways, and the ability to develop quantitative approximations of reaction kinetics for fusion.

    View details for DOI 10.1073/pnas.0601597103

    View details for Web of Science ID 000239701900019

    View details for PubMedID 16880392

  • Electric fields at the active site of an enzyme: Direct comparison of experiment with theory SCIENCE Suydam, I. T., Snow, C. D., Pande, V. S., Boxer, S. G. 2006; 313 (5784): 200-204


    The electric fields produced in folded proteins influence nearly every aspect of protein function. We present a vibrational spectroscopy technique that measures changes in electric field at a specific site of a protein as shifts in frequency (Stark shifts) of a calibrated nitrile vibration. A nitrile-containing inhibitor is used to deliver a unique probe vibration to the active site of human aldose reductase, and the response of the nitrile stretch frequency is measured for a series of mutations in the enzyme active site. These shifts yield quantitative information on electric fields that can be directly compared with electrostatics calculations. We show that extensive molecular dynamics simulations and ensemble averaging are required to reproduce the observed changes in field.

    View details for DOI 10.1126/science.1127159

    View details for Web of Science ID 000239008000037

    View details for PubMedID 16840693

  • Kinetic definition of protein folding transition state ensembles and reaction coordinates BIOPHYSICAL JOURNAL Snow, C. D., Rhee, Y. M., Pande, V. S. 2006; 91 (1): 14-24


    Using distributed molecular dynamics simulations we located four distinct folding transitions for a 39-residue ββαβ protein fold. To characterize the nature of each room temperature transition, we calculated the probability of transmission for 500 points along each free energy barrier. We introduced a method for determining transition states by employing the transmission probability, Ptrans, and determined which conformations were transition state ensemble members (Ptrans ≈ 0.5). The transmission probability may be used to characterize the barrier in several ways. For example, we ran simulations at 82 °C, determined the change in Ptrans with temperature for all 2,000 conformations, and quantified Hammond behavior directly using Ptrans correlation. Additionally, we propose that diffusion along Ptrans may provide the configurational diffusion rate at the top of the barrier. Specifically, given a transition state conformation x0 with estimated Ptrans = 0.5, we selected a large set of subsequent conformations from independent trajectories, each exactly a small time Δt after x0 (250 ps). Calculating Ptrans for the new trial conformations, we generated the P(Ptrans | Δt = 250 ps) distribution that reflected diffusion. This approach provides a novel perspective on the diffusive nature of a protein folding transition and provides a framework for a quantitative study of activated relaxation kinetics.

    View details for DOI 10.1529/biophysj.105.075689

    View details for Web of Science ID 000238288200005

    View details for PubMedID 16617068

  • Nanotube confinement denatures protein helices JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Sorin, E. J., Pande, V. S. 2006; 128 (19): 6316-6317


    In striking contrast to simple polymer physics theory, which does not account for solvent effects, we find that physical confinement of solvated biopolymers decreases solvent entropy, which in turn leads to a reduction in the organized structural content of the polymer. Since our theory is based on a fundamental property of water-protein statistical mechanics, we expect it to have broad implications in many biological and material science contexts.

    View details for DOI 10.1021/ja060917j

    View details for Web of Science ID 000237590400024

    View details for PubMedID 16683786

  • Using massively parallel simulation and Markovian models to study protein folding: Examining the dynamics of the villin headpiece JOURNAL OF CHEMICAL PHYSICS Jayachandran, G., Vishal, V., Pande, V. S. 2006; 124 (16)


    We report on the use of large-scale distributed computing simulation and novel analysis techniques for examining the dynamics of a small protein. Matters addressed include folding rate, very long time scale kinetics, ensemble properties, and interaction with water. The target system for the study, the villin headpiece, has been of great interest to experimentalists and theorists both. Sampling totaled nearly 500 μs (the most extensive published to date for a system of villin's size in explicit solvent with all-atom detail) and was in the form of tens of thousands of independent molecular dynamics trajectories, each several tens of nanoseconds in length. We report on kinetics sensitivity analyses that, using a set of short simulations, probed the role of water in villin's folding and sensitivity to the simulation's electrostatics treatment. By constructing Markovian state models (MSMs) from the collected data, we were able to propagate dynamics to times far beyond those directly simulated and to rapidly compute mean first passage times, long time kinetics (tens of microseconds), and evolution of ensemble property distributions over long times, otherwise currently impossible. We also tested our MSM by using it to predict the structure of villin de novo.

    View details for DOI 10.1063/1.2186317

    View details for Web of Science ID 000237136700045

    View details for PubMedID 16674165

  • Kinetic computational alanine scanning: Application to p53 oligomerization JOURNAL OF MOLECULAR BIOLOGY Chong, L. T., Swope, W. C., Pitera, J. W., Pande, V. S. 2006; 357 (3): 1039-1049


    We have developed a novel computational alanine scanning approach that involves analysis of ensemble unfolding kinetics at high temperature to identify residues that are critical for the stability of a given protein. This approach has been applied to dimerization of the oligomerization domain (residues 326-355) of tumor suppressor p53. As validated by experimental results, our approach has reasonable success in identifying deleterious mutations, including mutations that have been linked to cancer. We discuss a method for determining the effect of mutations on the location of the dimerization transition state.

    View details for DOI 10.1016/j.jmb.2005.12.083

    View details for Web of Science ID 000236629300030

    View details for PubMedID 16457841

  • On the role of chemical detail in simulating protein folding kinetics CHEMICAL PHYSICS Rhee, Y. M., Pande, V. S. 2006; 323 (1): 66-77
  • Can conformational change be described by only a few normal modes? BIOPHYSICAL JOURNAL Petrone, P., Pande, V. S. 2006; 90 (5): 1583-1593


    We suggest a simple method to assess how many normal modes are needed to map a conformational change. By projecting the conformational change onto a subspace of the normal-mode vectors and using root mean square deviation as a test of accuracy, we find that the first 20 modes only contribute 50% or less of the total conformational change in four test cases (myosin, calmodulin, NtrC, and hemoglobin). In some allosteric systems, like the molecular switch NtrC, the conformational change is localized to a limited number of residues. We find that many more modes are necessary to accurately map this collective displacement. In addition, the normal-mode "spectra" can provide useful information about the details of the conformational change, especially when comparing structures with different bound ligands, in this case, calmodulin. Indeed, this approach presents normal-mode analysis as a useful basis in which to capture the mechanism of conformational change, and shows that the number of normal modes needed to capture the essential collective motions of atoms should be chosen according to the required accuracy.

    View details for DOI 10.1529/biophysj.105.070045

    View details for Web of Science ID 000235235600012

    View details for PubMedID 16361336
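
    The projection described in the abstract above can be sketched as a cumulative overlap: the fraction of the squared displacement between two structures that is captured by the first k normal modes. The code assumes the modes are already orthonormal and ordered by increasing frequency, and that the two structures have been superimposed; the function name and input layout are illustrative assumptions, not the authors' implementation.

```python
def cumulative_overlap(displacement, modes):
    """Fraction of |displacement|^2 captured by the first k modes, for each k.

    displacement: flat list of 3N coordinate differences between two structures.
    modes: list of orthonormal mode vectors, each a flat list of length 3N.
    """
    norm_sq = sum(x * x for x in displacement)
    captured = 0.0
    fractions = []
    for mode in modes:
        # Projection of the conformational change onto this mode.
        projection = sum(d * m for d, m in zip(displacement, mode))
        captured += projection * projection
        fractions.append(captured / norm_sq)
    return fractions


# Toy example in a 3-dimensional space with the Cartesian axes as "modes".
print(cumulative_overlap([3.0, 4.0, 0.0],
                         [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]))
```

    The residual root mean square deviation after keeping k modes follows from the uncaptured fraction, which gives one way to choose the number of modes according to the required accuracy.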

  • The solvation interface is a determining factor in peptide conformational preferences JOURNAL OF MOLECULAR BIOLOGY Sorin, E. J., Rhee, Y. M., Shirts, M. R., Pande, V. S. 2006; 356 (1): 248-256


    The 21 residue polyalanine-based F(s) peptide was studied using thousands of long, explicit solvent, atomistic molecular dynamics simulations that reached equilibrium at the ensemble level. Peptide conformational preference as a function of hydrophobicity was examined using a spectrum of explicit solvent models, and the peptide length-dependence of the hydrophilic and hydrophobic components of solvent-accessible surface area for several ideal conformational types was considered. Our results demonstrate how the character of the solvation interface induces several conformational preferences, including a decrease in mean helical content with increased hydrophilicity, which occurs predominantly through reduced nucleation tendency and, to a lesser extent, destabilization of helical propagation. Interestingly, an opposing effect occurs through increased propensity for 3(10)-helix conformations, as well as increased polyproline structure. Our observations provide a framework for understanding previous reports of conformational preferences in polyalanine-based peptides including (i) terminal 3(10)-helix prominence, (ii) low pi-helix propensity, (iii) increased polyproline conformations in short and unfolded peptides, and (iv) membrane helix stability in the presence and absence of water. These observations provide physical insight into the role of water in peptide conformational equilibria at the atomic level, and expand our view of the complexity of even the most "simple" of biopolymers. Whereas previous studies have focused predominantly on hydrophobic effects with respect to tertiary structure, this work highlights the need for consideration of such effects at the secondary structural level.

    View details for DOI 10.1016/j.jmb.2005.11.058

    View details for Web of Science ID 000234938600020

    View details for PubMedID 16364361

  • Validation of Markov state models using Shannon's entropy JOURNAL OF CHEMICAL PHYSICS Park, S., Pande, V. S. 2006; 124 (5)


    Markov state models are kinetic models built from the dynamics of molecular simulation trajectories by grouping similar configurations into states and examining the transition probabilities between states. Here we present a procedure for validating the underlying Markov assumption in Markov state models based on information theory using Shannon's entropy. This entropy method is applied to a simple system and is compared with the previous eigenvalue method. The entropy method also provides a way to identify states that are least Markovian, which can then be divided into finer states to improve the model.

    View details for DOI 10.1063/1.2166393

    View details for Web of Science ID 000235171100018

    View details for PubMedID 16468862

  • A new set of molecular mechanics parameters for hydroxyproline and its use in molecular dynamics simulations of collagen-like peptides JOURNAL OF COMPUTATIONAL CHEMISTRY Park, S., Radmer, R. J., Klein, T. E., Pande, V. S. 2005; 26 (15): 1612-1616


    Recently, the importance of proline ring pucker conformations in collagen has been suggested in the context of hydroxylation of prolines. The previous molecular mechanics parameters for hydroxyproline, however, do not reproduce the correct pucker preference. We have developed a new set of parameters that reproduces the correct pucker preference. Our molecular dynamics simulations of proline and hydroxyproline monomers as well as collagen-like peptides, using the new parameters, support the theory that the role of hydroxylation in collagen is to stabilize the triple helix by adjusting to the right pucker conformation (and thus the right phi angle) in the Y position.

    View details for DOI 10.1002/jcc.20301

    View details for Web of Science ID 000232570300006

    View details for PubMedID 16170799

  • Error analysis and efficient sampling in Markovian state models for molecular dynamics JOURNAL OF CHEMICAL PHYSICS Singhal, N., Pande, V. S. 2005; 123 (20)


    In previous work, we described a Markovian state model (MSM) for analyzing molecular-dynamics trajectories, which involved grouping conformations into states and estimating the transition probabilities between states. In this paper, we analyze the errors in this model caused by finite sampling. We give different methods with various approximations to determine the precision of the reported mean first passage times. These approximations are validated on an 87 state toy Markovian system. In addition, we propose an efficient and practical sampling algorithm that uses these error calculations to build a MSM that has the same precision in mean first passage time values but requires an order of magnitude fewer samples. We also show how these methods can be scaled to large systems using sparse matrix methods.

    View details for DOI 10.1063/1.2116947

    View details for Web of Science ID 000233661000088

    View details for PubMedID 16351319
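
    For context on the quantity whose precision the paper analyzes, the sketch below computes mean first passage times to a target state from a transition matrix by solving m_i = 1 + sum over j != target of T_ij * m_j. It shows only the point estimate, not the error analysis or the adaptive sampling algorithm from the paper. The function name and the `lag_time` parameter are assumptions, and the target is assumed reachable from every state.

```python
def mean_first_passage_times(T, target, lag_time=1.0):
    """Mean first passage time from every state to `target`.

    T: row-stochastic transition matrix as a list of lists.
    Solves (I - T_sub) m = 1 by Gauss-Jordan elimination, where T_sub is T
    with the target row and column removed.
    """
    n = len(T)
    idx = [i for i in range(n) if i != target]
    size = len(idx)
    # Augmented matrix [I - T_sub | 1].
    A = [[(1.0 if i == j else 0.0) - T[i][j] for j in idx] + [1.0] for i in idx]
    for col in range(size):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, size), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(size):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    times = [0.0] * n  # The passage time from the target to itself is zero.
    for k, i in enumerate(idx):
        times[i] = lag_time * A[k][size] / A[k][k]
    return times
```

    For large state spaces a dense elimination like this becomes impractical, which is where the sparse matrix methods mentioned in the abstract come in.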

  • Building Internet distributed computing DR DOBBS JOURNAL Peck, C., Hursey, J., McCoy, J., Pande, V. 2005; 30 (11): 39-41
  • How large is an alpha-helix? Studies of the radii of gyration of helical peptides by small-angle X-ray scattering and molecular dynamics JOURNAL OF MOLECULAR BIOLOGY Zagrovic, B., Jayachandran, G., Millett, I. S., Doniach, S., Pande, V. S. 2005; 353 (2): 232-241


    Using synchrotron radiation and the small-angle X-ray scattering technique we have measured the radii of gyration of a series of alanine-based alpha-helix-forming peptides of the composition Ace-(AAKAA)n-GY-NH2, n = 2-7, in aqueous solvent at 10 ± 1 °C. In contrast to other techniques typically used to study alpha-helices in isolation (such as nuclear magnetic resonance and circular dichroism), small-angle X-ray scattering reports on the global structure of a molecule and, as such, provides complementary information to these other, more sequence-local measuring techniques. The radii of gyration that we measure are, except for the 12-mer, lower than the radii of gyration of ideal alpha-helices or helices with frayed ends of the equivalent sequence-length. For example, the measured radius of gyration of the 37-mer is 14.2 ± 0.6 Å, which is to be compared with the radius of gyration of an ideal 37-mer alpha-helix of 17.6 Å. Attempts are made to analyze the origin of this discrepancy in terms of the analytical Zimm-Bragg-Nagai (ZBN) theory, as well as distributed computing explicit solvent molecular dynamics simulations using two variants of the AMBER force-field. The ZBN theory, which treats helices as cylinders connected by random walk segments, predicts markedly larger radii of gyration than those measured. This is true even when the persistence length of the random walk parts is taken to be extremely short (about one residue). Similarly, the molecular dynamics simulations, at the level of sampling available to us, give inaccurate values of the radii of gyration of the molecules (by overestimating them by around 25% for longer peptides) and/or their helical content. We conclude that even at the short sequences examined here (≤37 amino acid residues), these alpha-helical peptides behave as fluctuating semi-broken rods rather than straight cylinders with frayed ends.

    View details for DOI 10.1016/j.jmb.2005.08.053

    View details for Web of Science ID 000232505600003

    View details for PubMedID 16171817
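
    As a rough illustration of the quantity compared in the abstract above, the sketch below computes an unweighted radius of gyration and applies it to an idealized alpha-carbon trace. The helix geometry (1.5 Å rise, 2.3 Å radius, 100° per residue) is a common textbook approximation and an assumption here. Because the trace omits side chains, caps, and backbone atoms, its value is not expected to match the all-atom ideal-helix figure quoted in the abstract.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) points."""
    n = len(coords)
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    mean_sq = sum(
        sum((c[k] - center[k]) ** 2 for k in range(3)) for c in coords
    ) / n
    return math.sqrt(mean_sq)


def ideal_helix_ca_trace(n_residues, rise=1.5, radius=2.3, turn_deg=100.0):
    """Alpha-carbon positions (in angstroms) of an idealized alpha-helix."""
    coords = []
    for i in range(n_residues):
        angle = math.radians(turn_deg * i)
        coords.append((radius * math.cos(angle), radius * math.sin(angle), rise * i))
    return coords


# Lengths 12 to 37 correspond to n = 2 to 7 in Ace-(AAKAA)n-GY-NH2.
for length in (12, 17, 22, 27, 32, 37):
    print(length, radius_of_gyration(ideal_helix_ca_trace(length)))
```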

  • Foldamer dynamics expressed via Markov state models. I. Explicit solvent molecular-dynamics simulations in acetonitrile, chloroform, methanol, and water JOURNAL OF CHEMICAL PHYSICS Elmer, S. P., Park, S., Pande, V. S. 2005; 123 (11)


    In this article, we analyze the folding dynamics of an all-atom model of a polyphenylacetylene (pPA) 12-mer in explicit solvent for four common organic and aqueous solvents: acetonitrile, chloroform, methanol, and water. The solvent quality has a dramatic effect on the time scales in which pPA 12-mers fold. Acetonitrile was found to manifest ideal folding conditions as suggested by optimal folding times on the order of 100-200 ns, depending on temperature. In contrast, chloroform and water were observed to hinder the folding of the pPA 12-mer due to extreme solvation conditions relative to acetonitrile; chloroform denatures the oligomer, whereas water promotes aggregation and traps. The pPA 12-mer in a pure methanol solution folded in approximately 400 ns at 300 K, compared with the experimental 12-mer folding time of approximately 160 ns measured in a 1:1 v/v THF/methanol solution. Requisite in drawing the aforementioned conclusions, analysis techniques based on Markov state models are applied to multiple short independent trajectories to extrapolate the long-time scale dynamics of the 12-mer in each respective solvent. We review the theory of Markov chains and derive a method to impose detailed balance on a transition-probability matrix computed from simulation data.

    View details for DOI 10.1063/1.2001648

    View details for Web of Science ID 000232033800051

    View details for PubMedID 16392592
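
    The abstract above mentions imposing detailed balance on a transition-probability matrix estimated from simulation data. The paper's own derivation is not reproduced on this page, so the following is only a minimal sketch of one simple, commonly used alternative: symmetrize the transition-count matrix and then row-normalize it. The function name and the count values are hypothetical and chosen purely for illustration.

```python
def symmetrized_transition_matrix(counts):
    """Return (T, pi) built from the symmetrized count matrix C + C^T.

    T is row-stochastic and satisfies pi[i] * T[i][j] == pi[j] * T[j][i].
    Note: this is a simple stand-in, not the method derived in the paper.
    """
    n = len(counts)
    # Count each observed i -> j transition as also supporting j -> i.
    sym = [[counts[i][j] + counts[j][i] for j in range(n)] for i in range(n)]
    row_sums = [sum(row) for row in sym]
    if any(s == 0 for s in row_sums):
        raise ValueError("every state needs at least one observed transition")
    # Row-normalize to obtain transition probabilities.
    T = [[sym[i][j] / row_sums[i] for j in range(n)] for i in range(n)]
    # The stationary distribution is proportional to the symmetrized row sums.
    total = sum(row_sums)
    pi = [s / total for s in row_sums]
    return T, pi


# Hypothetical transition counts between three conformational states.
counts = [
    [90, 8, 2],
    [4, 80, 16],
    [1, 10, 89],
]
T, pi = symmetrized_transition_matrix(counts)
```

    Symmetrizing is easy to apply but biases the estimate when the trajectories are far from equilibrium, which is one reason more careful estimators exist.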

  • Foldamer dynamics expressed via Markov state models. II. State space decomposition JOURNAL OF CHEMICAL PHYSICS Elmer, S. P., Park, S., Pande, V. S. 2005; 123 (11)


    The structural landscape of poly-phenylacetylene (pPA), otherwise known as m-phenylene ethynylene oligomers, has been shown to consist of a very diverse set of conformations, including helices, turns, and knots. Defining a state space decomposition to classify these conformations into easily identifiable states is an important step in understanding the dynamics in relation to Markov state models. We define the state decomposition of pPA oligomers in terms of the sequence of discretized dihedral angles between adjacent phenyl rings along the oligomer backbone. Furthermore, we derive in mathematical detail an approach to further reduce the number of states by grouping symmetrically equivalent states into a single parent state. A more challenging problem requires a formal definition for knotted states in the structural landscape. Assuming that the oligomer chain can only cross the ideal helix path once, we propose a technique to define a knotted state derived from a helical state determined by the position along the helical nucleus where the chain crosses the ideal helix path. Several examples of helical states and knotted states from the pPA 12-mer illustrate the principles outlined in this article.

    View details for DOI 10.1063/1.2008230

    View details for Web of Science ID 000232033800052

    View details for PubMedID 16392593

  • Direct calculation of the binding free energies of FKBP ligands JOURNAL OF CHEMICAL PHYSICS Fujitani, H., Tanida, Y., Ito, M., Jayachandran, G., Snow, C. D., Shirts, M. R., Sorin, E. J., Pande, V. S. 2005; 123 (8)


    Direct calculations of the absolute free energies of binding for eight ligands to FKBP protein were performed using the Fujitsu BioServer massively parallel computer. Using the latest version of the general assisted model building with energy refinement (AMBER) force field for ligand model parameters and the Bennett acceptance ratio for computing free-energy differences, we obtained an excellent linear fit between the calculated and experimental binding free energies. The rms error from a linear fit is 0.4 kcal/mol for eight ligand complexes. In comparison with a previous study of the binding energies of these same eight ligand complexes, these results suggest that the use of improved model parameters can lead to more predictive binding estimates, and that these estimates can be obtained with significantly less computer time than previously thought. These findings make such direct methods more attractive for use in rational drug design.

    View details for DOI 10.1063/1.1999637

    View details for Web of Science ID 000231598600009

    View details for PubMedID 16164283

  • Unusual compactness of a polyproline type II structure PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Zagrovic, B., Lipfert, J., Sorin, E. J., Millett, I. S., van Gunsteren, W. F., Doniach, S., Pande, V. S. 2005; 102 (33): 11698-11703


    Polyproline type II (PPII) helix has emerged recently as the dominant paradigm for describing the conformation of unfolded polypeptides. However, most experimental observables used to characterize unfolded proteins typically provide only short-range, sequence-local structural information that is both time- and ensemble-averaged, giving limited detail about the long-range structure of the chain. Here, we report a study of a long-range property: the radius of gyration of an alanine-based peptide, Ace-(diaminobutyric acid)2-(Ala)7-(ornithine)2-NH2. This molecule has previously been studied as a model for the unfolded state of proteins under folding conditions and is believed to adopt a PPII fold based on short-range techniques such as NMR and CD. By using synchrotron radiation and small-angle x-ray scattering, we have determined the radius of gyration of this peptide to be 7.4 +/- 0.5 angstroms, which is significantly less than the value expected from an ideal PPII helix in solution (13.1 angstroms). To further study this contradiction, we have used molecular dynamics simulations using six variants of the AMBER force field and the GROMOS 53A6 force field. However, in all cases, the simulated ensembles underestimate the PPII content while overestimating the experimental radius of gyration. The conformational model that we propose, based on our small angle x-ray scattering results and what is known about this molecule from before, is that of a very flexible, fluctuating structure that on the level of individual residues explores a wide basin around the ideal PPII geometry but is never, or only rarely, in the ideal extended PPII helical conformation.

    View details for DOI 10.1073/pnas.0409693102

    View details for Web of Science ID 000231317000025

    View details for PubMedID 16085707

  • Empirical force-field assessment: The interplay between backbone torsions and noncovalent term scaling JOURNAL OF COMPUTATIONAL CHEMISTRY Sorin, E. J., Pande, V. S. 2005; 26 (7): 682-690


    The kinetic and thermodynamic aspects of the helix-coil transition in polyalanine-based peptides have been studied at the ensemble level using a distributed computing network. This study builds on a previous report, which critically assessed the performance of several contemporary force fields in reproducing experimental measurements and elucidated the complex nature of helix-coil systems. Here we consider the effects of modifying backbone torsions and the scaling of noncovalent interactions. Although these elements determine the potential of mean force between atoms separated by three covalent bonds (and thus largely determine the local conformational distributions observed in simulation), we demonstrate that the interplay between these factors is both complex and force field dependent. We quantitatively assess the heliophilicity of several helix-stabilizing potentials as well as the changes in heliophilicity resulting from such modifications, which can "make or break" the accuracy of a given force field, and our findings suggest that future force field development may need to better consider effects that vary with peptide length. This report also serves as an example of the utility of distributed computing in analyzing and improving upon contemporary force fields at the level of absolute ensemble equilibrium, the next step in force field development.

    View details for DOI 10.1002/jcc.020208

    View details for Web of Science ID 000228372800004

    View details for PubMedID 15754305

  • One-dimensional reaction coordinate and the corresponding potential of mean force from commitment probability distribution JOURNAL OF PHYSICAL CHEMISTRY B Rhee, Y. M., Pande, V. S. 2005; 109 (14): 6780-6786


    In general, finding a one-dimensional representation of the kinetics of a high-dimensional system is a great simplification for the study of complex systems. Here, we propose a method to obtain a reaction coordinate whose potential of the mean force can reproduce the commitment probability distribution from the multidimensional surface. We prove that such a relevant one-dimensional representation can be readily calculated from the equilibrium distribution of commitment probabilities, which can be obtained with simulations. Also, it is shown that this representation is complementary to a previously proposed one-dimensional representation based on a quadratic approximation of the potential energy surface. The usefulness of the method is examined with dynamics in a two-dimensional system, showing that the one-dimensional surface thus obtained can predict the existence of an intermediate and the occurrence of path switching without a priori knowledge of the morphology of the original surface. The applicability of the method to more complex and realistic reactions such as protein folding is also discussed.

    View details for DOI 10.1021/jp045544s

    View details for Web of Science ID 000228231200041

    View details for PubMedID 16851763

  • Comparison of efficiency and bias of free energies computed by exponential averaging, the Bennett acceptance ratio, and thermodynamic integration JOURNAL OF CHEMICAL PHYSICS Shirts, M. R., Pande, V. S. 2005; 122 (14)


    Recent work has demonstrated the Bennett acceptance ratio method is the best asymptotically unbiased method for determining the equilibrium free energy between two end states given work distributions collected from either equilibrium or nonequilibrium data. However, it is still not clear what the practical advantage of this acceptance ratio method is over other common methods in atomistic simulations. In this study, we first review theoretical estimates of the bias and variance of exponential averaging (EXP), thermodynamic integration (TI), and the Bennett acceptance ratio (BAR). In the process, we present a new simple scheme for computing the variance and bias of many estimators, and demonstrate the connections between BAR and the weighted histogram analysis method. Next, a series of analytically solvable toy problems is examined to shed more light on the relative performance in terms of the bias and efficiency of these three methods. Interestingly, it is impossible to conclusively identify a "best" method for calculating the free energy, as each of the three methods performs more efficiently than the others in at least one situation examined in these toy problems. Finally, sample problems of the insertion/deletion of both a Lennard-Jones particle and a much larger molecule in TIP3P water are examined by these three methods. In all tests of atomistic systems, free energies obtained with BAR have significantly lower bias and smaller variance than when using EXP or TI, especially when the overlap in phase space between end states is small. For example, BAR can extract as much information from multiple fast, far-from-equilibrium simulations as from fewer simulations near equilibrium, which EXP cannot. Although TI and sometimes even EXP can be somewhat more efficient in idealized toy problems, in the realistic atomistic situations tested in this paper, BAR is significantly more efficient than all other methods.

    View details for DOI 10.1063/1.1873592

    View details for Web of Science ID 000228559000010

    View details for PubMedID 15847516
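
    To make the comparison between EXP and BAR in the abstract above more concrete, here is a minimal sketch of both estimators applied to synthetic work values. It is not the authors' code. The Gaussian work distributions, the sample size, the "true" free energy of 2 kT, and all function names are assumptions made for illustration. Work is expressed in units of kT, so beta = 1.

```python
import math
import random


def fermi(x):
    """Numerically stable evaluation of 1 / (1 + exp(x))."""
    if x > 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def exp_estimate(w_forward, beta=1.0):
    """Exponential averaging (EXP): dF = -ln<exp(-beta W)> / beta."""
    w_min = min(w_forward)  # shift by the minimum to avoid overflow
    mean = sum(math.exp(-beta * (w - w_min)) for w in w_forward) / len(w_forward)
    return w_min - math.log(mean) / beta


def bar_estimate(w_forward, w_reverse, beta=1.0, tol=1e-10):
    """Bennett acceptance ratio (BAR), solved for dF by bisection."""
    m = math.log(len(w_forward) / len(w_reverse))

    def imbalance(df):
        lhs = sum(fermi(m + beta * (w - df)) for w in w_forward)
        rhs = sum(fermi(-m + beta * (w + df)) for w in w_reverse)
        return lhs - rhs  # increases monotonically with df

    # Bracket the root generously on both sides.
    lo = min(min(w_forward), -max(w_reverse)) - 50.0 / beta
    hi = max(max(w_forward), -min(w_reverse)) + 50.0 / beta
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Synthetic Gaussian work values consistent with a true dF of 2 kT (assumed).
random.seed(0)
true_df, sigma, n = 2.0, 1.0, 5000
w_f = [random.gauss(true_df + 0.5 * sigma**2, sigma) for _ in range(n)]
w_r = [random.gauss(-true_df + 0.5 * sigma**2, sigma) for _ in range(n)]
print("EXP:", exp_estimate(w_f), "BAR:", bar_estimate(w_f, w_r))
```

    EXP uses only the forward work values, whereas BAR combines forward and reverse data. Increasing sigma in this toy setup reduces the overlap between the two work distributions, which is the regime where the abstract reports the largest advantage for BAR.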

  • Exploring the helix-coil transition via all-atom equilibrium ensemble simulations BIOPHYSICAL JOURNAL Sorin, E. J., Pande, V. S. 2005; 88 (4): 2472-2493


    The ensemble folding of two 21-residue alpha-helical peptides has been studied using all-atom simulations under several variants of the AMBER potential in explicit solvent using a global distributed computing network. Our extensive sampling, orders of magnitude greater than the experimental folding time, results in complete convergence to ensemble equilibrium. This allows for a quantitative assessment of these potentials, including a new variant of the AMBER-99 force field, denoted AMBER-99 phi, which shows improved agreement with experimental kinetic and thermodynamic measurements. From bulk analysis of the simulated AMBER-99 phi equilibrium, we find that the folding landscape is pseudo-two-state, with complexity arising from the broad, shallow character of the "native" and "unfolded" regions of the phase space. Each of these macrostates allows for configurational diffusion among a diverse ensemble of conformational microstates with greatly varying helical content and molecular size. Indeed, the observed structural dynamics are better represented as a conformational diffusion than as a simple exponential process, and equilibrium transition rates spanning several orders of magnitude are reported. After multiple nucleation steps, on average, helix formation proceeds via a kinetic "alignment" phase in which two or more short, low-entropy helical segments form a more ideal, single-helix structure.

    View details for Web of Science ID 000227986300010

    View details for PubMedID 15665128

  • Solvation free energies of amino acid side chain analogs for common molecular mechanics water models JOURNAL OF CHEMICAL PHYSICS Shirts, M. R., Pande, V. S. 2005; 122 (13)


    Quantitative free energy computation involves both using a model that is sufficiently faithful to the experimental system under study (accuracy) and establishing statistically meaningful measures of the uncertainties resulting from finite sampling (precision). In order to examine the accuracy of a range of common water models used for protein simulation for their solute/solvent properties, we calculate the free energy of hydration of 15 amino acid side chain analogs derived from the OPLS-AA parameter set with the TIP3P, TIP4P, SPC, SPC/E, TIP3P-MOD, and TIP4P-Ew water models. We achieve a high degree of statistical precision in our simulations, obtaining uncertainties for the free energy of hydration of 0.02-0.06 kcal/mol, equivalent to that obtained in experimental hydration free energy measurements of the same molecules. We find that TIP3P-MOD, a model designed to give improved free energy of hydration for methane, gives uniformly the closest match to experiment; we also find that the ability to accurately model pure water properties does not necessarily predict ability to predict solute/solvent behavior. We also evaluate the free energies of a number of novel modifications of TIP3P designed as a proof of concept that it is possible to obtain much better solute/solvent free energetic behavior without substantially negatively affecting pure water properties. We decrease the average error to zero while reducing the root mean square error below that of any of the published water models, with measured liquid water properties remaining almost constant with respect to our perturbations. This demonstrates there is still both room for improvement within current fixed-charge biomolecular force fields and significant parameter flexibility to make these improvements. 
    Recent research in computational efficiency of free energy methods allows us to perform simulations on a local cluster that previously required large scale distributed computing, performing four times as much computational work in approximately a tenth of the computer time as a similar study a year ago.

    View details for DOI 10.1063/1.1877132

    View details for Web of Science ID 000228390100036

    View details for PubMedID 15847482

  • Does water play a structural role in the folding of small nucleic acids? BIOPHYSICAL JOURNAL Sorin, E. J., Rhee, Y. M., Pande, V. S. 2005; 88 (4): 2516-2524


    Nucleic acid structure and dynamics are known to be closely coupled to local environmental conditions and, in particular, to the ionic character of the solvent. Here we consider what role the discrete properties of water and ions play in the collapse and folding of small nucleic acids. We study the folding of an experimentally well-characterized RNA hairpin-loop motif (sequence 5'-GGGC[GCAA]GCCU-3') via ensemble molecular dynamics simulation and, with nearly 500 μs of aggregate simulation time using an explicit representation of the ionic solvent, report successful ensemble folding simulations with a predicted folding time of 8.8(+/-2.0) μs, in agreement with experimental measurements of approximately 10 μs. Comparing our results to previous folding simulations using the GB/SA continuum solvent model shows that accounting for water-mediated interactions is necessary to accurately characterize the free energy surface and stochastic nature of folding. The formation of the secondary structure appears to be more rapid than the fastest ionic degrees of freedom, and counterions do not participate discretely in observed folding events. We find that hydrophobic collapse follows a predominantly expulsive mechanism in which a diffusion-search of early structural compaction is followed by the final formation of native structure that occurs in tandem with solvent evacuation.

    View details for Web of Science ID 000227986300012

    View details for PubMedID 15681648

  • Length dependent folding kinetics of phenylacetylene oligomers: Structural characterization of a kinetic trap JOURNAL OF CHEMICAL PHYSICS Elmer, S. P., Pande, V. S. 2005; 122 (12)


    Using simulation to study the folding kinetics of 20-mer poly-phenylacetylene (pPA) oligomers, we find a long time scale trapped kinetic phase in the cumulative folding time distribution. This is demonstrated using molecular dynamics to simulate an ensemble of over 100 folding trajectories. The simulation data are fit to a four-state kinetic model which includes the typical folded and unfolded states, along with an intermediate state, and most surprisingly, a kinetically trapped state. Topologically diverse conformations reminiscent of alpha helices, beta turns, and sheets in proteins are observed, along with unique structures in the form of knots. The nonhelical conformations are implicated, on the basis of structural correlations to kinetic parameters, to contribute to the trapped kinetic behavior. The strong solvophobic forces which mediate the folding process and produce a stable helical folded state also serve to overstabilize the nonhelical conformations, ultimately trapping them. From our simulations, the folding time is predicted to be on the order of 2.5-12.5 μs in the presence of the trapped kinetic phase. The folding mechanism for these 20-mer chains is compared with the previously reported folding mechanism for the pPA 12-mer chains. A linear scaling relationship between the chain length and the mean first passage time is predicted in the absence of the trapped kinetic phase. We discuss the major implications of this discovery in the design of self-assembling nanostructures.

    View details for DOI 10.1063/1.1867375

    View details for Web of Science ID 000228287900063

    View details for PubMedID 15836425

  • Dimerization of the p53 oligomerization domain: Identification of a folding nucleus by molecular dynamics simulations JOURNAL OF MOLECULAR BIOLOGY Chong, L. T., Snow, C. D., Rhee, Y. M., Pande, V. S. 2005; 345 (4): 869-878


    Dimerization of the p53 oligomerization domain involves coupled folding and binding of monomers. To examine the dimerization, we have performed molecular dynamics (MD) simulations of dimer folding from the rate-limiting transition state ensemble (TSE). Among 799 putative transition state structures that were selected from a large ensemble of high-temperature unfolding trajectories, 129 were identified as members of the TSE via calculation of a 50% transmission coefficient from at least 20 room-temperature simulations. This study is the first to examine the refolding of a protein dimer using MD simulations in explicit water, revealing a folding nucleus for dimerization. Our atomistic simulations are consistent with experiment and offer insight that was previously unobtainable.

    View details for DOI 10.1016/j.jmb.2004.10.083

    View details for Web of Science ID 000225909100019

    View details for PubMedID 15588832

  • How well can simulation predict protein folding kinetics and thermodynamics? ANNUAL REVIEW OF BIOPHYSICS AND BIOMOLECULAR STRUCTURE Snow, C. D., Sorin, E. J., Rhee, Y. M., Pande, V. S. 2005; 34: 43-69


    Simulation of protein folding has come a long way in five years. Notably, new quantitative comparisons with experiments for small, rapidly folding proteins have become possible. As the only way to validate simulation methodology, this achievement marks a significant advance. Here, we detail these recent achievements and ask whether simulations have indeed rendered quantitative predictions in several areas, including protein folding kinetics, thermodynamics, and physics-based methods for structure prediction. We conclude by looking to the future of such comparisons between simulations and experiments.

    View details for Web of Science ID 000230099600003

    View details for PubMedID 15869383

  • Foldamer simulations: Novel computational methods and applications to poly-phenylacetylene oligomers JOURNAL OF CHEMICAL PHYSICS Elmer, S. P., Pande, V. S. 2004; 121 (24): 12760-12771


    We apply several methods to probe the ensemble kinetic and structural properties of a model system of poly-phenylacetylene (pPA) oligomer folding trajectories. The kinetic methods employed included a brute force accounting of conformations, a Markovian state matrix method, and a nonlinear least squares fit to a minimalist kinetic model used to extract the folding time. Each method gave similar measures for the folding time of the 12-mer chain, calculated to be on the order of 7 ns for the complete folding of the chain from an extended conformation. Utilizing both a linear and a nonlinear scaling relationship between the viscosity and the folding time to correct for a low simulation viscosity, we obtain an upper and a lower bound for the approximate folding time within the range 70 ns

    View details for DOI 10.1063/1.1812272

    View details for Web of Science ID 000225714500073

    View details for PubMedID 15606301

  • How does averaging affect protein structure comparison on the ensemble level? BIOPHYSICAL JOURNAL Zagrovic, B., Pande, V. S. 2004; 87 (4): 2240-2246


    Recent algorithmic advances and continual increase in computational power have made it possible to simulate protein folding and dynamics on the level of ensembles. Furthermore, analyzing protein structure by using ensemble representation is intrinsic to certain experimental techniques, such as nuclear magnetic resonance. This creates a problem of how to compare an ensemble of molecules with a given reference structure. Recently, we used distance-based root-mean-square deviation (dRMS) to compare the native structure of a protein with its unfolded-state ensemble. We showed that for small, mostly alpha-helical proteins, the mean unfolded-state Calpha-Calpha distance matrix is significantly more nativelike than the Calpha-Calpha matrices corresponding to the individual members of the unfolded ensemble. Here, we give a mathematical derivation that shows that, for any ensemble of structures, the dRMS deviation between the ensemble-averaged distance matrix and any given reference distance matrix is always less than or equal to the average dRMS deviation of the individual members of the ensemble from the same reference matrix. This holds regardless of the nature of the reference structure or the structural ensemble in question. In other words, averaging of distance matrices can only increase their level of similarity to a given reference matrix, relative to the individual matrices comprising the ensemble. Furthermore, we show that the above inequality holds in the case of Cartesian coordinate-based root-mean-square deviation as well. We discuss this in the context of our proposal that the average structure of the unfolded ensemble of small helical proteins is close to the native structure, and demonstrate that this finding goes beyond the above mathematical fact.

    View details for DOI 10.1529/biophysj.104.042184

    View details for Web of Science ID 000224129200013

    View details for PubMedID 15454426
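
    The inequality stated in the abstract above (the dRMS of the ensemble-averaged distance matrix from a reference never exceeds the average dRMS of the individual members) can be checked numerically. The sketch below is illustrative only: the "conformations" are random points in a box rather than protein structures, and all names and sizes are hypothetical.

```python
import math
import random


def distance_vector(coords):
    """Flattened upper triangle of the pairwise distance matrix."""
    n = len(coords)
    return [math.dist(coords[i], coords[j])
            for i in range(n) for j in range(i + 1, n)]


def drms(d_a, d_b):
    """Root-mean-square deviation between two distance vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d_a, d_b)) / len(d_a))


random.seed(1)
n_atoms, n_members = 10, 50


def random_conformation():
    """Hypothetical stand-in for a structure: random points in a 10 x 10 x 10 box."""
    return [tuple(random.uniform(0.0, 10.0) for _ in range(3))
            for _ in range(n_atoms)]


reference = distance_vector(random_conformation())
ensemble = [distance_vector(random_conformation()) for _ in range(n_members)]

# Average the distance matrices element by element, then compare to the reference.
mean_matrix = [sum(column) / n_members for column in zip(*ensemble)]
drms_of_mean = drms(mean_matrix, reference)

# Alternatively, compare each member to the reference and average the results.
mean_of_drms = sum(drms(d, reference) for d in ensemble) / n_members

print(drms_of_mean, "<=", mean_of_drms)
```

    The first number can never exceed the second, because the dRMS is a norm of the difference between distance vectors and the norm of an average is at most the average of the norms.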

  • A universal TANGO? NATURE BIOTECHNOLOGY Pande, V. S. 2004; 22 (10): 1240-1241

    View details for DOI 10.1038/nbt1004-1240

    View details for Web of Science ID 000224326100022

    View details for PubMedID 15470460

  • Random-coil behavior and the dimensions of chemically unfolded proteins PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Kohn, J. E., Millett, I. S., Jacob, J., Zagrovic, B., Dillon, T. M., Cingel, N., Dothager, R. S., Seifert, S., Thiyagarajan, P., Sosnick, T. R., Hasan, M. Z., Pande, V. S., Ruczinski, I., Doniach, S., Plaxco, K. W. 2004; 101 (34): 12491-12496


    Spectroscopic studies have identified a number of proteins that appear to retain significant residual structure under even strongly denaturing conditions. Intrinsic viscosity, hydrodynamic radii, and small-angle x-ray scattering studies, in contrast, indicate that the dimensions of most chemically denatured proteins scale with polypeptide length by means of the power-law relationship expected for random-coil behavior. Here we further explore this discrepancy by expanding the length range of characterized denatured-state radii of gyration (R(G)) and by reexamining proteins that reportedly do not fit the expected dimensional scaling. We find that only 2 of 28 crosslink-free, prosthetic-group-free, chemically denatured polypeptides deviate significantly from a power-law relationship with polymer length. The R(G) of the remaining 26 polypeptides, which range from 16 to 549 residues, are well fitted (r(2) = 0.988) by a power-law relationship with a best-fit exponent, 0.598 +/- 0.028, coinciding closely with the 0.588 predicted for an excluded volume random coil. Therefore, it appears that the mean dimensions of the large majority of chemically denatured proteins are effectively indistinguishable from the mean dimensions of a random-coil ensemble.

    View details for DOI 10.1073/pnas.0403643101

    View details for Web of Science ID 000223596200019

    View details for PubMedID 15314214
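
    The power-law relationship in the abstract above has the form R_G = R_0 N^ν, which becomes a straight line in log-log space. The following sketch shows how such an exponent could be recovered by a least-squares fit. The data are synthetic and noise-free: the exponent 0.598 is taken from the abstract, but the prefactor R_0 = 2.0 Å and the chain lengths are hypothetical values chosen only to exercise the fit.

```python
import math


def fit_power_law(lengths, radii):
    """Least-squares fit of ln R = ln R0 + nu * ln N; returns (R0, nu)."""
    xs = [math.log(n) for n in lengths]
    ys = [math.log(r) for r in radii]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    nu = sxy / sxx                        # slope in log-log space
    r0 = math.exp(y_mean - nu * x_mean)   # intercept gives the prefactor
    return r0, nu


# Synthetic chain lengths spanning the 16 to 549 residue range from the abstract.
lengths = [16, 50, 100, 200, 400, 549]
# Hypothetical prefactor of 2.0 angstroms; exponent 0.598 as quoted above.
radii = [2.0 * n ** 0.598 for n in lengths]

r0, nu = fit_power_law(lengths, radii)
print(f"R0 = {r0:.3f} A, nu = {nu:.3f}")
```

    With real measurements the fit would also need uncertainties on each radius of gyration, which this sketch omits.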

  • Using path sampling to build better Markovian state models: Predicting the folding rate and mechanism of a tryptophan zipper beta hairpin JOURNAL OF CHEMICAL PHYSICS Singhal, N., Snow, C. D., Pande, V. S. 2004; 121 (1): 415-425


    We propose an efficient method for the prediction of protein folding rate constants and mechanisms. We use molecular dynamics simulation data to build Markovian state models (MSMs), discrete representations of the pathways sampled. Using these MSMs, we can quickly calculate the folding probability (P(fold)) and mean first passage time of all the sampled points. In addition, we provide techniques for evaluating these values under perturbed conditions without expensive recomputations. To demonstrate this method on a challenging system, we apply these techniques to a two-dimensional model energy landscape and the folding of a tryptophan zipper beta hairpin.

    View details for DOI 10.1063/1.1738647

    View details for Web of Science ID 000222112100047

    View details for PubMedID 15260562
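
    Once a Markovian state model is available, the folding probability and mean first passage time mentioned in the abstract above both reduce to small linear systems over the states. The sketch below solves them by Gauss-Seidel iteration for a hypothetical five-state chain. The transition matrix, the state labels, and the solver are assumptions made for illustration and are not taken from the paper.

```python
def solve_fixed_point(T, fixed, source, tol=1e-12, max_iter=100_000):
    """Gauss-Seidel solve of x[i] = source + sum_j T[i][j] * x[j].

    States listed in `fixed` keep their prescribed boundary values.
    """
    n = len(T)
    x = [fixed.get(i, 0.0) for i in range(n)]
    for _ in range(max_iter):
        change = 0.0
        for i in range(n):
            if i in fixed:
                continue
            # Move the self-transition term to the left-hand side.
            rest = sum(T[i][j] * x[j] for j in range(n) if j != i)
            new = (source + rest) / (1.0 - T[i][i])
            change = max(change, abs(new - x[i]))
            x[i] = new
        if change < tol:
            return x
    raise RuntimeError("iteration did not converge")


# Hypothetical model: state 0 is unfolded, state 4 is folded, and the
# intermediate states step left or right with equal probability.
T = [
    [0.5, 0.5, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]

# Folding probability: 0 in the unfolded state, 1 in the folded state.
p_fold = solve_fixed_point(T, fixed={0: 0.0, 4: 1.0}, source=0.0)

# Mean first passage time to the folded state, in units of the lag time.
mfpt = solve_fixed_point(T, fixed={4: 0.0}, source=1.0)

print(p_fold)
print(mfpt)
```

    For models with many states a sparse direct or iterative linear solver would be the more usual choice. The plain iteration is used here only to keep the example dependency-free.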

  • Molecular dynamics simulation of lipid reorientation at bilayer edges BIOPHYSICAL JOURNAL Kasson, P. M., Pande, V. S. 2004; 86 (6): 3744-3749


    Understanding cellular membrane processes is critical for the study of events such as viral entry, neurotransmitter exocytosis, and immune activation. Supported lipid bilayers are commonly used to model these membrane processes experimentally. Despite the relative simplicity of such a system, many important structural and dynamic parameters are not experimentally observable with current techniques. Computational approaches allow the development of a high-resolution model of bilayer processes. We have performed molecular dynamics simulations of dimyristoylphosphatidylcholine (DMPC) bilayers to model the creation of bilayer gaps-a common process in bilayer patterning-and to analyze their structure and dynamics. We propose a model for gap formation in which the bilayer edges form metastable micelle-like structures on a nanosecond timescale. Molecules near edges structurally resemble lipids in ungapped bilayers but undergo small-scale motions more rapidly. These data suggest that lipids may undergo rapid local rearrangements during membrane fusion, facilitating the formation of fusion intermediates thought key to the infection cycle of viruses such as influenza, Ebola, and HIV.

    View details for DOI 10.1529/biophysj.103.029652

    View details for Web of Science ID 000222035200033

    View details for PubMedID 15189870

  • Simulations of the role of water in the protein-folding mechanism PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Rhee, Y. M., Sorin, E. J., Jayachandran, G., Lindahl, E., Pande, V. S. 2004; 101 (17): 6456-6461


    There are many unresolved questions regarding the role of water in protein folding. Does water merely induce hydrophobic forces, or does the discrete nature of water play a structural role in folding? Are the nonadditive aspects of water important in determining the folding mechanism? To help to address these questions, we have performed simulations of the folding of a model protein (BBA5) in explicit solvent. Starting 10,000 independent trajectories from a fully unfolded conformation, we have observed numerous folding events, making this work a comprehensive study of the kinetics of protein folding starting from the unfolded state and reaching the folded state and with an explicit solvation model and experimentally validated rates. Indeed, both the raw TIP3P folding rate (4.5 +/- 2.5 μs) and the diffusion-constant corrected rate (7.5 +/- 4.2 μs) are in strong agreement with the experimentally observed rate of 7.5 +/- 3.5 μs. To address the role of water in folding, the mechanism is compared with that predicted from implicit solvation simulations. An examination of solvent density near hydrophobic groups during folding suggests that in the case of BBA5, there are water-induced effects not captured by implicit solvation models, including signs of a "concurrent mechanism" of core collapse and desolvation.

    View details for DOI 10.1073/pnas.0307898101

    View details for Web of Science ID 000221107900025

    View details for PubMedID 15090647

  • Folding probabilities: A novel approach to folding transitions and the two-dimensional Ising-model JOURNAL OF CHEMICAL PHYSICS Lenz, P., Zagrovic, B., Shapiro, J., Pande, V. S. 2004; 120 (14): 6769-6778


    The theoretical concept of folding probability, p(fold), has proven to be a useful means to characterize the kinetics of protein folding. Here, we illustrate the practical importance of p(fold) and demonstrate how it can be determined theoretically. We derive a general analytical expression for p(fold) and show how it can be estimated from simulations for systems where the transition rates between the relevant microstates are not known. By analyzing the Ising model we are able to determine the scaling behavior of the numerical error in the p(fold) estimate as a function of the number of analyzed Monte Carlo runs. We apply our method to a simple, newly developed protein folding model for the formation of alpha helices. It is demonstrated that our technique highly parallelizes the calculation of p(fold) and that it is orders of magnitude more efficient than conventional approaches.

    View details for DOI 10.1063/1.1667470

    View details for Web of Science ID 000220456400051

    View details for PubMedID 15267572

  • Does native state topology determine the RNA folding mechanism? JOURNAL OF MOLECULAR BIOLOGY Sorin, E. J., Nakatani, B. J., Rhee, Y. M., Jayachandran, G., Vishal, V., Pande, V. S. 2004; 337 (4): 789-797


    Recent studies in protein folding suggest that native state topology plays a dominant role in determining the folding mechanism, yet an analogous statement has not been made for RNA, most likely due to the strong coupling between the ionic environment and conformational energetics that make RNA folding more complex than protein folding. Applying a distributed computing architecture to sample nearly 5000 complete tRNA folding events using a minimalist, atomistic model, we have characterized the role of native topology in tRNA folding dynamics: the simulated bulk folding behavior predicts well the experimentally observed folding mechanism. In contrast, single-molecule folding events display multiple discrete folding transitions and compose a largely diverse, heterogeneous dynamic ensemble. This both supports an emerging view of heterogeneous folding dynamics at the microscopic level and highlights the need for single-molecule experiments and both single-molecule and bulk simulations in interpreting bulk experimental measurements.

    View details for DOI 10.1016/j.jmb.2004.02.024

    View details for Web of Science ID 000220515500002

    View details for PubMedID 15033351

  • Trp zipper folding kinetics by molecular dynamics and temperature-jump spectroscopy PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Snow, C. D., Qiu, L. L., Du, D. G., Gai, F., Hagen, S. J., Pande, V. S. 2004; 101 (12): 4077-4082


    We studied the microsecond folding dynamics of three beta hairpins (Trp zippers 1-3, TZ1-TZ3) by using temperature-jump fluorescence and atomistic molecular dynamics in implicit solvent. In addition, we studied TZ2 by using time-resolved IR spectroscopy. By using distributed computing, we obtained an aggregate simulation time of 22 ms. The simulations included 150, 212, and 48 folding events at room temperature for TZ1, TZ2, and TZ3, respectively. The all-atom optimized potentials for liquid simulations (OPLS(aa)) potential set predicted TZ1 and TZ2 properties well; the estimated folding rates agreed with the experimentally determined folding rates and native conformations were the global potential-energy minimum. The simulations also predicted reasonable unfolding activation enthalpies. This work, directly comparing large simulated folding ensembles with multiple spectroscopic probes, revealed both the surprising predictive ability of current models as well as their shortcomings. Specifically, for TZ1-TZ3, OPLS for united atom models had a nonnative free-energy minimum, and the folding rate for OPLS(aa) TZ3 was sensitive to the initial conformation. Finally, we characterized the transition state; all TZs fold by means of similar, native-like transition-state conformations.

    View details for DOI 10.1073/pnas.0305260101

    View details for Web of Science ID 000220472200018

    View details for PubMedID 15020773

  • Structural correspondence between the alpha-helix and the random-flight chain resolves how unfolded proteins can have native-like properties NATURE STRUCTURAL BIOLOGY Zagrovic, B., Pande, V. S. 2003; 10 (11): 955-961


    Recently, we have proposed that, on average, the structure of the unfolded state of small, mostly alpha-helical proteins may be similar to the native structure (the 'mean-structure' hypothesis). After examining thousands of simulations of both the folded and the unfolded states of five polypeptides in atomistic detail at room temperature, we report here a result that seems at odds with the mean-structure hypothesis. Specifically, the average inter-residue distances in the collapsed unfolded structures agree well with the statistics of the ideal random-flight chain with link length of 3.8 Å (the length of one amino acid). A possible resolution of this apparent contradiction is offered by the observation that the inter-residue distances in a typical alpha-helix over short stretches are close to the average distances in an ideal random-flight chain.

    View details for DOI 10.1038/nsb995

    View details for Web of Science ID 000186229100016

    View details for PubMedID 14555998

  • Equilibrium free energies from nonequilibrium measurements using maximum-likelihood methods PHYSICAL REVIEW LETTERS Shirts, M. R., Bair, E., Hooker, G., Pande, V. S. 2003; 91 (14)


    We present a maximum likelihood argument for the Bennett acceptance ratio method, and derive a simple formula for the variance of free energy estimates generated using this method. This derivation of the acceptance ratio method, using a form of logistic regression, a common statistical technique, allows us to shed additional light on the underlying physical and statistical properties of the method. For example, we demonstrate that the acceptance ratio method yields the lowest variance for any estimator of the free energy which is unbiased in the limit of large numbers of measurements.

    View details for DOI 10.1103/PhysRevLett.91.140601

    View details for Web of Science ID 000185719500002

    View details for PubMedID 14611511

  • Extremely precise free energy calculations of amino acid side chain analogs: Comparison of common molecular mechanics force fields for proteins JOURNAL OF CHEMICAL PHYSICS Shirts, M. R., Pitera, J. W., Swope, W. C., Pande, V. S. 2003; 119 (11): 5740-5761

    View details for DOI 10.1063/1.1587119

    View details for Web of Science ID 000185025000049

  • Sequence optimization for native state stability determines the evolution and folding kinetics of a small protein JOURNAL OF MOLECULAR BIOLOGY Larson, S. M., Pande, V. S. 2003; 332 (1): 275-286


    Investigating the relative importance of protein stability, function, and folding kinetics in driving protein evolution has long been hindered by the fact that we can only compare modern natural proteins, the products of the very process we seek to understand, to each other, with no external references or baselines. Through a large-scale all-atom simulation of protein evolution, we have created a large diverse alignment of SH3 domain sequences which have been selected only for native state stability, with no other influencing factors. Although the average pairwise identity between computationally evolved and natural sequences is only 17%, the residue frequency distributions of the computationally evolved sequences are similar to natural SH3 sequences at 86% of the positions in the domain, suggesting that optimization for the native state structure has dominated the evolution of natural SH3 domains. Additionally, the positions which play a consistent role in the transition state of three well-characterized SH3 domains (by phi-value analysis) are structurally optimized for the native state, and vice versa. Indeed, we see a specific and significant correlation between sequence optimization for native state stability and conservation of transition state structure.

    View details for DOI 10.1016/S0022-2836(03)00832-5

    View details for Web of Science ID 000185034400024

    View details for PubMedID 12946364

  • Solvent viscosity dependence of the folding rate of a small protein: Distributed computing study JOURNAL OF COMPUTATIONAL CHEMISTRY Zagrovic, B., Pande, V. 2003; 24 (12): 1432-1436


    By using distributed computing techniques and a supercluster of more than 20,000 processors we simulated folding of a 20-residue Trp Cage miniprotein in atomistic detail with implicit GB/SA solvent at a variety of solvent viscosities (γ). This allowed us to analyze the dependence of folding rates on viscosity. In particular, we focused on the low-viscosity regime (values below the viscosity of water). In accordance with Kramers' theory, we observe approximately linear dependence of the folding rate on 1/γ for values from 1 to 10^-1 times the viscosity of water. However, for the regime between 10^-4 and 10^-1 times the viscosity of water we observe power-law dependence of the form k ~ γ^(-1/5). These results suggest that estimating folding rates from molecular simulations run at low viscosity under the assumption of linear dependence of rate on inverse viscosity may lead to erroneous results.

    View details for DOI 10.1002/jcc.10297

    View details for Web of Science ID 000184515400008

    View details for PubMedID 12868108
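
    To make the size of the possible extrapolation error concrete, here is a rough worked example. It assumes a piecewise scaling law built only from the two regimes quoted in the abstract: k proportional to 1/γ for relative viscosities between 0.1 and 1, and k proportional to γ^(-1/5) below 0.1, joined continuously at 0.1. The continuity condition, the exact crossover point, and the function names are assumptions of this sketch, not results from the paper.

```python
def rate_scale_linear(gamma_rel):
    """k(gamma) / k(water) if the rate were proportional to 1/gamma everywhere."""
    return 1.0 / gamma_rel


def rate_scale_piecewise(gamma_rel, crossover=0.1, exponent=0.2):
    """k(gamma) / k(water) for the assumed piecewise law:
    1/gamma at or above the crossover, gamma**(-exponent) below it,
    matched so the curve is continuous at the crossover."""
    if gamma_rel >= crossover:
        return 1.0 / gamma_rel
    return (1.0 / crossover) * (gamma_rel / crossover) ** (-exponent)


def extrapolation_error(gamma_sim):
    """Factor by which a linear (1/gamma) extrapolation underestimates the
    rate at water viscosity, given a rate measured at gamma_sim."""
    return rate_scale_linear(gamma_sim) / rate_scale_piecewise(gamma_sim)


if __name__ == "__main__":
    for g in (0.5, 0.1, 1e-2, 1e-3):
        print(g, round(extrapolation_error(g), 2))
```

    Under these assumptions, a rate measured at one-thousandth of water viscosity and rescaled linearly would be off by a factor of about 40, while rates measured within the linear regime extrapolate without error.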

  • Insights into nucleic acid conformational dynamics from massively parallel stochastic simulations BIOPHYSICAL JOURNAL Sorin, E. J., Rhee, Y. M., Nakatani, B. J., Pande, V. S. 2003; 85 (2): 790-803


    The helical hairpin is one of the most ubiquitous and elementary secondary structural motifs in nucleic acids, capable of serving functional roles and participating in long-range tertiary contacts. Yet the self-assembly of these structures has not been well-characterized at the atomic level. With this in mind, the dynamics of nucleic acid hairpin formation and disruption have been studied using a novel computational tool: large-scale, parallel, atomistic molecular dynamics simulation employing an inhomogeneous distributed computer consisting of more than 40,000 processors. Using multiple methodologies, over 500 μs of atomistic simulation time has been collected for a large ensemble of hairpins (sequence 5'-GGGC[GCAA]GCCU-3'), allowing characterization of rare events not previously observable in simulation. From uncoupled ensemble dynamics simulations in unperturbed folding conditions, we report on (1) competing pathways between the folded and unfolded regions of the conformational space; (2) observed nonnative stacking and basepairing traps; and (3) a helix unwinding-rewinding mode that is differentiated from the unfolding and folding dynamics. A heterogeneous transition state ensemble is characterized structurally through calculations of conformer-specific folding probabilities and a multiplexed replica exchange stochastic dynamics algorithm is used to derive an approximate folding landscape. A comparison between the observed folding mechanism and that of a peptide beta-hairpin analog suggests that although native topology defines the character of the folding landscape, the statistical weighting of potential folding pathways is determined by the chemical nature of the polymer.

    View details for Web of Science ID 000184428300010

    View details for PubMedID 12885628

  • Increased detection of structural templates using alignments of designed sequences PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS Larson, S. M., Garg, A., Desjarlais, J. R., Pande, V. S. 2003; 51 (3): 390-396


    Protein structure prediction by comparative modeling benefits greatly from the use of multiple sequence alignment information to improve the accuracy of structural template identification and the alignment of target sequences to structural templates. Unfortunately, this benefit is limited to those protein sequences for which at least several natural sequence homologues exist. We show here that the use of large diverse alignments of computationally designed protein sequences confers many of the same benefits as natural sequences in identifying structural templates for comparative modeling targets. A large-scale massively parallelized application of an all-atom protein design algorithm, including a simple model of peptide backbone flexibility, has allowed us to generate 500 diverse, non-native, high-quality sequences for each of 264 protein structures in our test set. PSI-BLAST searches using the sequence profiles generated from the designed sequences ("reverse" BLAST searches) give near-perfect accuracy in identifying true structural homologues of the parent structure, with 54% coverage. In 41 of 49 genomes scanned using reverse BLAST searches, at least one novel structural template (not found by the standard method of PSI-BLAST against PDB) is identified. Further improvements in coverage, through optimizing the scoring function used to design sequences and continued application to new protein structures beyond the test set, will allow this method to mature into a useful strategy for identifying distantly related structural templates.

    View details for DOI 10.1002/prot.10346

    View details for Web of Science ID 000182587800007

    View details for PubMedID 12696050

  • Meeting halfway on the bridge between protein folding theory and experiment PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Pande, V. S. 2003; 100 (7): 3555-3556

    View details for DOI 10.1073/pnas.0830965100

    View details for Web of Science ID 000182058400005

    View details for PubMedID 12657736

  • Cooperativity, smooth energy landscapes and the origins of topology-dependent protein folding rates JOURNAL OF MOLECULAR BIOLOGY Jewett, A. I., Pande, V. S., Plaxco, K. W. 2003; 326 (1): 247-253


    The relative folding rates of simple, single-domain proteins, proteins whose folding energy landscapes are smooth, are highly dispersed and strongly correlated with native-state topology. In contrast, the relative folding rates of small, Gō-potential lattice polymers, which also exhibit smooth energy landscapes, are poorly dispersed and insignificantly correlated with native-state topology. Here, we investigate this discrepancy in light of a recent, quantitative theory of two-state folding kinetics, the topomer search model. This model stipulates that the topology-dependence of two-state folding rates is a direct consequence of the extraordinarily cooperative equilibrium folding of simple proteins. We demonstrate that traditional Gō polymers lack the extreme cooperativity that characterizes the folding of naturally occurring, two-state proteins and confirm that the folding rates of a diverse set of Gō 27-mers are poorly dispersed and effectively uncorrelated with native state topology. Upon modestly increasing the cooperativity of the Gō-potential, however, significantly increased dispersion and strongly topology-dependent kinetics are observed. These results support previous arguments that the cooperative folding of simple, single-domain proteins gives rise to their topology-dependent folding rates. We speculate that this cooperativity, and thus, indirectly, the topology-rate relationship, may have arisen in order to generate the smooth energetic landscapes upon which rapid folding can occur.

    View details for DOI 10.1016/S0022-2836(02)01356-6

    View details for Web of Science ID 000180997600021

    View details for PubMedID 12547206

  • Multiplexed-replica exchange molecular dynamics method for protein folding simulation BIOPHYSICAL JOURNAL Rhee, Y. M., Pande, V. S. 2003; 84 (2): 775-786


    Simulating protein folding thermodynamics starting purely from a protein sequence is a grand challenge of computational biology. Here, we present an algorithm to calculate a canonical distribution from molecular dynamics simulation of protein folding. This algorithm is based on the replica exchange method where the kinetic trapping problem is overcome by exchanging noninteracting replicas simulated at different temperatures. Our algorithm uses multiplexed-replicas with a number of independent molecular dynamics runs at each temperature. Exchanges of configurations between these multiplexed-replicas are also tried, rendering the algorithm applicable to large-scale distributed computing (i.e., highly heterogeneous parallel computers with processors having different computational power). We demonstrate the enhanced sampling of this algorithm by simulating the folding thermodynamics of a 23 amino acid miniprotein. We show that better convergence is achieved compared to constant temperature molecular dynamics simulation, with an efficient scaling to large number of computer processors. Indeed, this enhanced sampling results in (to our knowledge) the first example of a replica exchange algorithm that samples a folded structure starting from a completely unfolded state.

    View details for Web of Science ID 000183123700006

    View details for PubMedID 12547762
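
    The exchange step at the heart of replica exchange can be sketched in a few lines. The code below uses the standard Metropolis criterion for swapping configurations between two temperatures, p = min(1, exp[(β_i - β_j)(E_i - E_j)]); it is a generic textbook form, not code from the paper, and it does not reproduce the multiplexing scheme itself. The function names, the energy units (kcal/mol), and the example numbers are assumptions for illustration.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def swap_probability(e_i, t_i, e_j, t_j):
    """Metropolis acceptance probability for exchanging the configurations of
    a replica at temperature t_i (potential energy e_i) and one at t_j (e_j)."""
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(e_i, t_i, e_j, t_j, rng):
    """Return True if the exchange is accepted for this random draw."""
    return rng.random() < swap_probability(e_i, t_i, e_j, t_j)


if __name__ == "__main__":
    rng = random.Random(42)
    # Cold replica holds the higher energy: the swap is always accepted.
    print(swap_probability(-90.0, 300.0, -100.0, 350.0))
    # Cold replica already holds the lower energy: the swap is accepted only sometimes.
    print(swap_probability(-100.0, 300.0, -90.0, 350.0))
    print(attempt_swap(-100.0, 300.0, -90.0, 350.0, rng))
```

    In a multiplexed layout with several independent runs per temperature, the same test could be applied to pairs of runs drawn from neighboring temperature levels, so no single slow processor holds up the whole exchange schedule.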

  • Atomistic protein folding simulations on the submillisecond time scale using worldwide distributed computing BIOPOLYMERS Pande, V. S., Baker, I., Chapman, J., Elmer, S. P., Khaliq, S., Larson, S. M., Rhee, Y. M., Shirts, M. R., Snow, C. D., Sorin, E. J., Zagrovic, B. 2003; 68 (1): 91-109


    Atomistic simulations of protein folding have the potential to be a great complement to experimental studies, but have been severely limited by the time scales accessible with current computer hardware and algorithms. By employing a worldwide distributed computing network of tens of thousands of PCs and algorithms designed to efficiently utilize this new many-processor, highly heterogeneous, loosely coupled distributed computing paradigm, we have been able to simulate hundreds of microseconds of atomistic molecular dynamics. This has allowed us to directly simulate the folding mechanism and to accurately predict the folding rate of several fast-folding proteins and polymers, including a nonbiological helix, polypeptide alpha-helices, a beta-hairpin, and a three-helix bundle protein from the villin headpiece. Our results demonstrate that one can reach the time scales needed to simulate fast folding using distributed computing, and that potential sets used to describe interatomic interactions are sufficiently accurate to reach the folded state with experimentally validated rates, at least for small proteins.

    View details for DOI 10.1002/bip.10219

    View details for Web of Science ID 000180210000008

    View details for PubMedID 12579582

  • Computational simulation of lipid bilayer reorientation at gaps 2nd International Computational Systems Bioinformatics Conference Kasson, P. M., Pande, V. S. IEEE COMPUTER SOC. 2003: 464–466
  • The Trp cage: Folding kinetics and unfolded state topology via molecular dynamics simulations JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Snow, C. D., Zagrovic, B., Pande, V. S. 2002; 124 (49): 14548-14549


    Using over 75 μs of molecular dynamics simulation, we have generated several thousand folding simulations of the 20-residue Trp cage at experimental temperature and solvent viscosity. A total of 116 independent folding simulations reach Cα RMSD values below 3 Å, some as close as 1.4 Å. We estimate a folding time of 5.5 ± 3.5 μs, a rate that is in reasonable agreement with experimental kinetics. Finally, we characterize both the folded and unfolded ensemble under native conditions and note that the average topology of the unfolded ensemble is very similar to the topology of the native state.

    View details for DOI 10.1021/ja028604l

    View details for Web of Science ID 000179661000021

    View details for PubMedID 12465960

  • Thoroughly sampling sequence space: Large-scale protein design of structural ensembles PROTEIN SCIENCE Larson, S. M., England, J. L., Desjarlais, J. R., Pande, V. S. 2002; 11 (12): 2804-2813


    Modeling the inherent flexibility of the protein backbone as part of computational protein design is necessary to capture the behavior of real proteins and is a prerequisite for the accurate exploration of protein sequence space. We present the results of a broad exploration of sequence space, with backbone flexibility, through a novel approach: large-scale protein design to structural ensembles. A distributed computing architecture has allowed us to generate hundreds of thousands of diverse sequences for a set of 253 naturally occurring proteins, allowing exciting insights into the nature of protein sequence space. Designing to a structural ensemble produces a much greater diversity of sequences than previous studies have reported, and homology searches using profiles derived from the designed sequences against the Protein Data Bank show that the relevance and quality of the sequences is not diminished. The designed sequences have greater overall diversity than corresponding natural sequence alignments, and no direct correlations are seen between the diversity of natural sequence alignments and the diversity of the corresponding designed sequences. For structures in the same fold, the sequence entropies of the designed sequences cluster together tightly. This tight clustering of sequence entropies within a fold and the separation of sequence entropy distributions for different folds suggest that the diversity of designed sequences is primarily determined by a structure's overall fold, and that the designability principle postulated from studies of simple models holds in real proteins. This has important implications for experimental protein design and engineering, as well as providing insight into protein evolution.

    View details for DOI 10.1110/ps.0203902

    View details for Web of Science ID 000179352000005

    View details for PubMedID 12441379

  • Simulation of folding of a small alpha-helical protein in atomistic detail using worldwide-distributed computing JOURNAL OF MOLECULAR BIOLOGY Zagrovic, B., Snow, C. D., Shirts, M. R., Pande, V. S. 2002; 323 (5): 927-937


    By employing thousands of PCs and new worldwide-distributed computing techniques, we have simulated in atomistic detail the folding of a fast-folding 36-residue alpha-helical protein from the villin headpiece. The total simulated time exceeds 300 μs, orders of magnitude more than previous simulations of a molecule of this size. Starting from an extended state, we obtained an ensemble of folded structures, which is on average 1.7 Å and 1.9 Å away from the native state in C(alpha) distance-based root-mean-square deviation (dRMS) and C(beta) dRMS sense, respectively. The folding mechanism of villin is most consistent with the hydrophobic collapse view of folding: the molecule collapses non-specifically very quickly (approximately 20 ns), which greatly reduces the size of the conformational space that needs to be explored in search of the native state. The conformational search in the collapsed state appears to be rate-limited by the formation of the aromatic core: in a significant fraction of our simulations, the C-terminal phenylalanine residue packs improperly with the rest of the hydrophobic core. We suggest that the breaking of this interaction may be the rate-determining step in the course of folding. On the basis of our simulations we estimate the folding time of villin to be approximately 5 μs. By analyzing the average features of the folded ensemble obtained by simulation, we see that the mean folded structure is more similar to the native fold than any individual folded structure. This finding highlights the need for simulating ensembles of molecules and averaging the results in an experiment-like fashion if meaningful comparison between simulation and experiment is to be attempted. Moreover, our results demonstrate that (1) the computational methodology exists to simulate the multi-microsecond regime using distributed computing and (2) potential sets used to describe interatomic interactions may be sufficiently accurate to reach the folded state, at least for small proteins. We conclude with a comparison between our results and current protein-folding theory.

    View details for DOI 10.1016/S0022-2836(02)00997-X

    View details for Web of Science ID 000179308500012

    View details for PubMedID 12417204

  • Absolute comparison of simulated and experimental protein-folding dynamics NATURE Snow, C. D., Nguyen, N., Pande, V. S., Gruebele, M. 2002; 420 (6911): 102-106


    Protein folding is difficult to simulate with classical molecular dynamics. Secondary structure motifs such as alpha-helices and beta-hairpins can form in 0.1-10 μs (ref. 1), whereas small proteins have been shown to fold completely in tens of microseconds. The longest folding simulation to date is a single 1-μs simulation of the villin headpiece; however, such single runs may miss many features of the folding process as it is a heterogeneous reaction involving an ensemble of transition states. Here, we have used a distributed computing implementation to produce tens of thousands of 5-20-ns trajectories (700 μs) to simulate mutants of the designed mini-protein BBA5. The fast relaxation dynamics these predict were compared with the results of laser temperature-jump experiments. Our computational predictions are in excellent agreement with the experimentally determined mean folding times and equilibrium constants. The rapid folding of BBA5 is due to the swift formation of secondary structure. The convergence of experimentally and computationally accessible timescales will allow the comparison of absolute quantities characterizing in vitro and in silico (computed) protein folding.

    View details for DOI 10.1038/nature01160

    View details for Web of Science ID 000179068100045

    View details for PubMedID 12422224

  • Native-like mean structure in the unfolded ensemble of small proteins JOURNAL OF MOLECULAR BIOLOGY Zagrovic, B., Snow, C. D., Khaliq, S., Shirts, M. R., Pande, V. S. 2002; 323 (1): 153-164


    The nature of the unfolded state plays a great role in our understanding of proteins. However, accurately studying the unfolded state with computer simulation is difficult, due to its complexity and the great deal of sampling required. Using a supercluster of over 10,000 processors we have performed close to 800 μs of molecular dynamics simulation in atomistic detail of the folded and unfolded states of three polypeptides from a range of structural classes: the all-alpha villin headpiece molecule, the beta hairpin tryptophan zipper, and a designed alpha-beta zinc finger mimic. A comparison between the folded and the unfolded ensembles reveals that, even though virtually none of the individual members of the unfolded ensemble exhibits native-like features, the mean unfolded structure (averaged over the entire unfolded ensemble) has a native-like geometry. This suggests several novel implications for protein folding and structure prediction as well as new interpretations for experiments which find structure in ensemble-averaged measurements.

    View details for DOI 10.1016/S0022-2836(02)00888-4

    View details for Web of Science ID 000178737200015

    View details for PubMedID 12368107

  • RNA simulations: Probing hairpin unfolding and the dynamics of a GNRA tetraloop JOURNAL OF MOLECULAR BIOLOGY Sorin, E. J., Engelhardt, M. A., Herschlag, D., Pande, V. S. 2002; 317 (4): 493-506


    Simulations of an RNA hairpin containing a GNRA tetraloop were conducted to allow the characterization of its secondary structure formation and dynamics. Ten 10 ns trajectories of the folded hairpin 5'-GGGC[GCAA]GCCU-3' were generated using stochastic dynamics and the GB/SA implicit solvent model at 300 K. Overall, we find the stem to be a very stable subunit of this molecule, whereas multiple loop conformations and transitions between them were observed. These trajectories strongly suggest that extension of the C6 base away from the loop occurs cooperatively with an N-type to S-type sugar pucker conversion in that residue and that similar pucker transitions are necessary to stabilize other looped-out bases. In addition, a short-lived conformer with an extended fourth loop residue (A8) lacking this stabilizing 2'-endo pucker mode was observed. Results of thermal perturbation at 400 K support this model of loop dynamics. Unfolding trajectories were produced using this same methodology at temperatures of 500 to 700 K. The observed unfolding events display three-state behavior kinetically (including native, globular, and unfolded populations) and, based on these observations, we propose a folding mechanism that consists of three distinct events: (i) collapse of the random unfolded structure and sampling of the globular state; (ii) passage into the folded region of configurational space as stem base-pairs form and gain helicity; and (iii) attainment of proper loop geometry and organization of loop pairing and stacking interactions. These results are considered in the context of current experimental knowledge of this and similar nucleic acid hairpins.

    View details for DOI 10.1006/jmbi.2002.5447

    View details for Web of Science ID 000175070100002

    View details for PubMedID 11955005

  • Rapid compaction during RNA folding PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Russell, R., Millett, I. S., Tate, M. W., Kwok, L. W., Nakatani, B., Gruner, S. M., Mochrie, S. G., Pande, V., Doniach, S., Herschlag, D., Pollack, L. 2002; 99 (7): 4266-4271


    We have used small angle x-ray scattering and computer simulations with a coarse-grained model to provide a time-resolved picture of the global folding process of the Tetrahymena group I RNA over a time window of more than five orders of magnitude. A substantial phase of compaction is observed on the low millisecond timescale, and the overall compaction and global shape changes are largely complete within one second, earlier than any known tertiary contacts are formed. This finding indicates that the RNA forms a nonspecifically collapsed intermediate and then searches for its tertiary contacts within a highly restricted subset of conformational space. The collapsed intermediate early in folding of this RNA is grossly akin to molten globule intermediates in protein folding.

    View details for DOI 10.1073/pnas.072589599

    View details for Web of Science ID 000174856000027

    View details for PubMedID 11929997

  • beta-hairpin folding simulations in atomistic detail using an implicit solvent model JOURNAL OF MOLECULAR BIOLOGY Zagrovic, B., Sorin, E. J., Pande, V. 2001; 313 (1): 151-169


    We have used distributed computing techniques and a supercluster of thousands of computer processors to study folding of the C-terminal beta-hairpin from protein G in atomistic detail using the GB/SA implicit solvent model at 300 K. We have simulated a total of nearly 38 μs of folding time and obtained eight complete and independent folding trajectories. Starting from an extended state, we observe relaxation to an unfolded state characterized by non-specific, temporary hydrogen bonding. This is followed by the appearance of interactions between hydrophobic residues that stabilize a bent intermediate. Final formation of the complete hydrophobic core occurs cooperatively at the same time that the final hydrogen bonding pattern appears. The folded hairpin structures we observe all contain a closely packed hydrophobic core and proper beta-sheet backbone dihedral angles, but they differ in backbone hydrogen bonding pattern. We show that this is consistent with the existing experimental data on the hairpin alone in solution. Our analysis also reveals short-lived semi-helical intermediates which define a thermodynamic trap. Our results are consistent with a three-state mechanism with a single rate-limiting step in which a varying final hydrogen bond pattern is apparent, and semi-helical off-pathway intermediates may appear early in the folding process. We include details of the ensemble dynamics methodology and a discussion of our achievements using this new computational device for studying dynamics at the atomic level.

    View details for Web of Science ID 000171816800012

    View details for PubMedID 11601853

  • Mathematical analysis of coupled parallel simulations PHYSICAL REVIEW LETTERS Shirts, M. R., Pande, V. S. 2001; 86 (22): 4983-4987


    A set of parallel replicas of a single simulation can be statistically coupled to closely approximate long trajectories. In many cases, this produces nearly linear speedup over a single simulation (M times faster with M simulations), rendering previously intractable problems within reach of large computer clusters. Interestingly, by varying the coupling of the parallel simulations, it is possible in some systems to obtain greater than linear speedup. The methods are generalizable to any search algorithm with long residence times in intermediate states.

    View details for Web of Science ID 000169013600001

    View details for PubMedID 11384401
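
    The linear-speedup claim can be checked numerically in the simplest case. The sketch below assumes that escape from a state is a single-exponential (memoryless) process with mean waiting time tau, so that the first escape among M independent replicas occurs after a mean time of tau / M. This single-exponential assumption and the function name are illustrative choices made here; the sketch does not attempt to reproduce the coupling schemes or the greater-than-linear speedups discussed in the paper.

```python
import random


def mean_first_event_time(n_replicas, tau, n_trials, rng):
    """Average wall-clock time until the first of n_replicas independent
    replicas escapes, when each escape time is exponential with mean tau."""
    total = 0.0
    for _ in range(n_trials):
        total += min(rng.expovariate(1.0 / tau) for _ in range(n_replicas))
    return total / n_trials


if __name__ == "__main__":
    rng = random.Random(1)
    t_single = mean_first_event_time(1, 1.0, 20000, rng)
    for m in (1, 10, 100):
        t_m = mean_first_event_time(m, 1.0, 20000, rng)
        print(m, "replicas: speedup ~", round(t_single / t_m, 1))
```

    When the waiting-time distribution is not exponential, for example when there is a significant lag phase before any escape can happen, the speedup from this naive scheme falls below M.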

  • A new twist on the helix-coil transition: A non-biological helix with protein-like intermediates and traps JOURNAL OF PHYSICAL CHEMISTRY B Elmer, S., Pande, V. S. 2001; 105 (2): 482-485

    View details for DOI 10.1021/jp0019761

    View details for Web of Science ID 000166490900018

  • Mechanical unfolding of a beta-hairpin using molecular dynamics BIOPHYSICAL JOURNAL Bryant, Z., Pande, V. S., Rokhsar, D. S. 2000; 78 (2): 584-589


    Single-molecule mechanical unfolding experiments have the potential to provide insights into the details of protein folding pathways. To investigate the relationship between force-extension unfolding curves and microscopic events, we performed molecular dynamics simulations of the mechanical unfolding of the C-terminal hairpin of protein G. We have studied the dependence of the unfolding pathway on pulling speed, cantilever stiffness, and attachment points. Under conditions that generate low forces, the unfolding trajectory mimics the untethered, thermally accessible pathway previously proposed based on high-temperature studies. In this stepwise pathway, complete breakdown of backbone hydrogen bonds precedes dissociation of the hydrophobic cluster. Under more extreme conditions, the cluster and hydrogen bonds break simultaneously. Transitions between folding intermediates can be identified in our simulations as features of the calculated force-extension curves.

    View details for Web of Science ID 000085249300004

    View details for PubMedID 10653773

  • Heteropolymer freezing and design: Towards physical models of protein folding REVIEWS OF MODERN PHYSICS Pande, V. S., Grosberg, A. Y., Tanaka, T. 2000; 72 (1): 259-314
  • On the role of conformational geometry in protein folding JOURNAL OF CHEMICAL PHYSICS Du, R., Pande, V. S., Grosberg, A. Y., Tanaka, T., Shakhnovich, E. 1999; 111 (22): 10375-10380
  • Simulation of biomimetic recognition between polymers and surfaces PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Golumbfskie, A. J., Pande, V. S., Chakraborty, A. K. 1999; 96 (21): 11707-11712


    Many biological processes, such as transmembrane signaling and pathogen-host interactions, are initiated by a protein recognizing a specific pattern of binding sites on part of a membrane or cell surface. By recognition, we imply that the polymer quickly finds and then adsorbs strongly on the pattern-matched region and not on others. The development of synthetic systems that can mimic such recognition between polymers and surfaces could have significant impact on advanced applications such as the development of sensors, molecular-scale separation processes, and synthetic viral inhibition agents. Attempting to affect recognition in synthetic systems by copying the detailed chemistries to which nature has been led over millennia of evolution does not seem practical for most applications. This leads us to the following question: Are there any universal strategies that can affect recognition between polymers and surfaces? Such generic strategies may be easier to implement in abiotic applications. We describe results that suggest that biomimetic recognition between synthetic polymers and surfaces is possible by exploiting certain generic strategies, and we elucidate the kinetic mechanisms by which this occurs. Our results suggest convenient model systems for experimental studies of dynamics in free energy landscapes characteristic of frustrated systems.

    View details for Web of Science ID 000083166800007

    View details for PubMedID 10518514

  • Molecular dynamics simulations of unfolding and refolding of a beta-hairpin fragment of protein G PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Pande, V. S., Rokhsar, D. S. 1999; 96 (16): 9062-9067


    We have studied the unfolding and refolding pathway of a beta-hairpin fragment of protein G by using molecular dynamics. Although this fragment is small, it possesses several of the qualities ascribed to small proteins: cooperatively formed beta-sheet secondary structure and a hydrophobic "core" of packed side chains. At high temperatures, we find that the beta-hairpin unfolds through a series of sudden, discrete conformational changes. These changes occur between states that are identified with the folded state, a pair of partially unfolded kinetic intermediates, and the unfolded state. To study refolding at low temperatures, we perform a series of short simulations starting from the transition states of the discrete transitions determined by the unfolding simulations.

    View details for Web of Science ID 000081835500053

    View details for PubMedID 10430895

  • Folding pathway of a lattice model for proteins PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Pande, V. S., Rokhsar, D. S. 1999; 96 (4): 1273-1278


    The folding of a protein-like heteropolymer is studied by using direct simulation of a lattice model that folds rapidly to a well-defined "native" structure. The details of each molecular folding event depend on the random initial conformation as well as the random thermal fluctuations of the polymer. By analyzing the statistical properties of hundreds of folding events, a classical folding "pathway" for such a polymer is found that includes partially folded, on-pathway intermediates that are shown to be metastable equilibrium states of the polymer. These results are discussed in the context of the "classical" and "new" views of folding.

    View details for Web of Science ID 000078698400022

    View details for PubMedID 9990014

  • Is the molten globule a third phase of proteins? PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Pande, V. S., Rokhsar, D. S. 1998; 95 (4): 1490-1494


    The equilibrium properties of proteins are studied by Monte Carlo simulation of two simplified models of protein-like heteropolymers. These models emphasize the polymeric entropy of the fluctuating polypeptide chain. Our calculations suggest a generic phase diagram that contains a thermodynamically distinct "molten globule" state in addition to a rigid native state and a nontrivial unfolded state. The roles of side-chain packing and loop entropy are discussed.

    View details for Web of Science ID 000072115900025

    View details for PubMedID 9465042

  • Pathways for protein folding: is a new view needed? CURRENT OPINION IN STRUCTURAL BIOLOGY Pande, V. S., Grosberg, A. Y., Tanaka, T., Rokhsar, D. S. 1998; 8 (1): 68-79


    Theoretical studies using simplified models of proteins have shed light on the general heteropolymeric aspects of the folding problem. Recent work has emphasized the statistical aspects of folding pathways. In particular, progress has been made in characterizing the ensemble of transition state conformations and elucidating the role of intermediates. These advances suggest a reconciliation between the new ensemble approaches and the classical view of a folding pathway.

    View details for Web of Science ID 000072384400010

    View details for PubMedID 9519298

  • Freezing of compact random heteropolymers with correlated sequence fluctuations JOURNAL OF CHEMICAL PHYSICS Chakraborty, A. K., Shakhnovich, E. I., Pande, V. S. 1998; 108 (4): 1683-1687
  • On the transition coordinate for protein folding JOURNAL OF CHEMICAL PHYSICS Du, R., Pande, V. S., Grosberg, A. Y., Tanaka, T., Shakhnovich, E. S. 1998; 108 (1): 334-350
  • Statistical mechanics of simple models of protein folding and design BIOPHYSICAL JOURNAL Pande, V. S., Grosberg, A. Y., Tanaka, T. 1997; 73 (6): 3192-3210


    It is now believed that the primary equilibrium aspects of simple models of protein folding are understood theoretically. However, current theories often resort to rather heavy mathematics to overcome some technical difficulties inherent in the problem or start from a phenomenological model. To this end, we take a new approach in this pedagogical review of the statistical mechanics of protein folding. The benefit of our approach is a drastic mathematical simplification of the theory, without resort to any new approximations or phenomenological prescriptions. Indeed, the results we obtain agree precisely with previous calculations. Because of this simplification, we are able to present here a thorough and self-contained treatment of the problem. Topics discussed include the statistical mechanics of the random energy model (REM), tests of the validity of REM as a model for heteropolymer freezing, freezing transition of random sequences, phase diagram of designed ("minimally frustrated") sequences, and the degree to which errors in the interactions employed in simulations of either folding or design can still lead to correct folding behavior.

    View details for Web of Science ID A1997YJ67300031

    View details for PubMedID 9414231

  • Molecular dynamics study of the structure organization in a strongly coupled chain of charged particles PHYSICAL REVIEW E Tanaka, M., Grosberg, A. Y., Pande, V. S., Tanaka, T. 1997; 56 (5): 5798-5808
  • Thermodynamics of the coil to frozen globule transition in heteropolymers JOURNAL OF CHEMICAL PHYSICS Pande, V. S., Grosberg, A. Y., Tanaka, T. 1997; 107 (13): 5118-5124
  • How to create polymers with protein-like capabilities: A theoretical suggestion 16th Annual International Conference of the Center-for-Nonlinear-Studies on Landscape Paradigms in Physics and Biology - Concepts, Structures and Dynamics Pande, V. S., Grosberg, A. Y., Tanaka, T. ELSEVIER SCIENCE BV. 1997: 316–21
  • On the theory of folding kinetics for short proteins FOLDING & DESIGN Pande, V. S., Grosberg, A. Y., Tanaka, T. 1997; 2 (2): 109-114


    Recent data have suggested two principles that are central to the work we describe here. First, proteins are the result of evolutionary 'sequence selection' to optimize the energy of the native state. Second, the overlap with the native state is a qualitatively suitable reaction coordinate for modeling folding kinetics. The former principle is bolder and better established. Employing only these two principles, we have constructed a non-phenomenological, correlated energy landscape theory that predicts single barrier protein folding kinetics. Moreover, we are able to analytically describe the nature of the free energetic barrier between the denatured and native states of a protein and to detail the nature of folding kinetics for short proteins. Our model predicts Hammond behavior and also describes how mutations can lead to drastic differences in folding times. We find that folding and unfolding kinetics can be characterized by a single thermodynamic parameter and, moreover, that Monte Carlo simulation data on folding and unfolding rates with different temperatures and mutations collapse with this characterization. Our results also delineate a regime in which kinetics may proceed via a single unique nucleus.

    View details for Web of Science ID A1997WW18500006

    View details for PubMedID 9135983

  • Freezing transition of compact polyampholytes PHYSICAL REVIEW LETTERS Pande, V. S., Grosberg, A. Y., Joerg, C., Kardar, M., Tanaka, T. 1996; 77 (17): 3565-3568
  • Is heteropolymer freezing well described by the random energy model? PHYSICAL REVIEW LETTERS Pande, V. S., Grosberg, A. Y., Joerg, C., Tanaka, T. 1996; 76 (21): 3987-3990
  • Polymer gels that can recognize and recover molecules General Discussion on Gels Tanaka, T., Wang, C. N., Pande, V., Grosberg, A. Y., English, A., Masamune, S., Gold, H., Levy, R., King, K. ROYAL SOC CHEMISTRY. 1995: 201–206


    We suggest a procedure to synthesize polymers with characteristics similar to those observed in globular proteins: renaturability and the existence of an "active site" capable of specifically recognizing a given target molecule. This procedure is investigated by computer simulation, which finds a yield of up to 65%. We believe that, in principle, this scheme can be realized in vitro. The applicability of this approach as a model of prebiotic synthesis in vivo is also discussed.

    View details for Web of Science ID A1994PY29400128

    View details for PubMedID 7809158



    The sequences, or primary structures, of existing biopolymers--in particular, proteins--are believed to be a product of evolution. Are the sequences random? If not, what is the character of this nonrandomness? To explore the statistics of protein sequences, we use the idea of mapping the sequence onto the trajectory of a random walk, originally proposed by Peng et al. [Peng, C.-K., Buldyrev, S. V., Goldberger, A. L., Havlin, S., Sciortino, F., Simons, M. & Stanley, H. E. (1992) Nature (London) 356, 168-170] in their analysis of DNA sequences. Using three different mappings, corresponding to three basic physical interactions between amino acids, we found pronounced deviations from pure randomness, and these deviations seem directed toward minimization of the energy of the three-dimensional structure. We consider this result as evidence for a physically driven stage of evolution.

    View details for Web of Science ID A1994PY29400127

    View details for PubMedID 7809157

  • PHASE-DIAGRAM OF IMPRINTED COPOLYMERS JOURNAL DE PHYSIQUE II Pande, V. S., Grosberg, A. Y., Tanaka, T. 1994; 4 (10): 1771-1784