Accelerating molecular simulations of proteins using Bayesian inference on weak information

Bibliographic Details
Published in: Proceedings of the National Academy of Sciences (PNAS), Vol. 112, no. 38, pp. 11846-11851
Main Authors: Perez, Alberto; MacCallum, Justin L.; Dill, Ken A.
Format: Journal Article
Language: English
Published: United States: National Academy of Sciences, 22-09-2015
Description
Summary: Atomistic molecular dynamics (MD) simulations of protein molecules are too computationally expensive to predict most native structures from amino acid sequences. Here, we integrate “weak” external knowledge into folding simulations to predict protein structures, given their sequence. For example, we instruct the computer “to form a hydrophobic core,” “to form good secondary structures,” or “to seek a compact state.” This kind of information has been too combinatoric, nonspecific, and vague to help guide MD simulations before. Within atomistic replica-exchange molecular dynamics (REMD), we develop a statistical mechanical framework, modeling using limited data with coarse physical insight(s) (MELD + CPI), for harnessing weak information. As a test, we apply MELD + CPI to predict the native structures of 20 small proteins. MELD + CPI samples to within less than 3.2 Å from native for all 20 and correctly chooses the native structures (<4 Å) for 15 of them, including ubiquitin, a millisecond folder. MELD + CPI is up to five orders of magnitude faster than brute-force MD, satisfies detailed balance, and should scale well to larger proteins. MELD + CPI may be useful where physics-based simulations are needed to study protein mechanisms and populations and where we have some heuristic or coarse physical knowledge about states of interest.
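The summary states that the method runs within REMD and satisfies detailed balance. As a rough illustration of what that property means for a replica swap, the sketch below implements the textbook temperature-exchange Metropolis criterion. It is not the authors' MELD + CPI code, and this record does not say which exchange scheme they use; the function names, the temperature-only form of the criterion, and the example energies and temperatures are all assumptions made for illustration.

```python
import math
import random

# Boltzmann constant in kcal/(mol*K); the unit choice is an assumption.
K_B = 0.0019872041


def swap_acceptance(energy_i, energy_j, temp_i, temp_j):
    """Metropolis probability of exchanging the configurations held by
    replica i (at temp_i, energy energy_i) and replica j (at temp_j,
    energy energy_j) in plain temperature replica exchange.

    Hypothetical helper, not part of any published MELD API.
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    # Log of the ratio of joint Boltzmann weights after vs. before the swap.
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(energy_i, energy_j, temp_i, temp_j, rng=random):
    """Return True if the proposed exchange is accepted."""
    return rng.random() < swap_acceptance(energy_i, energy_j, temp_i, temp_j)


if __name__ == "__main__":
    # Illustrative numbers only: a 300 K replica and a 320 K replica.
    forward = swap_acceptance(-100.0, -95.0, 300.0, 320.0)
    # After a swap the energies are exchanged, so the reverse move sees them flipped.
    reverse = swap_acceptance(-95.0, -100.0, 300.0, 320.0)
    print("forward acceptance:", forward)
    print("reverse acceptance:", reverse)
    print("forward / reverse:", forward / reverse)
```

Detailed balance holds here because the ratio of forward to reverse acceptance probabilities equals the ratio of the joint Boltzmann weights of the swapped and unswapped states, so exchanges leave each replica's equilibrium distribution unchanged.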
Bibliography:
Reviewers included: J.L., University of Illinois at Chicago.
A.P. and J.L.M. contributed equally to this work.
Author contributions: A.P., J.L.M., and K.A.D. designed research; A.P. performed research; A.P. and J.L.M. analyzed data; and A.P., J.L.M., and K.A.D. wrote the paper.
Contributed by Ken A. Dill, August 7, 2015 (sent for review June 27, 2015; reviewed by Jie Liang)
ISSN: 0027-8424, 1091-6490
DOI: 10.1073/pnas.1515561112