Protein Structure Accuracy Prediction with Deep Learning and Its Application to Structure Prediction and Design
Main Author: | |
---|---|
Format: | Dissertation |
Language: | English |
Published: | ProQuest Dissertations & Theses, 01-01-2022 |
Summary: | Understanding the rules of protein folding has long been one of the central goals of computational biology. Deep learning is increasingly used in protein modeling because it can learn complex functions from large amounts of protein geometry data. To better understand these rules, we developed neural networks (DeepAccNet and Pluto) that estimate the error in protein models. In other words, these networks estimate how much a computationally modeled protein structure deviates from its experimentally determined conformation. For training, we sampled approximately two million conformations from 21,000 protein sequences; the conformations occupy different local energy minima and span a wide range of errors. The network uses 3D convolutions to evaluate local atomic environments, followed by 2D convolutions that provide their global context, and it outperforms other methods that similarly predict the accuracy of protein structure models. Overall accuracy predictions for X-ray and cryo-EM structures in the PDB correlate with their resolution. The network should be broadly helpful for assessing the accuracy of both predicted structure models and experimentally determined structures, and for identifying specific regions likely to be in error. The DeepAccNet methods were selected as top-performing methods in the estimation of model accuracy (EMA) category in CASP14. We extended the accuracy prediction models from proteins to more general chemistry by training graph neural networks on a wide variety of protein and non-protein datasets. We showed that the resulting framework (GAAP) successfully estimates the accuracy of non-protein molecules, such as peptides and protein-DNA complexes. Our results illustrate how deep learning can improve the efficiency and accuracy of large-scale simulations for both modeling and designing molecules. |
---|---|
ISBN: | 9798426801363 |
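
The summary describes networks that estimate how far a modeled structure deviates from its experimentally determined conformation. As a rough illustration of what such a per-residue deviation could look like, the sketch below computes a simplified lDDT-style score from two sets of coordinates. This record does not state which error measure the networks are trained to predict, so the choice of measure, the function names, the 15 Å cutoff, and the tolerance thresholds are all assumptions made for illustration, not the dissertation's method.

```python
import math


def pairwise_distances(coords):
    """Return the full matrix of Euclidean distances between 3D points."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def per_residue_lddt(model, reference, cutoff=15.0,
                     thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Simplified lDDT-style score in [0, 1] for each residue.

    Assumption: one representative atom per residue, and model and
    reference list residues in the same order. For residue i, every
    partner j closer than `cutoff` in the reference is checked: a
    distance counts as preserved at a threshold if the model distance
    differs from the reference distance by less than that threshold.
    """
    if len(model) != len(reference):
        raise ValueError("model and reference must have the same length")

    d_model = pairwise_distances(model)
    d_ref = pairwise_distances(reference)
    n = len(reference)

    scores = []
    for i in range(n):
        partners = [j for j in range(n) if j != i and d_ref[i][j] < cutoff]
        if not partners:
            # No neighbours to compare against; report 0.0 by convention here.
            scores.append(0.0)
            continue
        preserved = sum(
            1
            for j in partners
            for t in thresholds
            if abs(d_model[i][j] - d_ref[i][j]) < t
        )
        scores.append(preserved / (len(partners) * len(thresholds)))
    return scores


# Hypothetical toy data: four residues on a line, 3.8 Å apart.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
# The same chain with the last residue displaced by 10 Å.
perturbed = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 10.0, 0.0)]

identical_scores = per_residue_lddt(reference, reference)
perturbed_scores = per_residue_lddt(perturbed, reference)
```

In this toy case the displaced residue receives the lowest score, and its neighbours are penalised less. That is the kind of region-specific signal the summary refers to when it mentions identifying regions likely to be in error. An accuracy-estimation network would have to predict such values from the model alone, since the reference structure is not available at prediction time.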