PLAS-5k: Dataset of Protein-Ligand Affinities from Molecular Dynamics for Machine Learning Applications
Published in: | Scientific Data Vol. 9; no. 1; pp. 548 - 10 |
---|---|
Format: | Journal Article |
Language: | English |
Published: | London; Nature Publishing Group UK; 07-09-2022; Nature Publishing Group; Nature Portfolio |
Summary: | Computational methods, and more recently machine learning methods, have played a key role in structure-based drug design. Although several benchmarking datasets are available for machine learning applications in virtual screening, accurate prediction of the binding affinity of a protein-ligand complex remains a major challenge. New datasets that enable the development of models that predict binding affinities better than state-of-the-art scoring functions are therefore important. For the first time, we have developed a dataset, PLAS-5k, comprising 5000 protein-ligand complexes chosen from the PDB database. The dataset consists of binding affinities along with energy components such as electrostatic, van der Waals, polar solvation, and non-polar solvation energies, calculated from molecular dynamics simulations using the MMPBSA (Molecular Mechanics Poisson-Boltzmann Surface Area) method. The calculated binding affinities outperformed docking scores and showed a good correlation with the available experimental values. The availability of energy components may enable optimization of desired components during machine learning-based drug design. Further, the OnionNet model has been retrained on the PLAS-5k dataset and is provided as a baseline for the prediction of binding affinities.
Measurement(s): Binding Affinity
Technology Type(s): Molecular dynamics simulation/MM-PBSA
Factor Type(s): 3D-protein structures
Sample Characteristic - Organism: NA
Sample Characteristic - Environment: NA
Sample Characteristic - Location: NA |
---|---|
ISSN: | 2052-4463 |
DOI: | 10.1038/s41597-022-01631-9 |
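
The Summary mentions binding affinities reported alongside four energy components and a correlation with experimental values. As a rough illustration only, the sketch below sums the four terms in the usual MM-PBSA fashion (with no entropy term, which is an assumption here and not confirmed by this record) and computes a Pearson correlation. The class name, field names, and all numbers are hypothetical and are not taken from PLAS-5k.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class EnergyComponents:
    """Per-complex MM-PBSA terms in kcal/mol (field names are hypothetical)."""
    electrostatic: float
    van_der_waals: float
    polar_solvation: float
    nonpolar_solvation: float

    def binding_energy(self) -> float:
        # Sum of the four components; an entropy term is deliberately left out
        # (an assumption of this sketch, not a statement about the dataset).
        return (self.electrostatic + self.van_der_waals
                + self.polar_solvation + self.nonpolar_solvation)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up numbers for illustration only; not taken from PLAS-5k.
complexes = [
    EnergyComponents(-30.0, -40.0, 50.0, -5.0),
    EnergyComponents(-10.0, -35.0, 30.0, -4.0),
    EnergyComponents(-55.0, -20.0, 60.0, -3.0),
]
calculated = [c.binding_energy() for c in complexes]
experimental = [-9.5, -8.0, -7.0]  # hypothetical experimental values

print(calculated)
print(round(pearson_r(calculated, experimental), 3))
```

Keeping the components as separate fields, instead of storing only the total, mirrors the point in the Summary that individual energy terms may themselves be optimization targets.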