Natural products subsets: Generation and characterization
Published in: | Artificial intelligence in the life sciences Vol. 3; p. 100066 |
---|---|
Main Authors: | , |
Format: | Journal Article |
Language: | English |
Published: | Elsevier B.V, 01-12-2023 |
Summary: | Natural products are attractive for drug discovery because of their distinctive chemical structures: a large overall fraction of sp³ carbon atoms and chiral centers (both features associated with structural complexity), large chemical scaffolds, and diverse functional groups. Furthermore, natural products are used in de novo design and have inspired the development of pseudo-natural products using generative models. Public databases such as the Collection of Open NatUral ProdUcTs (COCONUT) and the Universal Natural Product Database (UNPD) are rich sources of structures for generative models and other applications. In this work, we report the selection and characterization of the most diverse natural products from the UNPD using the MaxMin algorithm. The resulting subsets, containing 14,994, 7,497, and 4,998 compounds, are publicly available at https://github.com/DIFACQUIM/Natural-products-subsets-generation. We anticipate that the subsets will be especially useful to research groups building generative models based on natural products, particularly those with limited access to extensive supercomputer resources. |
---|---|
ISSN: | 2667-3185 |
DOI: | 10.1016/j.ailsci.2023.100066 |
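
The summary names the MaxMin algorithm for picking diverse compounds but gives no details. The following is a minimal sketch of the general MaxMin idea, not the authors' implementation. It assumes each compound is already encoded as a fingerprint stored as a set of on-bit indices and that dissimilarity is measured by Tanimoto distance; the record does not state which fingerprint or metric the authors used. The names `tanimoto_distance`, `maxmin_pick`, and `toy_fps` are hypothetical and introduced only for illustration.

```python
from typing import FrozenSet, List, Sequence


def tanimoto_distance(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Return 1 - |a & b| / |a | b| for fingerprints stored as sets of on-bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def maxmin_pick(
    fps: Sequence[FrozenSet[int]], n_pick: int, seed_index: int = 0
) -> List[int]:
    """Greedy MaxMin selection (illustrative sketch, assumed details).

    Starting from seed_index, repeatedly add the compound whose distance
    to its nearest already-picked compound is largest.
    """
    if not 0 < n_pick <= len(fps):
        raise ValueError("n_pick must be between 1 and len(fps)")
    if not 0 <= seed_index < len(fps):
        raise ValueError("seed_index is out of range")

    picked = [seed_index]
    chosen = {seed_index}
    # min_dist[i]: distance from compound i to its nearest picked compound
    min_dist = [tanimoto_distance(fp, fps[seed_index]) for fp in fps]

    while len(picked) < n_pick:
        # Candidate that is farthest from everything picked so far
        nxt = max(
            (i for i in range(len(fps)) if i not in chosen),
            key=lambda i: min_dist[i],
        )
        picked.append(nxt)
        chosen.add(nxt)
        # Only the newly picked compound can shrink the nearest-pick distances
        for i, fp in enumerate(fps):
            d = tanimoto_distance(fp, fps[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return picked


# Toy fingerprints (made-up bit sets, not real natural products)
toy_fps = [
    frozenset({1, 2, 3}),
    frozenset({1, 2, 4}),
    frozenset({7, 8, 9}),
    frozenset({1, 2, 3, 4}),
    frozenset({3, 7, 8, 10}),
]

print(maxmin_pick(toy_fps, 3))  # [0, 2, 4]
```

In this toy run the picker skips compounds 1 and 3, which overlap heavily with the seed, and selects the two that share the fewest bits with anything already chosen. The sketch evaluates one pass of distances per pick, so cost grows with the library size times the subset size; for a database the size of the UNPD, a cheminformatics toolkit's built-in picker would likely be the more practical choice.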