Enumeration of 166 Billion Organic Small Molecules in the Chemical Universe Database GDB-17

Bibliographic Details
Published in: Journal of chemical information and modeling, Vol. 52, no. 11, pp. 2864-2875
Main Authors: Ruddigkeit, Lars; van Deursen, Ruud; Blum, Lorenz C; Reymond, Jean-Louis
Format: Journal Article
Language: English
Published: Washington, DC: American Chemical Society, 26-11-2012
Description
Summary: Drug molecules consist of a few tens of atoms connected by covalent bonds. How many such molecules are possible in total and what is their structure? This question is of pressing interest in medicinal chemistry to help solve the problems of drug potency, selectivity, and toxicity and reduce attrition rates by pointing to new molecular series. To better define the unknown chemical space, we have enumerated 166.4 billion molecules of up to 17 atoms of C, N, O, S, and halogens forming the chemical universe database GDB-17, covering a size range containing many drugs and typical for lead compounds. GDB-17 contains millions of isomers of known drugs, including analogs with high shape similarity to the parent drug. Compared to known molecules in PubChem, GDB-17 molecules are much richer in nonaromatic heterocycles, quaternary centers, and stereoisomers, densely populate the third dimension in shape space, and represent many more scaffold types.
ISSN: 1549-9596, 1549-960X
DOI: 10.1021/ci300415d