DeepRiPP integrates multiomics data to automate discovery of novel ribosomally synthesized natural products
Published in: | Proceedings of the National Academy of Sciences - PNAS Vol. 117; no. 1; pp. 371 - 380 |
---|---|
Format: | Journal Article |
Language: | English |
Published: | United States: National Academy of Sciences, 07-01-2020 |
Series: | PNAS Plus |
Summary: | Microbial natural products represent a rich resource of evolved chemistry that forms the basis for the majority of pharmacotherapeutics. Ribosomally synthesized and posttranslationally modified peptides (RiPPs) are a particularly interesting class of natural products noted for their unique mode of biosynthesis and biological activities. Analyses of sequenced microbial genomes have revealed an enormous number of biosynthetic loci encoding RiPPs but whose products remain cryptic. In parallel, analyses of bacterial metabolomes typically assign chemical structures to only a minority of detected metabolites. Aligning these 2 disparate sources of data could provide a comprehensive strategy for natural product discovery. Here we present DeepRiPP, an integrated genomic and metabolomic platform that employs machine learning to automate the selective discovery and isolation of novel RiPPs. DeepRiPP includes 3 modules. The first, NLPPrecursor, identifies RiPPs independent of genomic context and neighboring biosynthetic genes. The second module, BARLEY, prioritizes loci that encode novel compounds, while the third, CLAMS, automates the isolation of their corresponding products from complex bacterial extracts. DeepRiPP pinpoints target metabolites using large-scale comparative metabolomics analysis across a database of 10,498 extracts generated from 463 strains. We apply the DeepRiPP platform to expand the landscape of novel RiPPs encoded within sequenced genomes and to discover 3 novel RiPPs, whose structures are exactly as predicted by our platform. By building on advances in machine learning technologies, DeepRiPP integrates genomic and metabolomic data to guide the isolation of novel RiPPs in an automated manner. |
---|---|
Bibliography: | Author contributions: N.J.M., W.K.M., C.A.D., M.A.S., M.J.C., and N.A.M. designed research; N.J.M., W.K.M., C.A.D., M.J.C., Haoxin Li, K.D., M.G., and C.J. performed research; N.J.M., W.K.M., C.A.D., M.A.S., C.J., and N.A.M. analyzed data; and N.J.M., W.K.M., M.A.S., and N.A.M. wrote the paper. Edited by Hongzhe Li, University of Pennsylvania School of Medicine, and accepted by Editorial Board Member Bin Yu November 15, 2019 (received for review January 28, 2019). N.J.M. and W.K.M. contributed equally to this work. |
ISSN: | 0027-8424 1091-6490 |
DOI: | 10.1073/pnas.1901493116 |