Combining signal and sequence to detect RNA polymerase initiation in ATAC-seq data

Bibliographic Details
Published in: PLOS ONE Vol. 15; no. 4; p. e0232332
Main Authors: Tripodi, Ignacio J, Chowdhury, Murad, Gruca, Margaret, Dowell, Robin D
Format: Journal Article
Language:English
Published: United States, Public Library of Science (PLoS), 30-04-2020
Description
Summary: The assay for transposase-accessible chromatin followed by sequencing (ATAC-seq) is an inexpensive protocol for measuring open chromatin regions. ATAC-seq is also relatively simple and requires fewer cells than many other high-throughput sequencing protocols. It is therefore tractable in numerous settings where other high-throughput assays are challenging or impossible, which makes it important to understand the limits of what can be inferred from ATAC-seq data. In this work, we leverage ATAC-seq to predict the presence of nascent transcription. Nascent transcription assays are the current gold standard for identifying regions of active transcription, including markers for functional transcription factor (TF) binding. We combine mapped short reads from ATAC-seq with the underlying peak sequence to determine regions of active transcription genome-wide. We show that a hybrid signal/sequence representation, classified using recurrent neural networks (RNNs), can identify these regions across different cell types.
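As an illustration of what a hybrid signal/sequence representation could look like, the sketch below pairs a one-hot encoding of each base in a peak with the read coverage at that position, giving one five-value vector per base that a recurrent network could consume step by step. This record does not describe the authors' actual encoding, so the function name `hybrid_encoding`, the five-channel layout, and the max-scaling of coverage are all assumptions made for this example.

```python
BASES = "ACGT"

def hybrid_encoding(sequence, coverage):
    """Hypothetical helper: one row per base, [A, C, G, T, scaled coverage].

    Assumption: coverage is a per-base read depth for the same peak,
    scaled here by the peak's maximum depth so values fall in [0, 1].
    """
    if len(sequence) != len(coverage):
        raise ValueError("sequence and coverage must have the same length")
    peak = max(coverage, default=0)
    rows = []
    for base, depth in zip(sequence.upper(), coverage):
        # Unknown bases such as N get an all-zero one-hot part.
        one_hot = [1.0 if base == b else 0.0 for b in BASES]
        signal = depth / peak if peak > 0 else 0.0
        rows.append(one_hot + [signal])
    return rows

example = hybrid_encoding("ACGTN", [0, 2, 4, 2, 0])
```

Each row of `example` is one time step; a sequence model would read the rows in order and emit a single label for the peak, such as whether nascent transcription is predicted there.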
Competing Interests: One author (RDD) of this publication is a founder and scientific advisor for Arpeggio Biosciences. Dr. Dowell is not employed by Arpeggio but rather consults occasionally with the company. We also note that no aspect of this work was funded by or influenced in any way by the company. This work is funded entirely by NIH R01 GM125871. No aspect of our funding alters our adherence to PLOS ONE policies on sharing data and materials.
ISSN:1932-6203
DOI:10.1371/journal.pone.0232332