NGS‐Integrator: A Tool for Combining Information from Multiple Genome‐Wide NGS Data Tracks Using Minimum Bayes Factors

Bibliographic Details
Published in: The FASEB Journal Vol. 33; no. S1; p. 637.2
Main Authors: Jung, Hyun Jun, Wen, Bronte, Chen, Lihe, Saeed, Fahad, Knepper, Mark A
Format: Journal Article
Language: English
Published: The Federation of American Societies for Experimental Biology 01-04-2019
Description
Summary: Genome‐wide studies that generate multiple high‐throughput next‐generation sequencing (NGS) datasets to identify genomic DNA elements for gene transcription require data integration methods to minimize complexity and false‐positive findings. This can involve integrating multiple genome‐wide datasets generated from the same type or from different types of high‐throughput NGS techniques. Although several strategies for integrating multiple genome‐wide NGS datasets based on peak calling tools have been developed, these conventional methods are typically applied to individual replicates and not to the aggregate data from multiple datasets. NGS‐integrator, a Java‐based tool, integrates genome‐wide NGS datasets into a single data track for a genome browser based on the minimum Bayes Factor (MBF) calculated from the signal‐to‐noise ratio. NGS‐integrator consists of two elements, “Calculator” and “Integrator”. The “Calculator” element calculates the complement of the MBF at each genomic position, estimating the noise as the median of the values across a sliding window straddling the position at which the signal value is being calculated. This calculation assumes that the features being detected are relatively sparsely represented along the genome, allowing the median to be representative of the true noise level. The “Integrator” element calculates the joint probability across the multiple NGS data tracks based on the complement of the MBF calculated with “Calculator”. We used NGS‐integrator to integrate three types of data generated in our laboratory in cultured mpkCCD kidney cells, viz. ChIP‐Seq for the transcription factor ELF1, ATAC‐Seq, and ChIP‐Seq for the histone modification H3K27Ac and RNA polymerase II.
Support or Funding Information: The work was supported by the Division of Intramural Research, National Heart, Lung, and Blood Institute projects ZIA‐HL001285 (to MAK) and ZIA‐HL006129 (to MAK). BW was supported by the NHLBI Summer Internship Program (June–August 2018).
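The two elements described in the summary can be illustrated with a minimal Python sketch. It is not the tool's Java code, and several details are assumptions that the abstract does not specify: the MBF is taken in its common Gaussian form, exp(-z²/2), with z set to the ratio of the signal to the sliding-window median; the window size of 5 is arbitrary; and the joint probability is taken as the per-position product of the complements, which treats the tracks as independent. The function names `complement_mbf` and `integrate` are hypothetical.

```python
import math
from statistics import median


def complement_mbf(signal, window=5):
    """Return 1 - MBF at each position of one signal track.

    Assumptions (not stated in the abstract): MBF = exp(-z^2 / 2),
    z = signal / noise, and noise = median of a window centered on
    the position (truncated at the ends of the track).
    """
    half = window // 2
    result = []
    for i, value in enumerate(signal):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        noise = median(signal[lo:hi])
        if noise <= 0:
            # Hypothetical guard for an empty background window.
            z = float("inf") if value > 0 else 0.0
        else:
            z = value / noise
        mbf = math.exp(-0.5 * z * z)
        result.append(1.0 - mbf)
    return result


def integrate(tracks):
    """Combine per-track complements into one track.

    Assumes independence between tracks, so the joint value at each
    position is the product of the per-track values.
    """
    return [math.prod(values) for values in zip(*tracks)]


# Toy example: one sparse peak on a flat background in two tracks.
track_a = complement_mbf([1, 1, 1, 10, 1, 1, 1])
track_b = complement_mbf([2, 2, 2, 30, 2, 2, 2])
combined = integrate([track_a, track_b])
```

In this toy input the peak occupies a single position, so the window median stays at the background level and the combined value at the peak approaches 1. That reflects the sparsity assumption noted in the summary: if peaks filled most of a window, the median would no longer represent noise.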
This abstract is from the Experimental Biology 2019 Meeting. There is no full-text article associated with this abstract published in The FASEB Journal.
ISSN: 0892-6638, 1530-6860
DOI: 10.1096/fasebj.2019.33.1_supplement.637.2