BoostMe accurately predicts DNA methylation values in whole-genome bisulfite sequencing of multiple human tissues


Bibliographic Details
Published in: BMC Genomics Vol. 19; no. 1; p. 390
Main Authors: Zou, Luli S, Erdos, Michael R, Taylor, D Leland, Chines, Peter S, Varshney, Arushi, Parker, Stephen C J, Collins, Francis S, Didion, John P
Format: Journal Article
Language: English
Published: England BioMed Central Ltd 23-05-2018
Description
Summary: Bisulfite sequencing is widely employed to study the role of DNA methylation in disease; however, the data suffer from biases due to coverage depth variability. Imputation of methylation values at low-coverage sites may mitigate these biases while also identifying important genomic features associated with predictive power. Here we describe BoostMe, a method for imputing low-quality DNA methylation estimates within whole-genome bisulfite sequencing (WGBS) data. BoostMe uses a gradient boosting algorithm, XGBoost, and leverages information from multiple samples for prediction. We find that BoostMe outperforms existing algorithms in speed and accuracy when applied to WGBS of human tissues. Furthermore, we show that imputation improves concordance between WGBS and the MethylationEPIC array at low WGBS depth, suggesting improved WGBS accuracy after imputation. Our findings support the use of BoostMe as a preprocessing step for WGBS analysis.
ISSN: 1471-2164
DOI: 10.1186/s12864-018-4766-y
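
The summary describes imputing low-coverage methylation values by drawing on other samples. The sketch below illustrates only that general idea, and it is not BoostMe itself: a simple cross-sample mean stands in for the gradient boosting model. The function name `impute_low_coverage`, the data layout, and the depth threshold `MIN_DEPTH` are all hypothetical and are not taken from the article.

```python
from statistics import mean

MIN_DEPTH = 10  # hypothetical coverage threshold, not a value from the article


def impute_low_coverage(meth, depth, min_depth=MIN_DEPTH):
    """Replace low-coverage methylation values with a cross-sample mean.

    meth[s][i]  : methylation fraction (0..1) for sample s at CpG site i
    depth[s][i] : read depth for sample s at CpG site i

    A value is treated as unreliable when its depth is below min_depth.
    It is replaced by the mean of the well-covered values from the other
    samples at the same site, or by None if no other sample is well covered.
    """
    n_samples = len(meth)
    n_sites = len(meth[0])
    out = [row[:] for row in meth]  # copy so the input is not modified
    for i in range(n_sites):
        for s in range(n_samples):
            if depth[s][i] >= min_depth:
                continue  # well covered: keep the observed value
            donors = [
                meth[t][i]
                for t in range(n_samples)
                if t != s and depth[t][i] >= min_depth
            ]
            out[s][i] = mean(donors) if donors else None
    return out


# Toy data: 3 samples x 3 CpG sites (made-up numbers for illustration).
meth = [
    [1.0, 0.5, 0.0],
    [0.75, 0.25, 0.0],
    [1.0, 0.75, 0.5],
]
depth = [
    [30, 2, 25],
    [28, 15, 1],
    [35, 20, 3],
]
imputed = impute_low_coverage(meth, depth)
```

In a real pipeline the cross-sample mean would be replaced by a trained model, and the choice of threshold would depend on the data; both are left open here.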