Motivation: DNA methylation is an epigenetic change occurring in genomic CpG sequences that contributes to the regulation of gene transcription in both normal and malignant cells. Next-generation sequencing has been used to characterize DNA methylation status at the genome scale, but it suffers either from high sequencing cost, in the case of whole-genome bisulfite sequencing, or from reduced resolution (the inability to precisely define which CpGs are methylated), in the case of capture-based techniques.
Results: Here we present a computational method that computes nucleotide-resolution methylation values from capture-based data by incorporating fragment length profiles into a model of methylation analysis. We demonstrate that it compares favorably with nucleotide-resolution bisulfite sequencing and has better predictive power with respect to a reference than the window-based methods often used for enrichment data. The described method was used to generate the methylation data that, in tandem with gene expression data, yielded a novel and clinically significant gene signature in acute myeloid leukemia. In addition, we introduce a complementary statistical method that uses these nucleotide-resolution methylation data to detect differentially methylated features.
© The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.