Computational inference of H3K4me3 and H3K27ac domain length

Data and experimental code for analyses of epigenetic mark domain lengths.

Julian Zubek, Michael Stitzel, Duygu Ucar*, Dariusz Plewczynski*

* Corresponding authors

Email: Duygu.ucar<...>jax.org (DU), dariuszplewczynski<...>uw.edu.pl (DP)

Abstract

Recent epigenomic studies have shown that the length of the DNA region covered by an epigenetic mark is not merely a byproduct of the assay technology but has functional implications for that locus. For example, expanded regions marked by enhancer-specific histone modifications, such as acetylation of histone H3 lysine 27 (H3K27ac), coincide with cell-specific enhancers known as super or stretch enhancers. Similarly, promoters of genes critical for cell-specific functions are marked by expanded H3K4me3 domains in the cognate cell type; these domains can span from 4-5 kb up to 40-50 kb. These expanded H3K4me3 domains are known as buffer domains or super promoters. To ask what correlates with, and potentially regulates, the length of deposition patterns for these two important histone marks, we built Random Forest regression models and computationally identified genomic and epigenomic patterns in multiple ENCODE cell lines and human pancreatic islets. We found that certain epigenetic marks and transcription factors, conserved across different cell types, explain the variability in the length of H3K4me3 and H3K27ac marks, which implies that the lengths of these two epigenetic marks are tightly regulated.
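
To make the modeling idea concrete, the following is a minimal sketch of fitting a Random Forest regressor to predict a domain length from per-domain features and then ranking those features by importance. It is not the code from this project: it assumes scikit-learn and NumPy are installed, the data are synthetic, and the feature names (`feature_a` to `feature_d`), the sample size, and the hyperparameters are hypothetical placeholders chosen only for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT from the study): each row is one domain,
# each column a hypothetical feature such as a co-occurring mark or TF signal.
feature_names = ["feature_a", "feature_b", "feature_c", "feature_d"]
n_domains = 500
X = rng.normal(size=(n_domains, len(feature_names)))

# Hypothetical target standing in for a (log-scaled) domain length,
# driven mostly by the first feature plus a small amount of noise.
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=n_domains)

# Hold out a quarter of the domains to estimate predictive performance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Illustrative hyperparameters, not the settings used by the authors.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Coefficient of determination on the held-out domains.
r2 = model.score(X_test, y_test)

# Impurity-based importances, sorted from most to least important.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

print(f"held-out R^2: {r2:.2f}")
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

In a sketch like this, features with high importance are candidates for explaining variability in domain length. Impurity-based importances can be biased when features are correlated, so a real analysis would likely cross-check them, for example with permutation importance.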