Predicting nucleosome positioning using multiple evidence tracks

Sheila M. Reynolds, Zhiping Weng, Jeff A. Bilmes and William Stafford Noble

Research in Computational Molecular Biology. Lecture Notes in Computer Science, 2010, Volume 6044/2010, pp. 441-455.


We describe a probabilistic model, implemented as a dynamic Bayesian network, that can be used to predict nucleosome positioning along a chromosome based on one or more genomic input tracks containing position-specific information (evidence). Previous models have either made predictions based on primary DNA sequence alone, or have been used to infer nucleosome positions from experimental data. Our framework permits the combination of these two distinct types of information. We show how this flexible framework can be used to make predictions based on either sequence-model scores or experimental data alone, or by using the two in combination to interpret the experimental data and fill in gaps. The model output represents the posterior probability, at each position along the chromosome, that a nucleosome core overlaps that position, given the evidence. This posterior probability is computed by integrating the information contained in the input evidence tracks along the entire input sequence, and fitting the evidence to a simple grammar of alternating nucleosome cores and linkers. In addition to providing a novel mechanism for the prediction of nucleosome positioning from arbitrary heterogeneous data sources, this framework is also applicable to other genomic segmentation tasks in which local scores are available from models or from data that can be interpreted as defining a probability assignment over labels at that position. The ability to combine sequence-based predictions and data from experimental assays is a significant and novel contribution to the ongoing research regarding the primary structure of chromatin and its effects upon gene regulation.
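
To make the posterior computation concrete, the following is a minimal sketch and not the authors' dynamic Bayesian network. It assumes a simplified hidden Markov chain in which a linker state alternates with a fixed-length nucleosome core, and in which a single evidence track gives, at each position, a local probability that the position lies in a core. The function name `core_posterior`, the parameters `core_len` and `p_start`, and the short core length used here are all hypothetical choices made for illustration. The forward-backward pass integrates the evidence along the entire sequence and returns, for each position, the posterior probability that a core overlaps it.

```python
from typing import List


def core_posterior(evidence: List[float], core_len: int = 5,
                   p_start: float = 0.1) -> List[float]:
    """Posterior probability that a nucleosome core overlaps each position.

    evidence[t] is a local score in [0, 1], read as the probability that
    position t is covered by a core. State 0 is the linker; states
    1..core_len are the successive positions inside one core.
    (Illustrative assumptions: fixed core length, one evidence track.)
    """
    n, k = len(evidence), core_len
    if n == 0:
        return []
    eps = 1e-6
    ev = [min(max(e, eps), 1.0 - eps) for e in evidence]  # avoid zero likelihoods

    def emit(t: int) -> List[float]:
        # The linker state sees 1 - score; every core state sees the score.
        return [1.0 - ev[t]] + [ev[t]] * k

    # Forward pass with per-position rescaling.
    alpha, scale = [], []
    for t in range(n):
        if t == 0:
            prior = [1.0 / (k + 1)] * (k + 1)  # the sequence may start mid-core
        else:
            prev = alpha[-1]
            prior = [prev[0] * (1.0 - p_start) + prev[k],  # stay in linker, or exit a core
                     prev[0] * p_start]                     # linker -> first core position
            prior += prev[1:k]                              # advance through the core
        a = [p * e for p, e in zip(prior, emit(t))]
        c = sum(a)
        alpha.append([x / c for x in a])
        scale.append(c)

    # Backward pass, rescaled with the same constants.
    beta = [[1.0] * (k + 1) for _ in range(n)]
    for t in range(n - 2, -1, -1):
        nxt = [b * e for b, e in zip(beta[t + 1], emit(t + 1))]
        b = [(1.0 - p_start) * nxt[0] + p_start * nxt[1]]  # from the linker
        b += nxt[2:]                                        # core position i -> i + 1
        b.append(nxt[0])                                    # last core position -> linker
        beta[t] = [x / scale[t + 1] for x in b]

    # Combine both passes; anything that is not linker counts as core.
    post = []
    for a, b in zip(alpha, beta):
        gamma = [x * y for x, y in zip(a, b)]
        post.append(1.0 - gamma[0] / sum(gamma))
    return post


if __name__ == "__main__":
    # One uninformative position (0.5) inside an otherwise well-supported core.
    track = [0.05] * 10 + [0.95, 0.95, 0.5, 0.95, 0.95] + [0.05] * 10
    for pos, p in enumerate(core_posterior(track)):
        print(pos, round(p, 3))
```

In this sketch, a position whose own score is uninformative can still receive a high posterior when its neighbors support a core, because the alternating core/linker grammar constrains which labelings are possible. Combining several evidence tracks would, under the same assumptions, amount to multiplying their per-position emission terms.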