Segmenting eukaryotic genomes with the generalized Gibbs sampler

Keith, Jonathan M. (2006) Segmenting eukaryotic genomes with the generalized Gibbs sampler. Journal of Computational Biology, 13 7: 1369-1383. doi:10.1089/cmb.2006.13.1369

Author Keith, Jonathan M.
Title Segmenting eukaryotic genomes with the generalized Gibbs sampler
Journal name Journal of Computational Biology
ISSN 1066-5277
Publication date 2006
Sub-type Article (original research)
DOI 10.1089/cmb.2006.13.1369
Volume 13
Issue 7
Start page 1369
End page 1383
Total pages 15
Editor D. M. Waterman
S. Istrail
Place of publication New Rochelle
Publisher Mary Ann Liebert Inc
Collection year 2006
Language eng
Subject 01 Mathematical Sciences
Abstract Eukaryotic genomes display segmental patterns of variation in various properties, including GC content and degree of evolutionary conservation. DNA segmentation algorithms are aimed at identifying statistically significant boundaries between such segments. Such algorithms may provide a means of discovering new classes of functional elements in eukaryotic genomes. This paper presents a model and an algorithm for Bayesian DNA segmentation and considers the feasibility of using it to segment whole eukaryotic genomes. The algorithm is tested on a range of simulated and real DNA sequences, and the following conclusions are drawn. Firstly, the algorithm correctly identifies non-segmented sequence, and can thus be used to reject the null hypothesis of uniformity in the property of interest. Secondly, estimates of the number and locations of change-points produced by the algorithm are robust to variations in algorithm parameters and initial starting conditions and correspond to real features in the data. Thirdly, the algorithm is successfully used to segment human chromosome 1 according to GC content, thus demonstrating the feasibility of Bayesian segmentation of eukaryotic genomes. The software described in this paper is available from the author's website ( to uqjkeith/) or upon request to the author.
Keyword Eukaryotic Genomes
Genome Segmentation
GC Content
Functional Non-coding RNA
Bayesian Modelling
Markov Chain Monte Carlo
Generalized Gibbs Sampler
Mathematics, Interdisciplinary Applications
Biochemical Research Methods
Biotechnology & Applied Microbiology
Computer Science, Interdisciplinary Applications
Statistics & Probability
DNA-sequence Segmentation
Isochore Chromosome Maps
Q-Index Code C1
Q-Index Status Provisional Code
Institutional Status Unknown

Document type: Journal Article
Collections: School of Mathematics and Physics
2007 Higher Education Research Data Collection
Citation counts: Thomson Reuters Web of Science: cited 18 times
Scopus: cited 22 times
Created: Wed, 15 Aug 2007, 09:12:53 EST