Bayesian hidden Markov model for DNA sequence segmentation: A prior sensitivity analysis

Nur, Darfiana, Allingham, David, Rousseau, Judith, Mengersen, Kerrie L. and McVinish, Ross (2009) Bayesian hidden Markov model for DNA sequence segmentation: A prior sensitivity analysis. Computational Statistics and Data Analysis, 53(5): 1873-1882. doi:10.1016/j.csda.2008.07.007

Author Nur, Darfiana
Allingham, David
Rousseau, Judith
Mengersen, Kerrie L.
McVinish, Ross
Title Bayesian hidden Markov model for DNA sequence segmentation: A prior sensitivity analysis
Journal name Computational Statistics and Data Analysis
ISSN 0167-9473
Publication date 2009-03-15
Year available 2008
Sub-type Article (original research)
DOI 10.1016/j.csda.2008.07.007
Volume 53
Issue 5
Start page 1873
End page 1882
Total pages 10
Editor David B. Allison
Peter M. Visscher
Guilherme J.M. Rosa
Christopher I. Amos
Place of publication Amsterdam
Publisher Elsevier B.V.
Language eng
Subject C1
970101 Expanding Knowledge in the Mathematical Sciences
010401 Applied Statistics
Formatted abstract
The sensitivity to the specification of the prior in a hidden Markov model describing homogeneous segments of DNA sequences is considered. An intron from the chimpanzee α-fetoprotein gene, which plays an important role in embryonic development in mammals, is analysed. Three main aims are considered: (i) to assess the sensitivity to prior specification in Bayesian hidden Markov models for DNA sequence segmentation; (ii) to examine the impact of replacing the standard Dirichlet prior with a mixture Dirichlet prior; and (iii) to propose and illustrate a more comprehensive approach to sensitivity analysis, using importance sampling. The analysis finds that (i) the posterior estimates obtained under a Bayesian hidden Markov model are indeed sensitive to the specification of the prior distributions; (ii) compared with the standard Dirichlet prior, the mixture Dirichlet prior is more flexible, less sensitive to the choice of hyperparameters and less constraining in the analysis, thus improving posterior estimates; and (iii) importance sampling is computationally feasible, fast and effective in allowing a richer sensitivity analysis.
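
The importance-sampling idea in aim (iii) can be illustrated with a minimal sketch. It is not the authors' implementation: it replaces the hidden Markov model with a toy conjugate Dirichlet-multinomial model for base composition, and the counts, hyperparameters and function names are all hypothetical. Draws from the posterior under one prior are reweighted by the ratio of the new prior density to the old prior density, which approximates the posterior under the new prior without rerunning the sampler.

```python
import math
import numpy as np


def dirichlet_logpdf(theta, alpha):
    """Log density of Dirichlet(alpha) evaluated at each row of theta."""
    alpha = np.asarray(alpha, dtype=float)
    log_norm = math.lgamma(alpha.sum()) - sum(math.lgamma(a) for a in alpha)
    return log_norm + ((alpha - 1.0) * np.log(theta)).sum(axis=1)


def reweight_to_new_prior(samples, alpha_old, alpha_new):
    """Self-normalised importance weights proportional to prior_new / prior_old."""
    log_w = dirichlet_logpdf(samples, alpha_new) - dirichlet_logpdf(samples, alpha_old)
    log_w -= log_w.max()  # stabilise before exponentiating
    w = np.exp(log_w)
    return w / w.sum()


rng = np.random.default_rng(0)

# Hypothetical base counts (A, C, G, T) for a single homogeneous segment.
counts = np.array([30, 20, 25, 25])
alpha_old = np.array([1.0, 1.0, 1.0, 1.0])  # prior used to generate the draws
alpha_new = np.array([2.0, 2.0, 2.0, 2.0])  # alternative prior to assess

# In this toy model the posterior is conjugate, so it can be sampled directly;
# in an HMM these draws would come from an MCMC run under the old prior.
samples = rng.dirichlet(alpha_old + counts, size=20000)

weights = reweight_to_new_prior(samples, alpha_old, alpha_new)
is_mean = weights @ samples  # posterior mean under the new prior, via reweighting
exact_mean = (alpha_new + counts) / (alpha_new + counts).sum()
ess = 1.0 / np.sum(weights ** 2)  # effective sample size of the reweighted draws

print("importance-sampling mean:", is_mean)
print("exact conjugate mean:    ", exact_mean)
print("effective sample size:   ", ess)
```

The effective sample size gives a rough check on whether the alternative prior is close enough to the original for the reweighting to be trusted; a small value suggests the sampler should be rerun under the new prior instead.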

Q-Index Code C1
Q-Index Status Provisional Code

Document type: Journal Article
Collection: School of Mathematics and Physics
Citation counts: Cited 8 times in Thomson Reuters Web of Science
Cited 10 times in Scopus
Created: Tue, 02 Mar 2010, 16:07:19 EST by Kay Mackie on behalf of School of Mathematics & Physics