Disease association tests by inferring ancestral haplotypes using a hidden markov model

Su, Shu-Yi, Balding, David J. and Coin, Lachlan J. M. (2008) Disease association tests by inferring ancestral haplotypes using a hidden markov model. Bioinformatics, 24(7): 972-978. doi:10.1093/bioinformatics/btn071

Author Su, Shu-Yi
Balding, David J.
Coin, Lachlan J. M.
Title Disease association tests by inferring ancestral haplotypes using a hidden markov model
Journal name Bioinformatics
ISSN 1367-4803
Publication date 2008-01-01
Year available 2008
Sub-type Article (original research)
DOI 10.1093/bioinformatics/btn071
Open Access Status Not yet assessed
Volume 24
Issue 7
Start page 972
End page 978
Total pages 7
Place of publication Oxford, United Kingdom
Publisher Oxford University Press
Language eng
Formatted abstract
Motivation: Most genome-wide association studies rely on single nucleotide polymorphism (SNP) analyses to identify causal loci. The increased stringency required for genome-wide analyses (with per-SNP significance threshold typically ≈ 10⁻⁷) means that many real signals will be missed. Thus it is still highly relevant to develop methods with improved power at low type I error. Haplotype-based methods provide a promising approach; however, they suffer from statistical problems such as abundance of rare haplotypes and ambiguity in defining haplotype block boundaries.

Results: We have developed an ancestral haplotype clustering (AncesHC) association method which addresses many of these problems. It can be applied to biallelic or multiallelic markers typed in haploid, diploid or multiploid organisms, and also handles missing genotypes. Our model is free from the assumption of a rigid block structure but recognizes a block-like structure if it exists in the data. We employ a Hidden Markov Model (HMM) to cluster the haplotypes into groups of predicted common ancestral origin. We then test each cluster for association with disease by comparing the numbers of cases and controls with 0, 1 and 2 chromosomes in the cluster. We demonstrate the power of this approach by simulation of case-control status under a range of disease models for 1500 outcrossed mice originating from eight inbred lines. Our results suggest that AncesHC has substantially more power than single-SNP analyses to detect disease association, and is also more powerful than the cladistic haplotype clustering method CLADHC.
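The per-cluster test described in the Results can be illustrated with a minimal sketch. It assumes that each individual has already been assigned a count (0, 1 or 2) of chromosomes falling in a given cluster, and it uses a Pearson chi-square test on the resulting 2x3 table. The abstract does not state which test statistic AncesHC uses, so the chi-square choice and the function name `cluster_association_test` are illustrative assumptions, not the authors' implementation.

```python
import math


def cluster_association_test(case_counts, control_counts):
    """Pearson chi-square test on a 2x3 table of cluster membership.

    case_counts / control_counts: numbers of cases and controls carrying
    0, 1 and 2 chromosomes in the cluster (hypothetical input layout).
    Returns (chi-square statistic, p-value with 2 degrees of freedom).
    """
    table = [list(case_counts), list(control_counts)]
    if any(len(row) != 3 for row in table):
        raise ValueError("each group needs counts for 0, 1 and 2 chromosomes")

    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        # An empty row or column changes the degrees of freedom;
        # this sketch does not handle that case.
        raise ValueError("every row and column must be non-empty")

    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / total
            stat += (table[i][j] - expected) ** 2 / expected

    # For 2 degrees of freedom the chi-square survival function
    # has the closed form exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Made-up counts, for illustration only.
stat, p_value = cluster_association_test([30, 50, 20], [50, 40, 10])
```

In a genome-wide setting the resulting p-value would be compared against a stringent threshold such as the ≈ 10⁻⁷ level mentioned in the Motivation, and with sparse tables an exact or permutation test may be preferable to the chi-square approximation.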
Keyword Biochemical Research Methods
Biotechnology & Applied Microbiology
Computer Science, Interdisciplinary Applications
Mathematical & Computational Biology
Statistics & Probability
Biochemistry & Molecular Biology
Computer Science
Q-Index Code C1
Q-Index Status Provisional Code
Institutional Status Non-UQ

Document type: Journal Article
Sub-type: Article (original research)
Collection: Institute for Molecular Bioscience - Publications
Citation counts: TR Web of Science Citation Count: Cited 16 times in Thomson Reuters Web of Science
Scopus Citation Count: Cited 19 times in Scopus
Created: Sat, 25 Jan 2014, 04:53:48 EST by System User on behalf of Institute for Molecular Bioscience