Calculation of IBD probabilities with dense SNP or sequence data

Keith, Jonathan M., McRae, Allan, Duffy, David, Mengersen, Kerrie and Visscher, Peter M. (2008) Calculation of IBD probabilities with dense SNP or sequence data. Genetic Epidemiology, 32(6): 513-519. doi:10.1002/gepi.20324

Author Keith, Jonathan M.
McRae, Allan
Duffy, David
Mengersen, Kerrie
Visscher, Peter M.
Title Calculation of IBD probabilities with dense SNP or sequence data
Journal name Genetic Epidemiology
ISSN 0741-0395
Publication date 2008-09-01
Year available 2008
Sub-type Article (original research)
DOI 10.1002/gepi.20324
Open Access Status Not yet assessed
Volume 32
Issue 6
Start page 513
End page 519
Total pages 7
Place of publication Hoboken, NJ, United States
Publisher John Wiley & Sons
Language eng
Abstract The probabilities that two individuals share 0, 1, or 2 alleles identical by descent (IBD) at a given genotyped marker locus are quantities of fundamental importance for disease gene and quantitative trait mapping and in family-based tests of association. Until recently, genotyped markers were sufficiently sparse that founder haplotypes could be modelled as having been drawn from a population in linkage equilibrium for the purpose of estimating IBD probabilities. However, with the advent of high-throughput single nucleotide polymorphism genotyping assays, this is no longer a reasonable assumption. Indeed, the imminent arrival of individual sequencing will enable high-density single nucleotide polymorphism genotyping on a scale for which current algorithms are not equipped. In this paper, we present a simple new model in which founder haplotypes are modelled as a Markov chain. Another important innovation is that genotyping errors are explicitly incorporated into the model. We compare results obtained using the new model to those obtained using the popular genetic linkage analysis package Merlin, with and without using the cluster model of linkage disequilibrium that is incorporated into that program. We find that the new model results in accuracy approaching that of Merlin with haplotype blocks, but achieves this with orders of magnitude faster run times. Moreover, the new algorithm scales linearly with number of markers, irrespective of density, whereas Merlin scales supralinearly. We also confirm a previous finding that ignoring linkage disequilibrium in founder haplotypes can cause errors in the calculation of IBD probabilities.
Keyword Identity by descent
Linkage disequilibrium
Single nucleotide polymorphism
Q-Index Code C1
Q-Index Status Provisional Code
Institutional Status Non-UQ

Document type: Journal Article
Collection: UQ Diamantina Institute Publications
Citation counts: Cited 4 times in Thomson Reuters Web of Science
Cited 4 times in Scopus
Created: Wed, 03 Apr 2013, 00:06:04 EST by Allan McRae on behalf of UQ Diamantina Institute