Inferring combined CNV/SNP haplotypes from genotype data

Su, Shu-Yi, Asher, Julian E., Jarvelin, Marjo-Riitta, Froguel, Philippe, Blakemore, Alexandra I. F., Balding, David J. and Coin, Lachlan J. M. (2010) Inferring combined CNV/SNP haplotypes from genotype data. Bioinformatics, 26(11): 1437-1445. doi:10.1093/bioinformatics/btq157

Author Su, Shu-Yi
Asher, Julian E.
Jarvelin, Marjo-Riitta
Froguel, Philippe
Blakemore, Alexandra I. F.
Balding, David J.
Coin, Lachlan J. M.
Title Inferring combined CNV/SNP haplotypes from genotype data
Journal name Bioinformatics
ISSN 1367-4803
Publication date 2010-04-20
Year available 2010
Sub-type Article (original research)
DOI 10.1093/bioinformatics/btq157
Open Access Status Not yet assessed
Volume 26
Issue 11
Start page 1437
End page 1445
Total pages 9
Place of publication Oxford, United Kingdom
Publisher Oxford University Press
Language eng
Formatted abstract
Motivation: Copy number variations (CNVs) are increasingly recognized as a substantial source of individual genetic variation, and hence there is growing interest in investigating the evolutionary history of CNVs as well as their impact on complex disease susceptibility. CNV/SNP haplotypes are critical for this research, but although many methods have been proposed for inferring integer copy number, few have been designed for inferring CNV haplotypic phase and none of these is applicable at genome-wide scale. Here, we present a method for inferring missing CNV genotypes, predicting CNV allelic configuration and inferring CNV haplotypic phase from SNP/CNV genotype data. Our method, implemented in the software polyHap v2.0, is based on a hidden Markov model, which models the joint haplotype structure between CNVs and SNPs. Thus, the haplotypic phase of CNVs and SNPs is inferred simultaneously. A sampling algorithm is employed to obtain a measure of confidence/credibility of each estimate.

Results: We generated diploid phase-known CNV-SNP genotype datasets by pairing male X chromosome CNV-SNP haplotypes. We show that polyHap provides accurate estimates of missing CNV genotypes, allelic configuration and CNV haplotypic phase on these datasets. We also applied our method to a non-simulated dataset: a region on chromosome 2 encompassing a short deletion. The results confirm that polyHap's accuracy extends to real-life datasets.
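
The construction of phase-known test data mentioned in the Results can be illustrated with a minimal sketch. It is not taken from polyHap: the function name, the allele encoding (an empty string for a deleted copy, "A" or "B" for a single copy, "AA" for a duplicated copy) and the choice to pair consecutive haplotypes are all assumptions made for illustration. Because each male carries a single X chromosome, his haplotype is observed directly, so combining two of them gives a diploid genotype whose true phase is known.

```python
def pair_haplotypes(haplotypes):
    """Pair haploid haplotypes into pseudo-diploid individuals with known phase.

    Hypothetical encoding: each haplotype is a tuple with one string per site,
    "" for a deleted copy, "A"/"B" for one copy, "AA"/"AB"/... for extra copies.
    """
    individuals = []
    # Assumption: pair consecutive haplotypes (0 with 1, 2 with 3, ...).
    for i in range(0, len(haplotypes) - 1, 2):
        h1, h2 = haplotypes[i], haplotypes[i + 1]
        # Unphased genotype: per site, the sorted pool of alleles from both
        # haplotypes; its length is the total copy number at that site.
        genotype = tuple("".join(sorted(a + b)) for a, b in zip(h1, h2))
        individuals.append({"phase": (h1, h2), "genotype": genotype})
    return individuals


male_x = [
    ("A", "", "B"),    # deletion at the second site
    ("B", "A", "B"),
    ("AA", "B", "A"),  # duplication at the first site
    ("A", "B", ""),
]
paired = pair_haplotypes(male_x)
```

A phasing method would be given only the `genotype` field of each entry, and its output could then be scored against the stored `phase`.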
Keyword Biochemical Research Methods
Biotechnology & Applied Microbiology
Computer Science, Interdisciplinary Applications
Mathematical & Computational Biology
Statistics & Probability
Biochemistry & Molecular Biology
Computer Science
Q-Index Code C1
Q-Index Status Provisional Code
Grant ID 104781
Institutional Status Non-UQ

Document type: Journal Article
Sub-type: Article (original research)
Collection: Institute for Molecular Bioscience - Publications
Citation counts: Cited 24 times in Thomson Reuters Web of Science
Cited 25 times in Scopus
Created: Sat, 25 Jan 2014, 04:48:59 EST by System User on behalf of Institute for Molecular Bioscience