Imputation of genotypes from low density (50,000 markers) to high density (700,000 markers) of cows from research herds in Europe, North America, and Australasia using 2 reference populations

Pryce, J. E., Johnston, J., Hayes, B. J., Sahana, G., Weigel, K. A., McParland, S., Spurlock, D., Krattenmacher, N., Spelman, R. J., Wall, E. and Calus, M. P. L. (2014) Imputation of genotypes from low density (50,000 markers) to high density (700,000 markers) of cows from research herds in Europe, North America, and Australasia using 2 reference populations. Journal of Dairy Science, 97(3): 1799-1811. doi:10.3168/jds.2013-7368


Author Pryce, J. E.
Johnston, J.
Hayes, B. J.
Sahana, G.
Weigel, K. A.
McParland, S.
Spurlock, D.
Krattenmacher, N.
Spelman, R. J.
Wall, E.
Calus, M. P. L.
Title Imputation of genotypes from low density (50,000 markers) to high density (700,000 markers) of cows from research herds in Europe, North America, and Australasia using 2 reference populations
Journal name Journal of Dairy Science
ISSN 0022-0302
1525-3198
Publication date 2014-03
Sub-type Article (original research)
DOI 10.3168/jds.2013-7368
Open Access Status Not yet assessed
Volume 97
Issue 3
Start page 1799
End page 1811
Total pages 13
Place of publication New York, United States
Publisher Elsevier
Language eng
Abstract Combining data from research herds may be advantageous, especially for difficult or expensive-to-measure traits (such as dry matter intake). Cows in research herds are often genotyped using low-density single nucleotide polymorphism (SNP) panels. However, the precision of quantitative trait loci detection in genome-wide association studies and the accuracy of genomic selection may increase when the low-density genotypes are imputed to higher density. Genotype data were available from 10 research herds: 5 from Europe [Denmark, Germany, Ireland, the Netherlands, and the United Kingdom (UK)], 2 from Australasia (Australia and New Zealand), and 3 from North America (Canada and the United States). Heifers from the Australian and New Zealand research herds were already genotyped at high density (approximately 700,000 SNP). The remaining genotypes were imputed from around 50,000 SNP to 700,000 using 2 reference populations. Although it was not possible to use a combined reference population, which would probably result in the highest accuracies of imputation, differences arising from using 2 high-density reference populations on imputing 50,000-marker genotypes of 583 animals (from the UK) were quantified. The European genotypes (n = 4,097) were imputed as 1 data set, using a reference population of 3,150 that included genotypes from 835 Australian and 1,053 New Zealand females, with the remainder being males. Imputation was undertaken using population-wide linkage disequilibrium with no family information exploited. The UK animals were also included in the North American data set (n = 1,579) that was imputed to high density using a reference population of 2,018 bulls. After editing, 591,213 genotypes on 5,999 animals from 10 research herds remained. The correlation between imputed allele frequencies of the 2 imputed data sets was high (>0.98) and even stronger (>0.99) for the UK animals that were part of each imputation data set.
For the UK genotypes, 2.2% were imputed differently in the 2 high-density reference data sets used. Only 0.025% of these were homozygous switches. The number of discordant SNP was lower for animals that had sires that were genotyped. Discordant imputed SNP genotypes were most common when a large difference existed in allele frequency between the 2 imputed genotype data sets. For SNP that had ≥20% discordant genotypes, the difference between imputed data sets of allele frequencies of the UK (imputed) genotypes was 0.07, whereas the difference in allele frequencies of the (reference) high-density genotypes was 0.30. In fact, regions existed across the genome where the frequency of discordant SNP was higher. For example, on chromosome 10 (centered on 520,948 bp), 52 SNP (out of a total of 103 SNP) had ≥20% discordant SNP. Four hundred and eight SNP had more than 20% discordant genotypes and were removed from the final set of imputed genotypes. We concluded that both discordance of imputed SNP genotypes and differences in allele frequencies, after imputation using different reference data sets, may be used to identify and remove poorly imputed SNP.
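The discordance check described in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: it assumes genotypes coded as 0, 1, or 2 copies of one allele, the same animals in the same order in both imputed data sets, and a removal threshold of 20% discordant genotypes per SNP. All function names and the toy data are hypothetical.

```python
# Minimal sketch (assumptions: 0/1/2 genotype coding, rows = animals,
# columns = SNP, identical animal order in both imputed data sets).


def per_snp_discordance(set_a, set_b):
    """Fraction of animals whose imputed genotype differs, for each SNP."""
    n_animals = len(set_a)
    n_snp = len(set_a[0])
    rates = []
    for j in range(n_snp):
        mismatches = sum(1 for i in range(n_animals) if set_a[i][j] != set_b[i][j])
        rates.append(mismatches / n_animals)
    return rates


def homozygous_switch_rate(set_a, set_b):
    """Fraction of all genotypes imputed as opposite homozygotes (0 vs 2)."""
    total = 0
    switches = 0
    for row_a, row_b in zip(set_a, set_b):
        for a, b in zip(row_a, row_b):
            total += 1
            if abs(a - b) == 2:
                switches += 1
    return switches / total


def allele_frequencies(genotypes):
    """Frequency of the counted allele at each SNP."""
    n_animals = len(genotypes)
    n_snp = len(genotypes[0])
    return [sum(row[j] for row in genotypes) / (2 * n_animals) for j in range(n_snp)]


def flag_poorly_imputed(set_a, set_b, max_discordance=0.20):
    """Indices of SNP at or above the assumed discordance threshold."""
    rates = per_snp_discordance(set_a, set_b)
    return [j for j, rate in enumerate(rates) if rate >= max_discordance]


# Toy data: 5 animals x 3 SNP, imputed with two hypothetical reference sets.
imputed_a = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [0, 2, 2],
    [1, 1, 0],
]
imputed_b = [
    [0, 1, 0],  # third SNP: 2 vs 0, a homozygous switch
    [1, 1, 0],
    [2, 0, 2],  # third SNP: 1 vs 2, an ordinary mismatch
    [0, 2, 2],
    [1, 1, 0],
]

print(per_snp_discordance(imputed_a, imputed_b))
print(flag_poorly_imputed(imputed_a, imputed_b))
```

In this toy example only the third SNP would be flagged for removal. Comparing allele_frequencies(imputed_a) with allele_frequencies(imputed_b) gives the second diagnostic the abstract mentions, the between-data-set difference in allele frequency.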
Keyword High-density genotyping
Imputation
Q-Index Code C1
Q-Index Status Provisional Code
Institutional Status Non-UQ

Document type: Journal Article
Sub-type: Article (original research)
Collection: Queensland Alliance for Agriculture and Food Innovation
 
Citation counts: Cited 12 times in Thomson Reuters Web of Science
Cited 12 times in Scopus
Created: Fri, 05 Aug 2016, 10:15:31 EST by System User on behalf of Learning and Research Services (UQ Library)