MultiPhen: Joint model of multiple phenotypes can increase discovery in GWAS

O'Reilly, Paul F., Hoggart, Clive J., Pomyen, Yotsawat, Calboli, Federico C. F., Elliott, Paul, Jarvelin, Marjo-Riitta and Coin, Lachlan J. M. (2012) MultiPhen: Joint model of multiple phenotypes can increase discovery in GWAS. PLoS ONE, 7(5): e34861.1-e34861.12. doi:10.1371/journal.pone.0034861


Author O'Reilly, Paul F.
Hoggart, Clive J.
Pomyen, Yotsawat
Calboli, Federico C. F.
Elliott, Paul
Jarvelin, Marjo-Riitta
Coin, Lachlan J. M.
Title MultiPhen: Joint model of multiple phenotypes can increase discovery in GWAS
Journal name PLoS ONE
ISSN 1932-6203
Publication date 2012-05
Year available 2012
Sub-type Article (original research)
DOI 10.1371/journal.pone.0034861
Open Access Status DOI
Volume 7
Issue 5
Start page e34861.1
End page e34861.12
Total pages 12
Place of publication San Francisco, United States
Publisher Public Library of Science (PLoS)
Language eng
Abstract The genome-wide association study (GWAS) approach has discovered hundreds of genetic variants associated with diseases and quantitative traits. However, despite clinical overlap and statistical correlation between many phenotypes, GWAS are generally performed one-phenotype-at-a-time. Here we compare the performance of modelling multiple phenotypes jointly with that of the standard univariate approach. We introduce a new method and software, MultiPhen, that models multiple phenotypes simultaneously in a fast and interpretable way. By performing ordinal regression, MultiPhen tests the linear combination of phenotypes most associated with the genotypes at each SNP, and thus potentially captures effects hidden to single phenotype GWAS. We demonstrate via simulation that this approach provides a dramatic increase in power in many scenarios. There is a boost in power for variants that affect multiple phenotypes and for those that affect only one phenotype. While other multivariate methods have similar power gains, we describe several benefits of MultiPhen over these. In particular, we demonstrate that other multivariate methods that assume the genotypes are normally distributed, such as canonical correlation analysis (CCA) and MANOVA, can have highly inflated type-1 error rates when testing case-control or non-normal continuous phenotypes, while MultiPhen produces no such inflation. To test the performance of MultiPhen on real data we applied it to lipid traits in the Northern Finland Birth Cohort 1966 (NFBC1966). In these data MultiPhen discovers 21% more independent SNPs with known associations than the standard univariate GWAS approach, while applying MultiPhen in addition to the standard approach provides 37% increased discovery. The most associated linear combinations of the lipids estimated by MultiPhen at the leading SNPs accurately reflect the Friedewald Formula, suggesting that MultiPhen could be used to refine the definition of existing phenotypes or uncover novel heritable phenotypes.
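Note on the model The abstract describes regressing the genotype on the phenotypes by ordinal regression. A minimal sketch of one common form of such a model (proportional odds) is given below; the notation is an assumption for illustration and is not taken from this record: G_i is the minor-allele count (0, 1 or 2) of individual i at a SNP, Y_{ij} is phenotype j of individual i, and p is the number of phenotypes. The Friedewald formula mentioned in the abstract is included in its standard mg/dL form for reference.

```latex
% Proportional odds model with genotype as the ordinal outcome
% (illustrative notation, not quoted from the paper)
\operatorname{logit} \Pr(G_i \le k)
  = \alpha_k - \sum_{j=1}^{p} \beta_j Y_{ij},
  \qquad k \in \{0, 1\}

% Joint null hypothesis of no association with any phenotype,
% which could be assessed with a likelihood ratio test on p degrees of freedom
H_0 : \beta_1 = \beta_2 = \dots = \beta_p = 0

% Friedewald formula (all concentrations in mg/dL)
\mathrm{LDL} = \mathrm{TC} - \mathrm{HDL} - \frac{\mathrm{TG}}{5}
```

Under this sketch, the fitted coefficients \beta_j define the linear combination of phenotypes most associated with the genotype, which is the quantity the abstract compares with the Friedewald formula for the lipid traits.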
Q-Index Code C1
Q-Index Status Provisional Code
Institutional Status Non-UQ

Document type: Journal Article
Sub-type: Article (original research)
Collection: Institute for Molecular Bioscience - Publications
 
Citation counts: Cited 70 times in Thomson Reuters Web of Science
Cited 77 times in Scopus
Created: Fri, 24 Jan 2014, 19:01:39 EST by System User on behalf of Institute for Molecular Bioscience