Inference on differences between classes using cluster-specific contrasts of mixed effects

Ng, Shu Kay, McLachlan, Geoffrey J., Wang, Kui, Nagymanyoki, Zoltan, Liu, Shubai and Ng, Shu-Wing (2015) Inference on differences between classes using cluster-specific contrasts of mixed effects. Biostatistics, 16(1): 98-112. doi:10.1093/biostatistics/kxu028

Author Ng, Shu Kay
McLachlan, Geoffrey J.
Wang, Kui
Nagymanyoki, Zoltan
Liu, Shubai
Ng, Shu-Wing
Title Inference on differences between classes using cluster-specific contrasts of mixed effects
Journal name Biostatistics
ISSN 1465-4644
Publication date 2015-01
Year available 2014
Sub-type Article (original research)
DOI 10.1093/biostatistics/kxu028
Volume 16
Issue 1
Start page 98
End page 112
Total pages 15
Place of publication Oxford, United Kingdom
Publisher Oxford University Press
Collection year 2015
Language eng
Subject 2700 Medicine
2613 Statistics and Probability
1804 Statistics, Probability and Uncertainty
Formatted abstract
The detection of differentially expressed (DE) genes, that is, genes whose expression levels vary between two or more classes representing different experimental conditions (say, diseases), is one of the most commonly studied problems in bioinformatics. For example, the identification of DE genes between distinct disease phenotypes is an important first step in understanding the disease and developing drugs to treat it. We present a novel approach to the problem of detecting DE genes that is based on a test statistic formed as a weighted (normalized) cluster-specific contrast in the mixed effects of the mixture model used in the first instance to cluster the gene profiles into a manageable number of clusters. The key factor in the formation of our test statistic is the use of gene-specific mixed effects in the cluster-specific contrast. Consequently, the (soft) assignment of a given gene to a cluster is not crucial. This is because, in addition to class differences between the (estimated) fixed-effects terms for a cluster, gene-specific class differences also enter the cluster-specific contributions to the final form of the test statistic. The proposed test statistic can be used where the primary aim is to rank the genes in order of evidence against the null hypothesis of no DE. We also show how a P-value can be calculated for each gene for use in multiple hypothesis testing where the intent is to control the false discovery rate (FDR) at some desired level. With the use of publicly available and simulated datasets, we show that the proposed contrast-based approach outperforms other methods commonly used for the detection of DE genes, both in a ranking context, with a lower proportion of false discoveries, and in a multiple hypothesis testing context, with higher power for a specified level of the FDR.
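Illustrative note (not part of the published abstract): the abstract refers to controlling the FDR from per-gene P-values but does not say which procedure is used. The sketch below assumes the standard Benjamini-Hochberg step-up rule purely as an example; the function name `benjamini_hochberg`, the `fdr_level` parameter, and the P-values are hypothetical and are not taken from the paper.

```python
import numpy as np

def benjamini_hochberg(p_values, fdr_level=0.05):
    """Flag hypotheses rejected by the Benjamini-Hochberg step-up rule.

    Assumption: BH is used here only as a familiar FDR-controlling
    procedure; the paper may use a different one.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                       # genes ranked by evidence against the null
    thresholds = fdr_level * np.arange(1, m + 1) / m
    below = p[order] <= thresholds              # compare i-th smallest P-value with i*alpha/m
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()          # largest rank that meets its threshold
        rejected[order[:k + 1]] = True          # reject every hypothesis up to that rank
    return rejected

# Made-up P-values for ten hypothetical genes.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(p_vals, fdr_level=0.05)
ranking = np.argsort(p_vals)                    # gene indices, strongest evidence first
```

Here `flags` marks the genes declared DE at the chosen FDR level, and `ranking` gives the ordering that would be used when the aim is only to rank genes.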
Keyword Contrast
Differential expression
Mixture model
Random effects modeling
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ
Additional Notes Published online ahead of print 23 June 2014.

Document type: Journal Article
Collections: School of Mathematics and Physics
Official 2015 Collection
Citation counts: Thomson Reuters Web of Science: cited 1 time
Scopus: cited 1 time
Created: Sun, 22 Feb 2015, 00:45:37 EST by System User on behalf of School of Mathematics & Physics