Clustering of time-course gene expression profiles using normal mixture models with autoregressive random effects

Wang, Kui, Ng, Shu Kay and McLachlan, Geoffrey J. (2012) Clustering of time-course gene expression profiles using normal mixture models with autoregressive random effects. BMC Bioinformatics, 13(1): 300.1-300.14. doi:10.1186/1471-2105-13-300

Author Wang, Kui
Ng, Shu Kay
McLachlan, Geoffrey J.
Title Clustering of time-course gene expression profiles using normal mixture models with autoregressive random effects
Journal name BMC Bioinformatics
ISSN 1471-2105
Publication date 2012-11
Year available 2012
Sub-type Article (original research)
DOI 10.1186/1471-2105-13-300
Open Access Status DOI
Volume 13
Issue 1
Start page 300.1
End page 300.14
Total pages 14
Place of publication London, United Kingdom
Publisher BioMed Central
Collection year 2013
Language eng
Formatted abstract
Background: Time-course gene expression data, such as yeast cell cycle data, may be periodically expressed. To cluster such data, the Fourier series approximations of periodic gene expression currently in use have been found inadequate to model the complexity of time-course data, partly because they ignore the dependence between expression measurements over time and the correlation among gene expression profiles. We further investigate the advantages and limitations of models available in the literature and propose a new mixture model with first-order autoregressive random effects for clustering time-course gene expression profiles. Simulations and real examples are given to demonstrate the usefulness of the proposed models.
Results: We illustrate the applicability of our new model using synthetic and real time-course datasets. We show that our model outperforms existing models, providing more reliable and robust clustering of time-course data. It provides superior results when gene profiles are correlated and comparable results when the correlation between gene profiles is weak. In applications to real time-course data, relevant clusters of coregulated genes are obtained, which are supported by gene-function annotation databases.
Conclusions: Our new model, under our extension of the EMMIX-WIRE procedure, is more reliable and robust for clustering time-course data because it adopts a random effects model that allows for correlation among observations at different time points. It postulates gene-specific random effects with an autocorrelation variance structure that models coregulation within the clusters. The developed R package is flexible in its specification of the random effects through user-input parameters, enabling improved modelling and consequent clustering of time-course data.
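As a rough illustration of what a first-order autoregressive variance structure looks like, the following minimal Python sketch builds the covariance matrix of a stationary AR(1) process. It assumes equally spaced time points and an innovation variance `sigma2`; the function name `ar1_covariance` and its parameters are hypothetical and are not taken from the paper or from the R package, whose exact parameterisation may differ.

```python
def ar1_covariance(n_times, rho, sigma2):
    """Covariance matrix of a stationary AR(1) process at n_times
    equally spaced time points (illustrative assumption only).

    The process is u_t = rho * u_{t-1} + e_t with Var(e_t) = sigma2,
    so Cov(u_s, u_t) = sigma2 / (1 - rho^2) * rho^|s - t|.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    if sigma2 <= 0:
        raise ValueError("sigma2 must be positive")
    # Marginal variance of the stationary process.
    scale = sigma2 / (1.0 - rho ** 2)
    # Correlation decays geometrically with the time lag |i - j|.
    return [
        [scale * rho ** abs(i - j) for j in range(n_times)]
        for i in range(n_times)
    ]


if __name__ == "__main__":
    for row in ar1_covariance(4, 0.5, 1.0):
        print(["%.3f" % value for value in row])
```

The geometric decay in the off-diagonal entries is what lets such a structure represent dependence between measurements taken at nearby time points.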
Keyword Time course data
Mixtures of linear mixed models
Autoregressive random effects
EMMIX WIRE procedure
Course Microarray Experiments
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ

Document type: Journal Article
Collections: School of Mathematics and Physics
Official 2013 Collection
Citation counts: Thomson Reuters Web of Science: cited 3 times
Scopus: cited 5 times
Created: Sun, 17 Mar 2013, 00:55:24 EST by System User on behalf of School of Mathematics & Physics