EMMIXuskew: An R package for Fitting Mixtures of Multivariate Skew t distributions via the EM algorithm

Lee S.X. and McLachlan G.J. (2013) EMMIXuskew: An R package for Fitting Mixtures of Multivariate Skew t distributions via the EM algorithm. Journal of Statistical Software, 55(12): 1-22.

Attached Files
Name Description MIMEType Size Downloads
UQ319142_OA.pdf Full text (open access) application/pdf 833.39KB 53
Author Lee S.X.
McLachlan G.J.
Title EMMIXuskew: An R package for Fitting Mixtures of Multivariate Skew t distributions via the EM algorithm
Journal name Journal of Statistical Software
ISSN 1548-7660
Publication date 2013-11
Year available 2013
Sub-type Article (original research)
Open Access Status File (Publisher version)
Volume 55
Issue 12
Start page 1
End page 22
Total pages 22
Place of publication Alexandria, United States
Publisher American Statistical Association
Collection year 2014
Language eng
Subject 1712 Software
2613 Statistics and Probability
1804 Statistics, Probability and Uncertainty
Formatted abstract
This paper describes an algorithm for fitting finite mixtures of unrestricted Multivariate Skew t (FM-uMST) distributions. The package EMMIXuskew implements a closed-form expectation-maximization (EM) algorithm for computing the maximum likelihood (ML) estimates of the parameters for the (unrestricted) FM-MST model in R. EMMIXuskew also supports visualization of fitted contours in two and three dimensions, and random sample generation from a specified FM-uMST distribution.
Finite mixtures of skew t distributions have proven to be useful in modelling heterogeneous data with asymmetric and heavy tail behaviour, for example, datasets from flow cytometry. In recent years, various versions of mixtures with multivariate skew t (MST) distributions have been proposed. However, these models adopted some restricted characterizations of the component MST distributions so that the E-step of the EM algorithm can be evaluated in closed form. This paper focuses on mixtures with unrestricted MST components, and describes an iterative algorithm for the computation of the ML estimates of its model parameters. Its implementation in R is presented with the package EMMIXuskew.
The usefulness of the proposed algorithm is demonstrated in three applications to real datasets. The first example illustrates the use of the main function fmmst in the package by fitting an MST distribution to a bivariate unimodal flow cytometric sample. The second example fits a mixture of MST distributions to the Australian Institute of Sport (AIS) data, and demonstrates that EMMIXuskew can provide better clustering results than mixtures with restricted MST components. In the third example, EMMIXuskew is applied to classify cells in a trivariate flow cytometric dataset. Comparisons with some other available methods suggest that EMMIXuskew achieves a lower misclassification rate with respect to the labels given by benchmark gating analysis.
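The abstract mentions random sample generation from a specified FM-uMST distribution. As a rough illustration of what that involves, the sketch below draws from a mixture of unrestricted skew t components using the convolution-type stochastic representation Y = mu + (Delta|U0| + U1) / sqrt(W), with U0 ~ N(0, I), U1 ~ N(0, Sigma), W ~ Gamma(nu/2, rate nu/2) and Delta = diag(delta). This representation is assumed here and is not stated in this record. The code is plain Python, not the EMMIXuskew API, and the names `cholesky`, `sample_umst` and `sample_mixture` are hypothetical.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L L^T = a, for a small symmetric positive-definite matrix."""
    p = len(a)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def sample_umst(mu, sigma, delta, nu, rng):
    """One draw via the assumed representation Y = mu + (Delta|U0| + U1) / sqrt(W)."""
    p = len(mu)
    L = cholesky(sigma)
    z = [rng.gauss(0.0, 1.0) for _ in range(p)]
    # U1 ~ N(0, sigma), built from independent standard normals
    u1 = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(p)]
    # |U0|: one independent half-normal per dimension (the "unrestricted" part)
    u0 = [abs(rng.gauss(0.0, 1.0)) for _ in range(p)]
    # W ~ Gamma(shape nu/2, rate nu/2); gammavariate takes (shape, scale)
    w = rng.gammavariate(nu / 2.0, 2.0 / nu)
    return [mu[i] + (delta[i] * u0[i] + u1[i]) / math.sqrt(w) for i in range(p)]


def sample_mixture(n, components, weights, seed=0):
    """Draw n points; each component is a tuple (mu, sigma, delta, nu)."""
    rng = random.Random(seed)
    draws, labels = [], []
    for _ in range(n):
        k = rng.choices(range(len(components)), weights=weights)[0]
        draws.append(sample_umst(*components[k], rng))
        labels.append(k)
    return draws, labels


if __name__ == "__main__":
    # Illustrative parameter values only, not taken from the paper
    comps = [
        ([0.0, 0.0], [[1.0, 0.3], [0.3, 1.0]], [2.0, -1.0], 5.0),
        ([6.0, 6.0], [[2.0, 0.0], [0.0, 0.5]], [-1.0, 3.0], 10.0),
    ]
    points, labels = sample_mixture(5, comps, [0.4, 0.6], seed=42)
    for y, k in zip(points, labels):
        print(k, [round(v, 3) for v in y])
```

A restricted skew t component would, under the same assumptions, share a single scalar half-normal across all dimensions instead of the vector |U0|; the per-dimension vector is what makes the E-step harder and motivates the algorithm described in the paper.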
Keyword EM algorithm
Flow cytometry
Mixture models
Multivariate t distribution
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ

Document type: Journal Article
Collections: School of Mathematics and Physics
Official 2014 Collection
 
Citation counts: Thomson Reuters Web of Science: cited 3 times
Scopus: cited 0 times
Created: Tue, 10 Dec 2013, 01:08:13 EST by System User on behalf of School of Mathematics & Physics