Clustering of high-dimensional data via finite mixture models

McLachlan, Geoff J. and Baek, Jangsun (2010). Clustering of high-dimensional data via finite mixture models. In Andreas Fink, Berthold Lausen, Wilfried Seidel and Alfred Ultsch (Ed.), Advances in Data Analysis, Business Intelligence: Proceedings of the 32nd Annual Conference of the Gesellschaft für Klassifikation e.V., Joint Conference with the British Classification Society (BCS) and the Dutch/Flemish Classification Society (VOC), Helmut-Schmidt-University, Hamburg, July 16–18, 2008 (pp. 33-44). Heidelberg, Germany: Springer-Verlag. doi:10.1007/978-3-642-01044-6


Author McLachlan, Geoff J.
Baek, Jangsun
Title of chapter Clustering of high-dimensional data via finite mixture models
Title of book Advances in Data Analysis, Business Intelligence: Proceedings of the 32nd Annual Conference of the Gesellschaft für Klassifikation e.V., Joint Conference with the British Classification Society (BCS) and the Dutch/Flemish Classification Society (VOC), Helmut-Schmidt-University, Hamburg, July 16–18, 2008
Place of Publication Heidelberg, Germany
Publisher Springer-Verlag
Publication Year 2010
Sub-type Research book chapter (original research)
DOI 10.1007/978-3-642-01044-6
Series Studies in Classification, Data Analysis, and Knowledge Organization
ISBN 9783642010439
9783642010446
Editor Andreas Fink
Berthold Lausen
Wilfried Seidel
Alfred Ultsch
Chapter number 1.3
Start page 33
End page 44
Total pages 12
Total chapters 12
Collection year 2011
Language eng
Abstract/Summary Finite mixture models are being commonly used in a wide range of applications in practice concerning density estimation and clustering. An attractive feature of this approach to clustering is that it provides a sound statistical framework in which to assess the important question of how many clusters there are in the data and their validity. We review the application of normal mixture models to high-dimensional data of a continuous nature. One way to handle the fitting of normal mixture models is to adopt mixtures of factor analyzers. They enable model-based density estimation and clustering to be undertaken for high-dimensional data, where the number of observations n is not very large relative to their dimension p. In practice, there is often the need to reduce further the number of parameters in the specification of the component-covariance matrices. We focus here on a new modified approach that uses common component-factor loadings, which considerably reduces further the number of parameters. Moreover, it allows the data to be displayed in low-dimensional plots.
Keyword Common factor analyzers
Mixtures of factor analyzers
Model-based clustering
Normal mixture densities
Q-Index Code B1
Q-Index Status Confirmed Code
Institutional Status UQ
Additional Notes Proceedings of the 32nd Annual Conference of the Gesellschaft für Klassifikation e.V., Joint Conference with the British Classification Society (BCS) and the Dutch/Flemish Classification Society (VOC), Helmut-Schmidt-University, Hamburg, July 16-18, 2008.

Document type: Book Chapter
Collections: School of Mathematics and Physics
Official 2011 Collection
 
Created: Fri, 18 Feb 2011, 16:48:03 EST by Kay Mackie on behalf of School of Mathematics & Physics