Using evidence of mixed populations to select variables for clustering very high-dimensional data

Chan, Yao-ban and Hall, Peter (2010) Using evidence of mixed populations to select variables for clustering very high-dimensional data. Journal of the American Statistical Association, 105(490): 798-809. doi:10.1198/jasa.2010.tm09404


Author Chan, Yao-ban
Hall, Peter
Title Using evidence of mixed populations to select variables for clustering very high-dimensional data
Journal name Journal of the American Statistical Association
ISSN 0162-1459
1537-274X
Publication date 2010-06
Year available 2010
Sub-type Article (original research)
DOI 10.1198/jasa.2010.tm09404
Volume 105
Issue 490
Start page 798
End page 809
Total pages 12
Place of publication Alexandria, VA United States
Publisher American Statistical Association
Collection year 2011
Language eng
Formatted abstract
In this paper we develop a nonparametric approach to clustering very high-dimensional data, designed particularly for problems where the mixture nature of a population is expressed through multimodality of its density. Therefore, a technique based implicitly on mode testing can be particularly effective. In principle, several alternative approaches could be used to assess the extent of multimodality, but in the present problem the excess mass method has important advantages. We show that the resulting methodology for determining clusters is particularly effective in cases where the data are relatively heavy tailed or show a moderate to high degree of correlation, or when the number of important components is relatively small. Conversely, in the case of light-tailed, almost-independent components when there are many clusters, clustering in terms of modality can be less reliable than more conventional approaches. This article has supplementary material online.
Keyword Bandwidth test
Bootstrap
Density estimation
Excess mass
Mode test
Multimodality
Q-Index Code C1
Q-Index Status Provisional Code
Institutional Status Non-UQ

Document type: Journal Article
Collection: School of Mathematics and Physics
 
Citation counts: Thomson Reuters Web of Science: cited 5 times
Scopus: cited 8 times
Created: Thu, 12 Sep 2013, 14:49:58 EST by Kay Mackie on behalf of School of Mathematics & Physics