Self-taught dimensionality reduction on the high-dimensional small-sized data

Zhu, Xiaofeng, Huang, Zi, Yang, Yang, Shen, Heng Tao, Xu, Changsheng and Luo, Jiebo (2013) Self-taught dimensionality reduction on the high-dimensional small-sized data. Pattern Recognition, 46 1: 215-229. doi:10.1016/j.patcog.2012.07.018


Author Zhu, Xiaofeng
Huang, Zi
Yang, Yang
Shen, Heng Tao
Xu, Changsheng
Luo, Jiebo
Title Self-taught dimensionality reduction on the high-dimensional small-sized data
Journal name Pattern Recognition
ISSN 0031-3203 (print)
1873-5142 (online)
Publication date 2013-01
Year available 2012
Sub-type Article (original research)
DOI 10.1016/j.patcog.2012.07.018
Volume 46
Issue 1
Start page 215
End page 229
Total pages 15
Place of publication Oxford, United Kingdom
Publisher Pergamon
Collection year 2013
Language eng
Abstract Building an effective dimensionality reduction model usually requires sufficient data; otherwise, traditional dimensionality reduction methods can be much less effective. However, sufficient data cannot always be guaranteed in real applications. In this paper we focus on unsupervised dimensionality reduction for high-dimensional and small-sized data, in which the dimensionality of the target data is high while the number of target samples is small. To handle this problem, we propose a novel Self-taught Dimensionality Reduction (STDR) approach, which transfers external knowledge (or information) from freely available external (or auxiliary) data to the high-dimensional and small-sized target data. The proposed STDR consists of three steps. First, bases are learnt from sufficient external data, which may come from the same type or modality as the target data; these bases are the part shared between the external and target data, i.e., the external knowledge (or information). Second, the target data are reconstructed from the learnt bases via a novel joint graph sparse coding model, which not only provides robust reconstruction but also preserves the local structure among the target data in the original space; this step transfers the external knowledge (i.e., the learnt bases) to the target data. Moreover, the proposed solver is theoretically guaranteed to converge to the global optimum of the model's objective function. The target data are thereby mapped into the learnt basis space and sparsely represented, i.e., represented by only part of the bases. Third, the sparse features (that is, the rows with zero or near-zero values) of the new representations of the target data are deleted for both effectiveness and efficiency; in other words, this step performs feature selection on the new representations. Finally, experimental results on various types of datasets show that the proposed STDR outperforms state-of-the-art algorithms in terms of k-means clustering performance.
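A minimal sketch of the three STDR steps described in the abstract, assuming scikit-learn: plain lasso-based dictionary learning and sparse coding stand in for the paper's joint graph sparse coding model (so the local-structure-preserving graph term is omitted), and the data arrays and parameter values are illustrative placeholders, not the paper's settings.

import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
external_X = rng.standard_normal((1000, 200))  # plentiful external (auxiliary) data
target_X = rng.standard_normal((30, 200))      # high-dimensional, small-sized target data

# Step 1: learn bases (a dictionary) from the sufficient external data.
dict_learner = DictionaryLearning(
    n_components=50,
    transform_algorithm="lasso_lars",
    transform_alpha=0.1,  # sparsity level; illustrative, not from the paper
    random_state=0,
)
dict_learner.fit(external_X)

# Step 2: sparsely reconstruct the target data with the learnt bases.
# (The paper's joint graph sparse coding additionally preserves local
# structure among target points; this plain lasso transform does not.)
codes = dict_learner.transform(target_X)       # shape: (30, 50)

# Step 3: feature selection -- drop basis dimensions whose coefficients are
# (near) zero across all target points. The abstract's "rows" correspond to
# columns here, since scikit-learn stores samples as rows.
keep = np.abs(codes).sum(axis=0) > 1e-8
reduced = codes[:, keep]

# Evaluate as in the paper: k-means clustering on the reduced representation.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(reduced)
print(reduced.shape, labels)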
Keyword Dimensionality reduction
Self-taught learning
Joint sparse coding
Manifold learning
Unsupervised learning
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ
Additional Notes Available online 4 August 2012.

Document type: Journal Article
Sub-type: Article (original research)
Collections: Official 2013 Collection
School of Information Technology and Electrical Engineering Publications
 
Citation counts: Cited 43 times in Thomson Reuters Web of Science
Cited 53 times in Scopus
Created: Sun, 25 Nov 2012, 00:36:34 EST by System User on behalf of School of Information Technology and Electrical Engineering