Learning Invariances for High-Dimensional Data Analysis

Baktashmotlagh, Mahsa (2014). Learning Invariances for High-Dimensional Data Analysis. PhD Thesis, School of Information Technology and Electrical Engineering, The University of Queensland. doi:10.14264/uql.2014.183


Author Baktashmotlagh, Mahsa
Thesis Title Learning Invariances for High-Dimensional Data Analysis
School, Centre or Institute School of Information Technology and Electrical Engineering
Institution The University of Queensland
DOI 10.14264/uql.2014.183
Publication date 2014
Thesis type PhD Thesis
Supervisor Brian C. Lovell
Abbas Bigdeli
Total pages 99
Language eng
Subjects 0801 Artificial Intelligence and Image Processing
0104 Statistics
Formatted abstract
Dimensionality reduction has emerged as one of the prominent fields of research since it provides a solution to a wide class of problems, such as compression, classification, regression, feature analysis, and visual recognition. Generally, subspace learning algorithms find a low-dimensional subspace of given high-dimensional data in which samples from different classes can be well separated.

In the past decades, several types of dimensionality reduction and subspace selection algorithms have been widely used for visual data analysis. Conventional algorithms such as principal component analysis (PCA) and linear discriminant analysis (LDA) perform well under the assumption that training and test data follow similar distributions. However, a main drawback remains: the distributions of the training data and the test data are mismatched when samples are drawn from different but related sources. These algorithms also fail to incorporate temporal statistical variations in the data distribution when applied to time-series data (videos). Consequently, they yield unsatisfactory recognition performance on real-world visual analysis problems.

To overcome the above issues, this research considers invariance and stationarity in subspace analysis for computer vision applications such as video retrieval, human behaviour analysis, event analysis, and activity recognition. More specifically, we address two challenging and fundamental tasks:
1. Video classification (scene, dynamic texture, and action recognition)
2. Visual domain adaptation.

To address the first task, we propose a subspace learning approach that focuses on extracting the stationary parts of all videos in the same class. The notion of stationarity is intuitively well-adapted to modelling the temporal nature of the video signal and lets us make use of many image features. Instead of modelling temporal information in the features, our method explicitly accounts for it when learning the latent space. As a consequence, the resulting video representation is particularly well-suited for classification purposes. Our experimental evaluation shows that our approach outperforms baselines from different groups of methods on several video classification tasks, such as dynamic texture recognition, scene classification, and action recognition.
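As a rough illustration of the stationarity idea (a simplified heuristic, not the thesis's exact formulation), the sketch below splits a video's per-frame descriptors into temporal epochs and keeps the directions along which the epoch means vary least; the function name and parameters are hypothetical.

```python
import numpy as np

def stationary_directions(frame_features, n_epochs=4, n_components=10):
    """Keep the directions in which epoch-wise means vary least.

    frame_features: (T, d) array of per-frame descriptors for one video.
    Returns a (d, n_components) matrix projecting onto approximately
    stationary directions. Simplified heuristic, not the thesis's method.
    """
    epochs = np.array_split(frame_features, n_epochs, axis=0)
    epoch_means = np.stack([e.mean(axis=0) for e in epochs])  # (n_epochs, d)
    centered = epoch_means - epoch_means.mean(axis=0)
    scatter = centered.T @ centered                           # between-epoch scatter (d, d)
    eigvals, eigvecs = np.linalg.eigh(scatter)                # eigenvalues in ascending order
    return eigvecs[:, :n_components]                          # least temporal variation first

# Hypothetical usage: pool a video's frames in the stationary subspace
# X = np.random.rand(120, 64)          # 120 frames, 64-dim descriptors
# W = stationary_directions(X)
# video_repr = (X @ W).mean(axis=0)    # pooled stationary representation
```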

The second task addressed by this work is domain adaptation for visual recognition. In our proposed approach, we follow the intuitive idea of learning a domain-invariant subspace by matching the distributions of the transformed training and test data using the Maximum Mean Discrepancy (MMD). This, we believe, makes better use of the expressive power of the kernel in MMD than approaches such as Transfer Component Analysis (TCA): although motivated by MMD, TCA measures the distance between the sample means in a lower-dimensional space rather than in the Reproducing Kernel Hilbert Space (RKHS), which somewhat contradicts the intuition behind the use of kernels. Furthermore, we extend the framework to the semi-supervised scenario. Experiments on benchmark domain adaptation datasets for visual recognition show that, in comparison to well-known methods, the proposed approach obtains a significant improvement in classification accuracy.
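For intuition, a minimal sketch of the MMD criterion with a Gaussian kernel is shown below. The thesis's approach additionally optimizes a projection so that the MMD between the projected source and target samples is small; that optimization is omitted here, and the function names and kernel bandwidth are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """RBF kernel matrix between the rows of A and B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * sigma**2))

def mmd2(source, target, sigma=1.0):
    """Biased estimate of the squared MMD between two samples in an RKHS."""
    return (gaussian_kernel(source, source, sigma).mean()
            + gaussian_kernel(target, target, sigma).mean()
            - 2.0 * gaussian_kernel(source, target, sigma).mean())

# In a subspace-learning setting, one would seek a projection W that keeps
# mmd2(Xs @ W, Xt @ W) small while preserving discriminative structure.
```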

To address the second task, we also propose a domain adaptation method that exploits the Riemannian structure of the statistical manifold when learning the invariant subspaces or samples. To this end, we introduce the use of the Hellinger distance, which is related to the geodesic distance on the space of probability distributions. While the Hellinger distance has been employed for dimensionality reduction, to the best of our knowledge, our approach is the first attempt at exploiting the Riemannian geometry of the statistical manifold for domain adaptation. We show that our sample-selection and subspace-based approaches, in conjunction with the Hellinger distance for distribution matching, consistently outperform similar approaches on the tasks of visual domain adaptation and WiFi localization.
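As a hedged illustration of this distribution-matching criterion, the snippet below computes the Hellinger distance between two normalized histograms (e.g., of projected source and target features); the binning choices and names are assumptions, not the thesis's implementation.

```python
import numpy as np

def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions p and q.

    p, q: non-negative arrays summing to 1 (e.g., normalized histograms of
    projected source/target features). The Hellinger distance is closely
    related to the geodesic distance on the statistical manifold.
    """
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

# Hypothetical usage: compare source and target histograms over shared bins
# p, _ = np.histogram(source_scores, bins=32, range=(0.0, 1.0))
# q, _ = np.histogram(target_scores, bins=32, range=(0.0, 1.0))
# d = hellinger_distance(p / p.sum(), q / q.sum())
```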
Keyword Machine learning
Computer vision
Time series analysis
Domain adaptation
Grassmann manifolds
Video classification
