Video Analysis Based on Learning on Special Manifolds for Visual Recognition

Shirazi, Sareh (2013). Video Analysis Based on Learning on Special Manifolds for Visual Recognition. PhD Thesis, School of Information Technology and Electrical Engineering, The University of Queensland.

       
Author: Shirazi, Sareh
Thesis Title: Video Analysis Based on Learning on Special Manifolds for Visual Recognition
School, Centre or Institute: School of Information Technology and Electrical Engineering
Institution: The University of Queensland
Publication date: 2013
Thesis type: PhD Thesis
Supervisor: Brian C. Lovell; Mehrtash T. Harandi
Total pages: 145
Language: eng
Subjects: 080106 Image Processing; 010401 Applied Statistics; 080104 Computer Vision
Formatted abstract
Computer vision has emerged as one of the prominent fields of research over the last few decades. It covers a wide range of applications, including face recognition, pedestrian detection, action recognition and tracking. However, it is a challenge to build effective systems that can handle occlusion, varying illumination, varying pose and other factors encountered in practical environments. Modelling video sequences by subspaces has recently shown promise for various computer vision applications because subspaces can accommodate the effects of image variations. The set of subspaces of a fixed dimension forms a curved, non-Euclidean Riemannian manifold known as a Grassmann manifold.
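
As an illustration of the subspace model described above, the following minimal sketch builds a subspace from a clip via the SVD and compares two subspaces through their principal angles. The function names, the synthetic data and the choice of the arc-length distance are assumptions made for this example; the abstract does not state which construction or metric the thesis uses.

```python
import numpy as np

def video_to_subspace(frames, dim):
    """Return an orthonormal basis (num_pixels x dim) for a clip.

    frames: array of shape (num_frames, num_pixels), one vectorised frame per row.
    """
    # Columns of U from the thin SVD are orthonormal directions in pixel space,
    # ordered by how much of the clip's variation they capture.
    u, _, _ = np.linalg.svd(frames.T, full_matrices=False)
    return u[:, :dim]

def principal_angles(basis_a, basis_b):
    """Principal angles (radians) between the spans of two orthonormal bases."""
    # Singular values of A^T B are the cosines of the principal angles.
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return np.arccos(np.clip(cosines, 0.0, 1.0))

def geodesic_distance(basis_a, basis_b):
    """Arc-length distance on the Grassmann manifold (2-norm of the angles)."""
    return float(np.linalg.norm(principal_angles(basis_a, basis_b)))

# Synthetic stand-ins for two clips: 20 frames of 64 pixels each.
rng = np.random.default_rng(0)
clip_a = rng.standard_normal((20, 64))
clip_b = rng.standard_normal((20, 64))
sub_a = video_to_subspace(clip_a, dim=3)
sub_b = video_to_subspace(clip_b, dim=3)
print(geodesic_distance(sub_a, sub_a))  # close to zero
print(geodesic_distance(sub_a, sub_b))  # clearly positive
```

The distance depends only on the span of each basis, not on the particular basis chosen, which is what makes the Grassmann manifold the natural setting for comparing subspaces.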

In this work, our aim is to address three core tasks that underpin many computer vision applications, such as content-based video analysis, security and surveillance, human-computer interaction, event analysis, human behaviour analysis and video retrieval. More specifically, we address the challenging and fundamental tasks of:
1. Visual recognition
2. Clustering
3. Visual tracking

To address the first task, we propose to embed Grassmann manifolds into Reproducing Kernel Hilbert Spaces (RKHS) and then tackle the problem of discriminant analysis on such manifolds. To obtain an efficient method, we present a graph-based local discriminant analysis that uses within-class and between-class similarity graphs to characterise intra-class compactness and inter-class separability. We also develop the proposed framework over Riemannian manifolds. Thorough experiments on face and object recognition, action recognition, texture classification and person re-identification indicate that the proposed method obtains a marked improvement in discrimination accuracy over several state-of-the-art methods.
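
One common way to embed Grassmann manifolds into an RKHS is the projection kernel, k(A, B) = ||AᵀB||²_F. The sketch below computes its Gram matrix for a handful of random subspaces. The kernel choice and the helper names are assumptions for illustration; the abstract does not say which kernel the thesis uses, and the graph-based discriminant analysis itself is not reproduced here.

```python
import numpy as np

def random_subspace(rng, ambient_dim, sub_dim):
    """Orthonormal basis of a random subspace (hypothetical demo helper)."""
    q, _ = np.linalg.qr(rng.standard_normal((ambient_dim, sub_dim)))
    return q

def projection_kernel(basis_a, basis_b):
    """k(A, B) = ||A^T B||_F^2 for orthonormal bases A and B."""
    return float(np.linalg.norm(basis_a.T @ basis_b, "fro") ** 2)

def gram_matrix(bases):
    """Symmetric kernel (Gram) matrix for a list of subspace bases."""
    n = len(bases)
    gram = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            gram[i, j] = gram[j, i] = projection_kernel(bases[i], bases[j])
    return gram

rng = np.random.default_rng(0)
bases = [random_subspace(rng, ambient_dim=30, sub_dim=4) for _ in range(6)]
K = gram_matrix(bases)
print(np.round(K, 2))
```

A Gram matrix of this kind can be passed to any kernel method, which is how discriminant analysis can be carried out without working on the manifold directly.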

The second task addressed by this work is clustering of data lying on Grassmann manifolds, which plays an essential role in data analysis. A novel clustering method is proposed by defining a measure of cluster distortion and embedding the manifolds such that the distortion is minimised. Furthermore, we extend the framework to the semi-supervised scenario. We show that the optimal solution is a generalised eigenvalue problem that can be solved very efficiently. We also develop the semi-supervised intrinsic Grassmann k-means algorithm and extend Locally Linear Embedding (LLE) and Laplacian Eigenmaps (LE) to Grassmann manifolds. Experiments on clustering synthetic data, human action sequences, face images, social behaviour and handwritten digits show that, in comparison to well-known methods, the proposed approach obtains a significant improvement in clustering accuracy while also being several orders of magnitude faster.
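
To give a flavour of how a generalised eigenvalue problem arises when clustering subspaces, the sketch below applies standard Laplacian Eigenmaps to an affinity matrix built from the projection kernel. This is a stand-in under stated assumptions, not the distortion-minimising formulation of the thesis: the affinity, the helper names and the synthetic two-group data are all chosen for this example.

```python
import numpy as np

def random_subspace_near(rng, centre, noise):
    """Orthonormal basis of a subspace near `centre` (hypothetical demo helper)."""
    q, _ = np.linalg.qr(centre + noise * rng.standard_normal(centre.shape))
    return q

def projection_affinity(bases):
    """Affinity W[i, j] = ||A_i^T A_j||_F^2 / p, which lies in [0, 1]."""
    p = bases[0].shape[1]
    n = len(bases)
    w = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            w[i, j] = np.linalg.norm(bases[i].T @ bases[j], "fro") ** 2 / p
    return w

def laplacian_eigenmaps(w, out_dim):
    """Solve L v = lambda D v and return the eigenvectors after the trivial one."""
    degree = w.sum(axis=1)
    laplacian = np.diag(degree) - w
    # Substituting v = D^{-1/2} u turns the generalised problem into the
    # standard symmetric problem D^{-1/2} L D^{-1/2} u = lambda u.
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    sym = d_inv_sqrt[:, None] * laplacian * d_inv_sqrt[None, :]
    _, u = np.linalg.eigh(sym)          # eigenvalues in ascending order
    v = d_inv_sqrt[:, None] * u
    return v[:, 1:out_dim + 1]          # skip the constant eigenvector

rng = np.random.default_rng(0)
centres = [np.linalg.qr(rng.standard_normal((30, 3)))[0] for _ in range(2)]
# First five subspaces lie near centre 0, the last five near centre 1.
bases = [random_subspace_near(rng, centres[k], 0.02) for k in (0, 1) for _ in range(5)]
embedding = laplacian_eigenmaps(projection_affinity(bases), out_dim=1)
labels = (embedding[:, 0] > 0).astype(int)  # sign of the first non-trivial eigenvector
print(labels)
```

Because the whole embedding comes from a single symmetric eigendecomposition, no iterative optimisation on the manifold is needed, which is one reason eigenvalue-based formulations can be fast.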

To address the third task, we propose a tracking approach based on affine subspaces. Subspaces are able to accommodate occlusion, pose and illumination variations, which is an essential precursor to obtaining a robust visual tracking system. Furthermore, we propose a novel approach to measuring the distance between affine subspaces via the non-Euclidean geometry of Grassmann manifolds. Quantitative evaluation on challenging video sequences indicates that the proposed approach obtains considerably better performance than several recent state-of-the-art methods such as Tracking-Learning-Detection and MILTrack.
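
The idea of matching a tracked object to candidate regions through affine subspaces can be sketched as follows. An affine subspace is represented here as a mean plus an orthonormal basis, and the distance is a weighted sum of the squared offset between means and a squared projection distance between bases. This particular distance, the `weight` parameter and all function names are assumptions for illustration; the abstract does not give the exact form used in the thesis.

```python
import numpy as np

def fit_affine_subspace(samples, dim):
    """Fit (mean, orthonormal basis) to samples of shape (num_samples, num_features)."""
    mean = samples.mean(axis=0)
    # Left singular vectors of the centred data span its main directions of variation.
    u, _, _ = np.linalg.svd((samples - mean).T, full_matrices=False)
    return mean, u[:, :dim]

def affine_subspace_distance(sub_a, sub_b, weight=1.0):
    """Hypothetical distance: squared mean offset plus a weighted Grassmann term."""
    mean_a, basis_a = sub_a
    mean_b, basis_b = sub_b
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    angles = np.arccos(np.clip(cosines, 0.0, 1.0))
    grassmann_term = np.sum(np.sin(angles) ** 2)   # squared projection distance
    mean_term = np.sum((mean_a - mean_b) ** 2)
    return float(mean_term + weight * grassmann_term)

def best_candidate(template, candidates, dim):
    """Index of the candidate patch set closest to the object's template."""
    dists = [affine_subspace_distance(template, fit_affine_subspace(c, dim))
             for c in candidates]
    return int(np.argmin(dists))

rng = np.random.default_rng(0)
target = rng.standard_normal((10, 50))             # 10 vectorised patches of the object
template = fit_affine_subspace(target, dim=3)
candidates = [
    rng.standard_normal((10, 50)) + 2.0,           # shifted distractor
    target + 0.01 * rng.standard_normal((10, 50)), # near-copy of the object
    rng.standard_normal((10, 50)),                 # unrelated patches
]
print(best_candidate(template, candidates, dim=3))
```

In a tracker, the candidates would come from image regions sampled around the previous object location, and the template would be updated as new frames arrive.
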
Keywords: Computer Vision; Machine Learning; Grassmann Manifolds; Classification; Clustering; Visual Tracking

 
Created: Tue, 04 Mar 2014, 13:56:22 EST by Sareh Abolahrari Shirazi on behalf of Scholarly Communication and Digitisation Service