Multiple features but few labels? A symbiotic solution exemplified for video analysis

Ma, Zhigang, Sebe, Nicu, Yang, Yi and Hauptmann, Alexander G. (2014). Multiple features but few labels? A symbiotic solution exemplified for video analysis. In: Kien A. Hua, Proceedings of the ACM International Conference on Multimedia. 22nd ACM International Conference on Multimedia, MM 2014, Orlando, FL United States, (77-86). 3-7 November 2014. doi:10.1145/2647868.2654907


Author Ma, Zhigang
Sebe, Nicu
Yang, Yi
Hauptmann, Alexander G.
Title of paper Multiple features but few labels? A symbiotic solution exemplified for video analysis
Conference name 22nd ACM International Conference on Multimedia, MM 2014
Conference location Orlando, FL United States
Conference dates 3-7 November 2014
Proceedings title Proceedings of the ACM International Conference on Multimedia
Series MM 2014 - Proceedings of the 2014 ACM Conference on Multimedia
Place of Publication New York, NY United States
Publisher Association for Computing Machinery, Inc
Publication Year 2014
Sub-type Fully published paper
DOI 10.1145/2647868.2654907
ISBN 9781450330633
Editor Kien A. Hua
Start page 77
End page 86
Total pages 10
Collection year 2015
Language eng
Formatted Abstract/Summary
Video analysis has attracted increasing research attention due to the proliferation of internet videos. In this paper, we investigate how to improve the performance of internet-quality video analysis. In particular, we address the scenario in which only a few labeled training videos are provided, which has received less attention in multimedia research. To begin with, we consider how to harness the evidence from low-level features more effectively. Researchers have developed several promising features for representing videos and capturing their semantic information. However, as videos usually carry rich semantic content, the analysis performance achievable with a single feature is potentially limited. Simply combining multiple features through early fusion or late fusion to incorporate more informative cues is feasible but not optimal, owing to the heterogeneity and differing predictive capability of these features. To better exploit multiple features, we propose to mine the importance of different features and cast it into the learning of the classification model. Our method is based on multiple graphs built from different features and uses the Riemannian metric to evaluate feature importance. In addition, to achieve respectable accuracy with limited labeled training videos, we formulate our method in a semi-supervised way. The main contribution of this paper is a novel scheme for evaluating feature importance, which is further cast into a unified framework for harnessing multiple weighted features with limited labeled training videos. We perform extensive experiments on video action recognition and multimedia event recognition, and the comparison with other state-of-the-art multi-feature learning algorithms validates the efficacy of our framework.
Keyword Multi feature
Riemannian distance
Semisupervised learning
Video analysis
Weighted features
Q-Index Code E1
Q-Index Status Confirmed Code
Institutional Status UQ
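
Illustrative sketch (not part of the repository record and not the authors' implementation): the abstract describes combining one graph per feature, weighting the graphs by an importance score derived from a Riemannian metric, and learning from few labels in a semi-supervised way. The Python below shows one way such a pipeline could look. Everything specific in it is an assumption made for illustration: the RBF affinity, the log-Euclidean distance as the Riemannian metric, the heuristic of down-weighting graphs that lie far from the average graph, the regularised propagation step, and all function names and parameters (`gamma`, `eps`, `mu`).

```python
import numpy as np

def rbf_affinity(X, gamma=1.0):
    """Dense RBF affinity graph for one feature view (assumed graph type)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-gamma * d2)
    np.fill_diagonal(W, 0.0)
    return W

def normalized_laplacian(W):
    """Symmetric normalized Laplacian I - D^{-1/2} W D^{-1/2}."""
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    return np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

def spd_log(M):
    """Matrix logarithm of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(M)
    return (vecs * np.log(np.maximum(vals, 1e-12))) @ vecs.T

def log_euclidean_distance(A, B):
    """Log-Euclidean distance, one common Riemannian metric on SPD matrices."""
    return np.linalg.norm(spd_log(A) - spd_log(B), ord="fro")

def graph_weights(laplacians, eps=1e-3):
    """Hypothetical importance heuristic: graphs closer to the mean graph
    (in log-Euclidean distance) receive larger weights. Assumption only."""
    n = laplacians[0].shape[0]
    spd = [L + eps * np.eye(n) for L in laplacians]  # shift to make SPD
    mean = sum(spd) / len(spd)
    dist = np.array([log_euclidean_distance(S, mean) for S in spd])
    w = np.exp(-(dist - dist.min()))  # shifted for numerical stability
    return w / w.sum()

def propagate(laplacians, weights, Y, labeled_mask, mu=10.0):
    """Semi-supervised step: minimise tr(F^T L F) + mu * ||U (F - Y)||^2
    over the weighted sum L of the per-feature Laplacians."""
    n = Y.shape[0]
    L = sum(w * Lk for w, Lk in zip(weights, laplacians))
    U = np.diag(labeled_mask.astype(float))
    F = np.linalg.solve(L + mu * U + 1e-8 * np.eye(n), mu * U @ Y)
    return F.argmax(axis=1)

# Toy usage: two well-separated classes, two feature views, one label per class.
rng = np.random.default_rng(0)
X1 = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(5.0, 0.3, (20, 2))])
X2 = 0.5 * X1 + rng.normal(0.0, 0.1, X1.shape)  # second, noisier view
truth = np.repeat([0, 1], 20)

labeled_mask = np.zeros(40, dtype=bool)
labeled_mask[[0, 20]] = True
Y = np.zeros((40, 2))
Y[0, 0] = 1.0
Y[20, 1] = 1.0

laplacians = [normalized_laplacian(rbf_affinity(X)) for X in (X1, X2)]
weights = graph_weights(laplacians)
pred = propagate(laplacians, weights, Y, labeled_mask)
print("graph weights:", weights)
print("accuracy:", (pred == truth).mean())
```

The sketch keeps the three ingredients separate (graph construction, importance weighting, propagation) so that any one of them can be swapped out; the paper itself should be consulted for the actual formulation.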

Document type: Conference Paper
Collections: Official 2015 Collection
Official Audit
School of Medicine Publications
Citation counts: Cited 2 times in Scopus
Created: Tue, 16 Dec 2014, 03:51:48 EST by System User on behalf of School of Medicine