Multimedia event detection using a classifier-specific intermediate representation

Ma, Zhigang, Yang, Yi, Sebe, Nicu, Zheng, Kai and Hauptmann, Alexander G. (2013) Multimedia event detection using a classifier-specific intermediate representation. IEEE Transactions on Multimedia, 15(7): 1628-1637. doi:10.1109/TMM.2013.2264928

Author Ma, Zhigang
Yang, Yi
Sebe, Nicu
Zheng, Kai
Hauptmann, Alexander G.
Title Multimedia event detection using a classifier-specific intermediate representation
Journal name IEEE Transactions on Multimedia
ISSN 1520-9210
Publication date 2013-11-01
Sub-type Article (original research)
DOI 10.1109/TMM.2013.2264928
Volume 15
Issue 7
Start page 1628
End page 1637
Total pages 10
Place of publication Piscataway, NJ, United States
Publisher Institute of Electrical and Electronics Engineers
Language eng
Abstract Multimedia event detection (MED) plays an important role in many applications such as video indexing and retrieval. Current event detection work mainly focuses on sports and news events or on abnormality detection in surveillance videos. In contrast, our research aims to detect more complicated and generic events within longer video sequences. In the past, researchers have proposed using intermediate concept classifiers with concept lexica to help understand videos. Yet it is difficult to judge how many concepts, and which ones, would be sufficient for a particular video analysis task. Additionally, obtaining robust semantic concept classifiers requires a large number of positive training examples, which in turn incurs a high human annotation cost. In this paper, we propose an approach that exploits external concept-based videos and event-based videos simultaneously to learn an intermediate representation from video features. Our algorithm integrates classifier inference and the latent intermediate representation into a joint framework. The joint optimization of the intermediate representation and the classifier makes them mutually beneficial, so that the two are tightly correlated. The classifier-dependent intermediate representation not only accurately reflects the task semantics but is also more suitable for the specific classifier. The result is a discriminative semantic analysis framework based on a tightly coupled intermediate representation. Extensive experiments on multimedia event detection using real-world videos demonstrate the effectiveness of the proposed approach.
Keyword Intermediate representation
Multimedia event detection
Video Retrieval
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ

Document type: Journal Article
Collections: Official 2014 Collection
School of Information Technology and Electrical Engineering Publications
Citation counts: Thomson Reuters Web of Science: cited 21 times
Scopus: cited 31 times
Created: Sun, 24 Nov 2013, 10:25:51 EST by System User on behalf of School of Information Technology and Electrical Engineering