Linear cross-modal hashing for efficient multimedia search

Zhu, Xiaofeng, Huang, Zi, Shen, Heng Tao and Zhao, Xin (2013). Linear cross-modal hashing for efficient multimedia search. In: MM '13: Proceedings of the 2013 ACM Multimedia Conference. ACM Multimedia 2013: The 21st ACM International Conference on Multimedia, Barcelona, Spain, (143-152). 21-25 October, 2013. doi:10.1145/2502081.2502107


Author Zhu, Xiaofeng
Huang, Zi
Shen, Heng Tao
Zhao, Xin
Title of paper Linear cross-modal hashing for efficient multimedia search
Conference name ACM Multimedia 2013: The 21st ACM International Conference on Multimedia
Conference location Barcelona, Spain
Conference dates 21-25 October, 2013
Proceedings title MM '13: Proceedings of the 2013 ACM Multimedia Conference
Journal name MM 2013 - Proceedings of the 2013 ACM Multimedia Conference
Place of Publication New York, NY, United States
Publisher The Association for Computing Machinery (ACM)
Publication Year 2013
Sub-type Fully published paper
DOI 10.1145/2502081.2502107
ISBN 9781450324045
Start page 143
End page 152
Total pages 10
Collection year 2014
Language eng
Formatted Abstract/Summary
Most existing cross-modal hashing methods suffer from a scalability issue in the training phase. In this paper, we propose a novel cross-modal hashing approach with time complexity linear in the training data size, to enable scalable indexing for multimedia search across multiple modals. Taking both the intra-similarity within each modal and the inter-similarity across different modals into consideration, the proposed approach aims to effectively learn hash functions from large-scale training datasets. More specifically, for each modal we first partition the training data into k clusters and then represent each training data point by its distances to the k cluster centroids. Interestingly, such a k-dimensional data representation reduces the time complexity of the training phase from the traditional O(n^2) or higher to O(n), where n is the training data size, making learning on large-scale datasets practical. We further prove that this new representation preserves the intra-similarity within each modal. To preserve the inter-similarity among data points across different modals, we transform the derived data representations into a common binary subspace in which binary codes from all the modals are "consistent" and comparable. The transformation simultaneously outputs the hash functions for all modals, which are used to convert unseen data into binary codes. Given a query of one modal, it is first mapped into binary codes using that modal's hash functions, and then matched against the database binary codes of any other modal. Experimental results on two benchmark datasets confirm the scalability and effectiveness of the proposed approach in comparison with the state of the art.
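To make the two training steps in the abstract concrete, below is a minimal, hypothetical Python sketch. The centroid-distance representation follows the abstract's description; the common-binary-subspace step is replaced here by a simple stand-in (a shared random target code fit by least squares), which is an illustrative assumption and not the paper's actual objective. All function names and parameters (centroid_distance_features, learn_hash_functions, num_bits, etc.) are invented for this sketch.

# Minimal, hypothetical sketch of the pipeline described in the abstract.
# NOT the authors' code: the k-means representation step follows the
# abstract, but the "common binary subspace" step is a simplified
# stand-in (shared random target codes + least-squares projections).
import numpy as np
from sklearn.cluster import KMeans

def centroid_distance_features(X, k, seed=0):
    # Represent each point by its distances to k cluster centroids.
    # Fitting k-means and computing the n x k distance matrix are both
    # linear in n, the source of the O(n) training cost claimed above.
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    return km.transform(X)  # (n, k) point-to-centroid distances

def learn_hash_functions(Z1, Z2, num_bits, seed=0):
    # Stand-in for the common-binary-subspace step (assumption, not LCMH):
    # pick a shared target code B and solve W_m = argmin ||Z_m W - B||^2
    # so that sign(Z_m W_m) is comparable across the two modals.
    rng = np.random.default_rng(seed)
    n = Z1.shape[0]
    B = np.sign(rng.standard_normal((n, num_bits)))  # shared target codes
    W1, *_ = np.linalg.lstsq(Z1, B, rcond=None)
    W2, *_ = np.linalg.lstsq(Z2, B, rcond=None)
    return W1, W2

def encode(Z, W):
    # Hash function: project the k-dim representation and binarize.
    return (Z @ W > 0).astype(np.uint8)

# Toy usage: 1000 paired "image"/"text" points in different feature spaces.
rng = np.random.default_rng(0)
X_img, X_txt = rng.standard_normal((1000, 64)), rng.standard_normal((1000, 32))
Z_img = centroid_distance_features(X_img, k=16)
Z_txt = centroid_distance_features(X_txt, k=16)
W_img, W_txt = learn_hash_functions(Z_img, Z_txt, num_bits=32)
codes_txt = encode(Z_txt, W_txt)       # database codes (text modal)
query_code = encode(Z_img[:1], W_img)  # image query mapped to binary codes
hamming = np.count_nonzero(codes_txt != query_code, axis=1)
print("nearest text item by Hamming distance:", int(hamming.argmin()))

Note that with k and the code length fixed, each step above touches every training point only a constant number of times, which is where the abstract's O(n) training complexity comes from; the paper's actual subspace-learning objective differs from this least-squares stand-in.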
Keyword Cross-modal
Hashing
Index
Multimedia search
Q-Index Code E1
Q-Index Status Confirmed Code
Institutional Status UQ

 
Citation counts: Cited 45 times in Scopus
Created: Fri, 22 Nov 2013, 09:43:19 EST by Dr Heng Tao Shen on behalf of the School of Information Technology and Electrical Engineering