Robust hashing with local models for approximate similarity search

Song, Jingkuan, Yang, Yi, Li, Xuelong, Huang, Zi and Yang, Yang (2014) Robust hashing with local models for approximate similarity search. IEEE Transactions on Cybernetics, 44 7: 1225-1236. doi:10.1109/TCYB.2013.2289351


Author Song, Jingkuan
Yang, Yi
Li, Xuelong
Huang, Zi
Yang, Yang
Title Robust hashing with local models for approximate similarity search
Journal name IEEE Transactions on Cybernetics   Check publisher's open access policy
ISSN 2168-2267
2168-2275
Publication date 2014-07
Year available 2014
Sub-type Article (original research)
DOI 10.1109/TCYB.2013.2289351
Open Access Status
Volume 44
Issue 7
Start page 1225
End page 1236
Total pages 12
Place of publication Piscataway NJ United States
Publisher Institute of Electrical and Electronics Engineers
Collection year 2015
Language eng
Formatted abstract
Similarity search plays an important role in many applications involving high-dimensional data. Due to the known dimensionality curse, the performance of most existing indexing structures degrades quickly as the feature dimensionality increases. Hashing methods, such as locality sensitive hashing (LSH) and its variants, have been widely used to achieve fast approximate similarity search by trading search quality for efficiency. However, most existing hashing methods make use of randomized algorithms to generate hash codes without considering the specific structural information in the data. In this paper, we propose a novel hashing method, namely, robust hashing with local models (RHLM), which learns a set of robust hash functions to map the high-dimensional data points into binary hash codes by effectively utilizing local structural information. In RHLM, for each individual data point in the training dataset, a local hashing model is learned and used to predict the hash codes of its neighboring data points. The local models from all the data points are globally aligned so that an optimal hash code can be assigned to each data point. After obtaining the hash codes of all the training data points, we design a robust method by employing ℓ2,1-norm minimization on the loss function to learn effective hash functions, which are then used to map each database point into its hash code. Given a query data point, the search process first maps it into the query hash code by the hash functions and then explores the buckets, which have similar hash codes to the query hash code. Extensive experimental results conducted on real-life datasets show that the proposed RHLM outperforms the state-of-the-art methods in terms of search quality and efficiency.
Keyword Approximate similarity search
Indexing
Robust hashing
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ

Document type: Journal Article
Sub-type: Article (original research)
Collections: Official 2015 Collection
School of Information Technology and Electrical Engineering Publications
 
Versions
Version Filter Type
Citation counts: TR Web of Science Citation Count  Cited 13 times in Thomson Reuters Web of Science Article | Citations
Scopus Citation Count Cited 18 times in Scopus Article | Citations
Google Scholar Search Google Scholar
Created: Tue, 15 Jul 2014, 01:57:54 EST by System User on behalf of School of Information Technol and Elec Engineering