Efficient histogram-based similarity search in ultra-high dimensional space

Liu, Jiajun, Huang, Zi, Shen, Heng Tao and Zhou, Xiaofang (2011). Efficient histogram-based similarity search in ultra-high dimensional space. In: Database Systems for Advanced Applications: 16th International Conference Proceedings: Part 2. 16th International Conference on Database Systems for Advanced Applications, DASFAA 2011, Hong Kong, China, (1-15). 22-25 April 2011. doi:10.1007/978-3-642-20152-3_1


Author Liu, Jiajun
Huang, Zi
Shen, Heng Tao
Zhou, Xiaofang
Title of paper Efficient histogram-based similarity search in ultra-high dimensional space
Conference name 16th International Conference on Database Systems for Advanced Applications, DASFAA 2011
Conference location Hong Kong, China
Conference dates 22-25 April 2011
Proceedings title Database Systems for Advanced Applications: 16th International Conference Proceedings: Part 2
Journal name Lecture Notes in Computer Science
Place of Publication Heidelberg, Germany
Publisher Springer
Publication Year 2011
Sub-type Fully published paper
DOI 10.1007/978-3-642-20152-3_1
Open Access Status
ISBN 9783642201516
ISSN 0302-9743
1611-3349
Volume 6588
Start page 1
End page 15
Total pages 15
Collection year 2012
Language eng
Abstract/Summary Recent developments in image content analysis have shown that the dimensionality of an image feature can reach thousands or more for satisfactory results in some applications such as face recognition. Although high-dimensional indexing has been extensively studied in the database literature, most existing methods are tested on feature spaces with fewer than hundreds of dimensions, and their performance degrades quickly as dimensionality increases. Given the huge popularity of histogram features in representing image content, in this paper we propose a novel indexing structure for efficient histogram-based similarity search in ultra-high dimensional space that is also sparse. Observing that all possible histogram values in a domain form a finite set of discrete states, we leverage the time and space efficiency of the inverted file. Our new structure, named the two-tier inverted file, indexes the data space in two levels, where the first level represents the list of occurring states for each individual dimension, and the second level represents the list of occurring images for each state. In the query process, candidates can be quickly identified with a simple weighted state-voting scheme before their actual distances to the query are computed. To further enrich the discriminative power of the inverted file, an effective state expansion method is also introduced by taking neighbor dimensions’ information into consideration. Our extensive experimental results on real-life face datasets with 15,488-dimensional histogram features demonstrate the high accuracy and the great performance improvement of our proposal over existing methods.
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ
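
Illustrative sketch The two-level structure and the vote-then-verify query described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the sparse dict representation of a histogram, the uniform default weight, the candidate cut-off and the L1 distance are all assumptions, since the abstract does not specify them. The state expansion step is not sketched.

```python
from collections import defaultdict


def build_index(histograms):
    """Build a two-level index: dimension -> state -> list of image ids.

    `histograms` maps an image id to a sparse histogram, given as a dict
    {dimension: state} holding only the non-zero bins (assumed layout).
    """
    index = defaultdict(lambda: defaultdict(list))
    for image_id, hist in histograms.items():
        for dim, state in hist.items():
            index[dim][state].append(image_id)
    return index


def vote(index, query, weight=lambda dim, state: 1.0):
    """Accumulate a weighted vote for every image sharing a (dimension, state)
    pair with the query. The uniform default weight is an assumption."""
    votes = defaultdict(float)
    for dim, state in query.items():
        states = index.get(dim)
        if states is None:
            continue
        for image_id in states.get(state, ()):
            votes[image_id] += weight(dim, state)
    return votes


def l1_distance(a, b):
    """L1 distance between two sparse histograms (assumed metric)."""
    return sum(abs(a.get(d, 0) - b.get(d, 0)) for d in set(a) | set(b))


def search(index, histograms, query, k=1, n_candidates=10):
    """Pick the top-voted candidates, then rank them by actual distance."""
    votes = vote(index, query)
    candidates = sorted(votes, key=lambda i: (-votes[i], i))[:n_candidates]
    ranked = sorted(candidates, key=lambda i: (l1_distance(histograms[i], query), i))
    return ranked[:k]
```

Only images that share at least one occurring state with the query receive votes, so the exact distance is computed for a small candidate set rather than for the whole collection.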

 
Citation counts: Thomson Reuters Web of Science: cited 0 times
Scopus: cited 1 time
Created: Wed, 01 Jun 2011, 10:46:29 EST by Ms Dulcie Stewart on behalf of School of Information Technol and Elec Engineering