A framework for ranking and KNN queries in a probabilistic skyline model

Li, Jianguo, Fung, Gabriel Pui Cheong, Zhou, Wei and Huang, Weiping (2015) A framework for ranking and KNN queries in a probabilistic skyline model. Journal of Computational Information Systems, 11 6: 2057-2084. doi:10.12733/jcis13726

Attached Files (Some files may be inaccessible until you login with your UQ eSpace credentials)
Name Description MIMEType Size Downloads

Author Li, Jianguo
Fung, Gabriel Pui Cheong
Zhou, Wei
Huang, Weiping
Title A framework for ranking and KNN queries in a probabilistic skyline model
Journal name Journal of Computational Information Systems
ISSN 2374-7447
1553-9105
Publication date 2015-03-15
Sub-type Article (original research)
DOI 10.12733/jcis13726
Open Access Status Not yet assessed
Volume 11
Issue 6
Start page 2057
End page 2084
Total pages 28
Place of publication Norwalk, CT, United States
Publisher Binary Information Press
Collection year 2016
Language eng
Abstract Skyline computation has gained a lot of attention in recent years. According to the definition of skyline, objects that belong to skyline cannot be ranked among themselves because they are incomparable. This constraint limits the application of skyline. Fortunately, due to the recently proposed probabilistic skyline model, skyline objects which contain multiple elements, can now be compared with each others. Different from the traditional skyline model where each object can either be a skyline object or not, in the probabilistic skyline model, each object is assigned a skyline probability to denote its likelihood of being a skyline object. Under this model, two simple but important questions will naturally be asked: (1) Given an object, which of the objects are the K nearest neighbors to it based on their skyline probabilities? (2) Given an object, what is the ranking of the objects which have skyline probabilities greater than the given object? To the best of our knowledge, no existing work can effectively answer these two questions. Yet, answering them is not trivial. For a medium-size dataset (e.g. 10,000 objects), it may take more than an hour to compute the skyline probabilities of all objects. In this paper, we propose a novel framework to answering the above two questions on the fly efficiently. Our proposed work is based on the idea of bounding-pruning-refining strategy. We first compute the skyline probabilities of the target object and all its elements. For the rest of the objects, instead of computing their accurate skyline probabilities, we compute the upper bound and lower bound skyline probabilities using the elements of the target object. Based on lower bound and upper bound of their skyline probabilities, some objects, which cannot be in the result, will be pruned. For those objects, which we are unknown whether they are in the results or not, we need to refine their bounds. The refinement strategy is based on the idea of space partition. Specifically, we first partition the whole dataspace into several subspaces based on the distribution of elements in the target object. When we iteratively do the the refinement of the bounds, we will do the partitioning strategy in each subspace. In order to implement this framework, a novel tree, called Space Partition Tree (SPTree) is proposed to index the objects and their elements. We evaluate our proposed work using three synthetic datasets and one real-life dataset. We report all our findings in this paper.
Keyword Bounding-pruning-refining
K nearest neighbors
p-Dominant
Probabilistic skyline
Ranking
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ
Additional Notes http://www.jofcis.com/

Document type: Journal Article
Sub-type: Article (original research)
Collections: Official 2016 Collection
School of Information Technology and Electrical Engineering Publications
 
Versions
Version Filter Type
Citation counts: Scopus Citation Count Cited 0 times in Scopus Article
Google Scholar Search Google Scholar
Created: Tue, 09 Jun 2015, 01:46:37 EST by System User on behalf of Scholarly Communication and Digitisation Service