LShape Partitioning: Parallel Skyline Query Processing Using MapReduce

IEEE Transactions on Knowledge and Data Engineering (2022)

A skyline query retrieves the data points in a dataset that are not dominated by any other point. It is widely adopted in applications that require multi-criteria decision making. However, skyline query processing is considerably time-consuming for high-dimensional, large-scale datasets. Parallel computing techniques are therefore needed to address this challenge, and MapReduce is one of the most popular frameworks for processing big data. Many efficient MapReduce skyline algorithms have been proposed in the literature, and most of their designs focus on partitioning and pruning the given dataset. However, there are still opportunities for further parallelism. In this study, we propose two parallel skyline processing algorithms that use a novel LShape partitioning strategy and an effective Propagation Filtering method. The two algorithms are 2Phase LShape, designed for multiple reducers, and 1Phase LShape, designed for a single reducer. Through extensive experiments, we verify that our algorithms outperform the state-of-the-art approaches, especially for high-dimensional, large-scale datasets.
parallel skyline query processing, MapReduce, LShape partitioning
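
To make the dominance definition in the abstract concrete, the following is a minimal Python sketch of a naive skyline computation, followed by a generic "local skyline per partition, then merge" step of the kind a MapReduce job might perform. It is not the LShape partitioning or Propagation Filtering method, whose details the abstract does not give. The function names, the round-robin partitioning, and the convention that smaller values are better in every dimension are all assumptions made for illustration.

```python
from typing import List, Sequence

Point = Sequence[float]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q.

    Assumption: smaller is better in every dimension. p dominates q when
    p is no worse in all dimensions and strictly better in at least one.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def partitioned_skyline(points: List[Point], num_partitions: int) -> List[Point]:
    """Generic divide-and-merge skyline (hypothetical, not the paper's method).

    "Map" side: split the data round-robin and compute a local skyline per
    partition. "Reduce" side: compute the skyline of the surviving candidates.
    """
    chunks = [points[i::num_partitions] for i in range(num_partitions)]
    candidates = [p for chunk in chunks for p in skyline(chunk)]
    return skyline(candidates)


points = [(1, 4), (2, 2), (4, 1), (3, 3), (4, 4)]
global_result = skyline(points)
merged_result = partitioned_skyline(points, num_partitions=2)
```

In this example, `(3, 3)` and `(4, 4)` are both dominated by `(2, 2)`, so only `(1, 4)`, `(2, 2)`, and `(4, 1)` remain. The merge step gives the same set as the global computation because dominance is transitive: a point pruned inside a partition is dominated by a point that is either in the final skyline or is itself dominated by one that is.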