Optimizing Data Migration Using Online Clustering

Jingfeng Pan, Yunfei Peng, Kaiyu Li, Aijun An, Xiaohui Yu, Dariusz Jania

CASCON '23: Proceedings of the 33rd Annual International Conference on Computer Science and Software Engineering (2023)

Abstract
Data migration refers to the transfer of data from one location to another, for instance, from a local database to a cloud server or from one cloud to another. To minimize business disruption during this process, data migration must achieve high throughput. However, current methods simply compress the data into smaller files and transfer them over the network, without exploiting the data distribution to increase the compression ratio further, which results in low overall throughput. In this paper, we present a three-step approach to improve data migration throughput for relational databases. The approach clusters the records into groups, compresses each group, and transmits the compressed files over the network. Clustering similar records together increases the compression ratio within each group, which yields higher overall compression and lower network transmission time. Clustering is worthwhile for the data migration task whenever the time spent on clustering is less than the network transmission time it saves. We propose to use an online k-prototype clustering method and a workload-balancing strategy. Experiments on benchmark datasets show that, on average, our method improves the compression ratio by 4% and throughput by over 3% compared with the baseline approach.
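The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It makes several assumptions: the data is synthetic, grouping by a known label stands in for the online k-prototype clustering step, zlib stands in for whatever compressor a real pipeline would use, and the saved transmission time is modeled simply as bytes saved divided by bandwidth. All function names and parameters are hypothetical.

```python
import random
import string
import zlib


def make_records(n_groups=200, per_group=20, seed=0):
    """Build synthetic (label, row) pairs; rows in one group share a long prefix."""
    rng = random.Random(seed)
    alphabet = string.ascii_letters + string.digits
    records = []
    for label in range(n_groups):
        # Hypothetical shared column values for this group of similar rows.
        shared = "".join(rng.choice(alphabet) for _ in range(200))
        for _ in range(per_group):
            records.append((label, f"{shared},{rng.randrange(10**6):06d}"))
    rng.shuffle(records)  # arrival order mixes the groups together
    return records


def compressed_size(rows):
    """Size in bytes of the zlib-compressed, newline-joined rows."""
    return len(zlib.compress("\n".join(rows).encode("utf-8")))


def baseline_size(records):
    """Baseline: compress all rows in arrival order as one file."""
    return compressed_size([row for _, row in records])


def grouped_size(records):
    """Group rows by label (a stand-in for clustering), compress each group
    separately, and sum the compressed sizes."""
    groups = {}
    for label, row in records:
        groups.setdefault(label, []).append(row)
    return sum(compressed_size(rows) for rows in groups.values())


def clustering_pays_off(clustering_seconds, bytes_saved, bandwidth_bytes_per_s):
    """True if clustering costs less time than the transfer time it saves
    (assumed model: saved time = bytes saved / bandwidth)."""
    return clustering_seconds < bytes_saved / bandwidth_bytes_per_s


if __name__ == "__main__":
    records = make_records()
    base, grouped = baseline_size(records), grouped_size(records)
    print(f"baseline: {base} bytes, grouped: {grouped} bytes")
    # 0.5 s of clustering and a 1 MB/s link are illustrative values only.
    print("worth it:", clustering_pays_off(0.5, base - grouped, 1_000_000))
```

On this synthetic data the grouped files come out smaller because similar rows sit next to each other, so the compressor can reuse recent matches. The gain here is exaggerated by construction and says nothing about the 4% figure reported for the benchmark datasets.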