Efficient data persistence and data division for distributed computing in cloud data center networks

Journal of Supercomputing (2023)

Abstract
Container-based Hadoop Distributed File System (HDFS) storage is widely used in cloud data center networks, but traditional HDFS has a single point of failure that can make the entire cluster unavailable. In this paper, we study the storage reliability of a Docker container-based HDFS cluster subject to a single point of failure. First, we investigate a data volume-based persistence solution (DVPS) for Hadoop that addresses the single point of failure and the single-backup strategy of the HDFS cluster. Second, we propose an HDFS-based replica placement algorithm for data storage that takes the performance of both the host and the container nodes into account. Third, we design the KADC-KNN data segmentation algorithm to store the persistent data of the Docker containers effectively. Extensive experimental results show that the proposed approach ensures stable storage and fast migration of cluster data. Compared with the state-of-the-art algorithm, the data volume persistence algorithm DVPS improves data reliability by 19.8%. The data partitioning algorithm KADC-KNN improves partitioning accuracy by 20.2% and has lower time overhead.
Keywords
Hadoop, Data persistence, Data storage, Federated distributed file storage