Optimal Copyset in Distributed Object Storage.

IEEE BigData (2021)

Abstract
In distributed storage systems, replication is commonly used to ensure system reliability and data availability. Random replication is widely used in cloud storage systems to prevent data loss. Copyset Replication (CR) is a replication strategy that makes a nearly optimal trade-off between the number of nodes across which data is scattered and the probability of data loss. Compared with random replication, CR greatly reduces the probability of data loss caused by node failures. However, because CR selects copysets at random, it is difficult to choose the optimal copyset based on data characteristics such as computation or storage requirements. To address this problem, this paper proposes Optimal Copyset Replication (OCR), which selects the optimal copyset according to the specified data characteristics and the conditions of the corresponding nodes. Combined with Cyberspace Mimic Defense (CMD), we implemented OCR in a distributed object storage system and conducted experiments. When the number of computation-type data items reaches 300,000, the experimental results show that, compared with CR's random copyset selection, OCR reduces data processing time by nearly 10% by selecting the optimal copyset. By setting the relevant parameters, OCR can also keep the data distribution across nodes relatively uniform and avoid data skew.
Keywords
Copyset Replication, optimal algorithm, data loss, Object Storage Service, Cyberspace Mimic Defense
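
The contrast the abstract draws between random copyset selection and selection driven by node conditions can be illustrated with a minimal Python sketch. It is not the paper's implementation. The copysets are built with the permutation scheme from the original Copyset Replication work, and the scoring rule (the sum of a per-node `load` value, lower is better) is an assumption made only for illustration, since the abstract does not specify OCR's actual criterion. All function and parameter names are hypothetical.

```python
import random


def build_copysets(nodes, replicas, scatter_width, seed=0):
    """Permutation-based copyset generation, as in Copyset Replication.

    Assumes len(nodes) is a multiple of `replicas`, so that every node
    lands in a copyset in every permutation.
    """
    rng = random.Random(seed)
    # Each permutation adds (replicas - 1) to a node's scatter width,
    # so ceil(scatter_width / (replicas - 1)) permutations are needed.
    permutations = max(1, -(-scatter_width // (replicas - 1)))
    copysets = set()
    for _ in range(permutations):
        order = list(nodes)
        rng.shuffle(order)
        # Cut the permutation into consecutive groups of `replicas` nodes.
        for i in range(0, len(order) - replicas + 1, replicas):
            copysets.add(frozenset(order[i:i + replicas]))
    return sorted(copysets, key=sorted)


def pick_random(copysets, primary, rng):
    """CR-style choice: any copyset that contains the primary node."""
    candidates = [c for c in copysets if primary in c]
    return rng.choice(candidates)


def pick_optimal(copysets, primary, load):
    """Assumed OCR-style choice: the candidate with the lowest total load.

    `load` maps node -> a cost for the data type being placed
    (hypothetical metric, e.g. CPU load for computation-type data).
    """
    candidates = [c for c in copysets if primary in c]
    return min(candidates, key=lambda c: (sum(load[n] for n in c), sorted(c)))


if __name__ == "__main__":
    nodes = list(range(9))
    copysets = build_copysets(nodes, replicas=3, scatter_width=4)
    load = {n: random.Random(n).random() for n in nodes}  # made-up loads
    print("random :", sorted(pick_random(copysets, 0, random.Random(1))))
    print("optimal:", sorted(pick_optimal(copysets, 0, load)))
```

Both selectors draw from the same fixed set of copysets, so the scored choice keeps the data-loss properties of CR and changes only which eligible copyset receives the data. Keeping the data distribution uniform, as the abstract mentions, would require an additional term in the score that this sketch omits.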