谷歌Chrome浏览器插件
订阅小程序
在清言上使用

Characterizing Disk Health Degradation and Proactively Protecting Against Disk Failures for Reliable Storage Systems

2019 IEEE International Conference on Autonomic Computing (ICAC)(2019)

引用 12|浏览13
暂无评分
摘要
The booming of cloud computing, online services and big data applications have resulted in dramatic expansion of storage systems. Meanwhile, disk drives are reported to be the most commonly replaced hardware component. Disk failures cause service downtime and even data loss, costing enterprises multi-trillion dollars per year. Existing disk failure management approaches are mostly reactive and incur high overheads. To overcome these problems, in this paper, we present a proactive, cost-effective solution to managing large-scale production storage systems. We aim to uncover the entire process in which disk's health deteriorates and forecast when disk drives will fail in the future. Due to a common lack of diagnostic information of disk failures, we rely on the Self-Monitoring, Analysis and Reporting Technology (SMART) data and explore statistical analysis techniques to identify the start of disk degradation. We then model the disk degradation processes as functions of SMART attributes, which eliminates the dependency on time and thus I/O workload. Experimental results from over 23,000 enterprise-class disk drives in a production data center show that our derived models can accurately quantify the degradation of disk health, which enables us to proactively protect data against disk failures. We also investigate several types of disk failures and propose remediation mechanisms to prolong disk lifetime.
更多
查看译文
关键词
storage reliability,disk failures,health degradation,proactive prediction,machine learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要