Reliability of Centralized vs. Parallel Software Models for Composable Storage Systems

2021 IEEE 21st International Conference on Software Quality, Reliability and Security (QRS 2021)

Abstract
Modern storage systems consist of many hardware and software components. At the core of these systems are server drawers containing data, at least one of which holds parity (two mirrored drawers are a special case). We analyze the failure rate of two such systems, both based on hyperconverged architectures: one centralized, in which the drawers share a single metadata server, and one parallel, in which each drawer has its own metadata server. The parallel system is inherently more reliable. However, the new CXL and Gen-Z architectures enable a centralized approach in which resources from multiple servers are combined into a single virtual server. In this paper we analyze which techniques can bring the probability of failure of the centralized approach close to that of the parallel approach. We identify the probability of Dual In-Line Memory Module (DIMM) failure as the key differentiator between the failure probabilities of the centralized and parallel systems, and we suggest methods to compensate for DIMMs with a high probability of failure.
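The contrast between the two architectures can be illustrated with a toy calculation. The sketch below is not the paper's model. It assumes independent failures, a system that tolerates the loss of exactly one drawer (single parity), and hypothetical probabilities `p_drawer` and `p_meta` in which all metadata-server failure causes, DIMMs included, are lumped into `p_meta`. All function names are illustrative.

```python
def p_at_most_one_failure(n, p):
    """Probability that at most one of n independent units fails."""
    return (1 - p) ** n + n * p * (1 - p) ** (n - 1)


def p_fail_centralized(n, p_drawer, p_meta):
    """Toy model: one shared metadata server is a single point of failure.

    The system survives only if the shared metadata server is up AND
    at most one of the n drawers has failed.
    """
    survive = (1 - p_meta) * p_at_most_one_failure(n, p_drawer)
    return 1 - survive


def p_fail_parallel(n, p_drawer, p_meta):
    """Toy model: each drawer carries its own metadata server.

    A drawer unit is lost if either the drawer or its own metadata
    server fails; parity covers the loss of at most one unit.
    """
    p_unit = 1 - (1 - p_drawer) * (1 - p_meta)
    return 1 - p_at_most_one_failure(n, p_unit)


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    n, p_drawer, p_meta = 5, 0.01, 0.02
    print("centralized:", p_fail_centralized(n, p_drawer, p_meta))
    print("parallel:   ", p_fail_parallel(n, p_drawer, p_meta))
```

For these example values the parallel configuration has the lower failure probability, because a metadata-server failure costs only one drawer that parity can absorb. As `p_meta` approaches zero the two results converge, which matches the abstract's focus on reducing the impact of DIMM failures in the centralized design.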
Key words
Hyperconverged architectures, hyper-converged infrastructure (HCI), cloud applications, DIMM failure rate, metadata server, composable systems