BM-Store: A Transparent and High-performance Local Storage Architecture for Bare-metal Clouds Enabling Large-scale Deployment.

HPCA (2023)

Abstract
Bare-metal instances are crucial for high-value, mission-critical applications on the cloud. Tenants have exclusive use of these dedicated hardware resources. Local virtualized disks are essential for bare-metal instances because they provide flexible, high-performance storage resources. Traditionally, tenants can choose polling-based software virtualization techniques, but these consume too many valuable host CPU cores and suffer from performance degradation. It is hard for cloud vendors to deploy existing hardware-assisted local storage solutions on bare-metal instances because they have no access to the host OS to install customized drivers. Moreover, cloud vendors have difficulty managing and maintaining the local storage devices in bare-metal instances because the hardware resources and host operating systems are used entirely by tenants, which impacts the availability of the storage devices. This paper presents our design of and experience with BM-Store, a novel high-performance hardware-assisted virtual local storage architecture for bare-metal clouds. BM-Store is transparent to the host, so tenants are unaware of the underlying hardware architecture. It can therefore be deployed on a large scale by cloud vendors. BM-Store consists of two components: an FPGA-based BMS-Engine and an ARM-based BMS-Controller. The BMS-Engine accelerates the I/O path to enable high-performance virtual storage that is independent of the disk devices and consumes no CPU resources on the host. The BMS-Controller is responsible for resource management and maintenance to achieve flexible and highly available local storage. Extensive experiments show that BM-Store achieves near-native performance, introducing only about 3 µs of extra latency and an average throughput overhead of 4.0% relative to native disks. Compared to SPDK vhost, BM-Store achieves an average bandwidth improvement of 15.7% in microbenchmarks and a maximum throughput improvement of 13.4% in real-world applications.