Building and Evaluation of Cloud Storage and Datasets Services on AI and HPC Converged Infrastructure

2020 IEEE International Conference on Big Data (Big Data), 2020

Abstract
AI Bridging Cloud Infrastructure (ABCI) is a world-leading open AI computing infrastructure for accelerating R&D in artificial intelligence. To make AI software assets easy to share and reuse, ABCI supports container-based application deployment and fine-grained resource allocation on top of a conventional HPC architecture, and provides tens of petabytes of high-performance storage. One of the major ongoing challenges in ABCI, however, is to exchange and share machine learning models and AI-related data more efficiently and flexibly with services deployed outside ABCI in the real world. Our new services, ABCI Cloud Storage and ABCI Public Datasets, are designed to tackle this challenge and to serve as the "Data Harbor" of ABCI. The services allow users to store the input and output data of jobs run on the ABCI compute nodes, and to share that data with both ABCI users and non-ABCI users. This paper presents the design of the services and their integration into a conventional HPC architecture, using ABCI as a case study, and reports a performance evaluation. Based on this experience, the paper concludes with a discussion of future directions for S3-based front data/storage services in AI and HPC converged systems.
Keywords
cloud storage, S3, AI & HPC converged system, dataset sharing, object storage