On Realizing Efficient Deep Learning Using Serverless Computing

22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 2022

Abstract
Serverless computing is gaining rapid popularity because it enables quick application deployment and seamless application scaling without managing complex computing resources. Recently, it has been explored for running data-intensive workloads, e.g., deep learning (DL), to improve application performance and reduce execution cost. However, serverless computing imposes resource-level constraints, specifically fixed memory allocation and short task timeouts, that lead to job failures. In this paper, we address these constraints and develop an effective runtime framework, DiSDeL, that improves the performance of DL jobs by leveraging data splitting techniques, ensuring that an appropriate amount of memory is allocated to containers for storing application data, and selecting a suitable timeout for each job based on its complexity in serverless deployments. We implement our approach using the Apache OpenWhisk and TensorFlow platforms and evaluate it on representative DL workloads, showing that it eliminates DL job failures and reduces action memory consumption and total training time by up to 44% and 46%, respectively, compared with a default serverless computing framework. Our evaluation also shows that DiSDeL achieves a performance improvement of up to 29% over a bare-metal TensorFlow environment in a multi-tenant setting.
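The abstract gives no implementation details, so the following is only a minimal sketch of the idea it describes: partition a training set into shards and run one OpenWhisk action per shard, sizing each action's memory to the shard it must hold and picking a timeout from a crude per-sample cost estimate. The action name "train-shard" and all constants are illustrative assumptions, not DiSDeL's actual code or cost model.

```python
# Minimal sketch (not the paper's implementation): data splitting for
# serverless DL training on OpenWhisk, with per-job memory and timeout
# sizing driven by shard size. All names and constants are assumptions.
import math
import subprocess

NUM_SHARDS = 4            # assumed degree of data parallelism
BYTES_PER_SAMPLE = 3072   # e.g., a 32x32x3 image; assumption
BASE_MEMORY_MB = 256      # per-container runtime overhead; assumption
MS_PER_SAMPLE = 20        # stand-in for the paper's job-complexity model
MAX_TIMEOUT_MS = 300_000  # common OpenWhisk cap on action timeouts

def shard_bounds(num_samples: int, num_shards: int):
    """Yield (start, end) index ranges that partition the dataset."""
    per_shard = math.ceil(num_samples / num_shards)
    for i in range(num_shards):
        yield i * per_shard, min((i + 1) * per_shard, num_samples)

def estimate_memory_mb(shard_size: int) -> int:
    """Size the container's memory to the shard it stores, plus overhead."""
    data_mb = math.ceil(shard_size * BYTES_PER_SAMPLE / (1024 * 1024))
    return BASE_MEMORY_MB + data_mb

def estimate_timeout_ms(shard_size: int) -> int:
    """Pick a timeout from a crude per-sample cost estimate, capped at the limit."""
    return min(10_000 + shard_size * MS_PER_SAMPLE, MAX_TIMEOUT_MS)

def run_shard(start: int, end: int) -> None:
    """Update the action's resource limits, then invoke it on one shard."""
    shard_size = end - start
    # 'wsk action update' sets the memory (MB) and timeout (ms) limits on an
    # existing action; 'wsk action invoke' then runs it with shard parameters.
    subprocess.run(
        ["wsk", "action", "update", "train-shard",
         "--memory", str(estimate_memory_mb(shard_size)),
         "--timeout", str(estimate_timeout_ms(shard_size))],
        check=True,
    )
    subprocess.run(
        ["wsk", "action", "invoke", "train-shard",
         "--param", "start", str(start),
         "--param", "end", str(end)],
        check=True,
    )

if __name__ == "__main__":
    for start, end in shard_bounds(num_samples=50_000, num_shards=NUM_SHARDS):
        run_shard(start, end)
```

A runtime framework like DiSDeL would presumably make these decisions inside the platform rather than through the CLI as shown here; the sketch only illustrates how shard size can drive both the memory limit and the timeout for each serverless training job.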
Keywords
Data-intensive Computing, Serverless Computing, Deep Learning, Data Parallelism, OpenWhisk, TensorFlow