Development of Big Data Multi-VM Platform for Rapid Prototyping of Distributed Deep Learning.

Lecture Notes in Computer Science (2018)

Abstract
This study uses VirtualBox virtualization to develop a personal big data multi-VM platform with a four-node Spark and Hadoop cluster, giving developers an environment in which they can easily design and implement Spark and Hadoop Map/Reduce programs. Before running their big data and deep learning applications on a physical multi-node Spark and Hadoop cluster, developers can carry out Map/Reduce programming on the proposed multi-VM platform, which replicates the physical cluster environment. To demonstrate the platform's capability and applicability, this study uses a deep learning application as an illustrative example: the big data multi-VM platform supports rapid prototyping of distributed deep learning for AI developers through the TensorFlowOnSpark (TFoS) framework. For deeper insight, this study runs a deep learning benchmark on three types of systems: the multi-node big data VM platform, a physical standalone system, and a physical small-cluster system. The results indicate that InputMode.SPARK runs 3.3 times faster than InputMode.TENSORFLOW on the big data VM platform and 6.1 times faster on the physical server.
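The two input modes compared in the benchmark differ in who feeds data to the TensorFlow workers. The following is a minimal sketch of how a TensorFlowOnSpark job might switch between them, not the benchmark code used in the study. The helper names `input_mode_name` and `run_training` are hypothetical, and `sc`, `map_fun`, `tf_args`, and `data_rdd` are assumed to be supplied by the caller (`map_fun` being a user-defined function that takes `(args, ctx)`).

```python
def input_mode_name(feed_via_spark):
    """Return the name of the TFCluster.InputMode member to use (hypothetical helper)."""
    return "SPARK" if feed_via_spark else "TENSORFLOW"


def run_training(sc, map_fun, tf_args, num_executors, num_ps,
                 feed_via_spark, data_rdd=None, num_epochs=1):
    """Launch a TensorFlowOnSpark cluster in the chosen input mode (hypothetical helper)."""
    # Imported lazily so this module loads even where TensorFlowOnSpark is not installed.
    from tensorflowonspark import TFCluster

    mode = getattr(TFCluster.InputMode, input_mode_name(feed_via_spark))
    cluster = TFCluster.run(sc, map_fun, tf_args, num_executors, num_ps,
                            tensorboard=False, input_mode=mode)
    if feed_via_spark:
        # InputMode.SPARK: Spark pushes RDD partitions to the TensorFlow workers.
        cluster.train(data_rdd, num_epochs)
    # InputMode.TENSORFLOW: map_fun reads its own input (for example from HDFS),
    # so nothing is fed from the driver here.
    cluster.shutdown()
```

On a prototype platform such as the four-node VM cluster, the same call could be made with `feed_via_spark=True` and then `feed_via_spark=False` to compare the two modes before moving to physical hardware.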
Keywords
Big data multi-VM platform, Deep learning application, Spark, In-memory computing, Hadoop Map/Reduce