Towards Building A Scalable Data Analytics System On Clouds: An Early Experience On Alicloud

Proceedings of the 2018 IEEE 11th International Conference on Cloud Computing (CLOUD), 2018

Abstract
With the growth of big data, processing systems such as Hadoop and Spark are widely used to handle large-scale data. To spare users the complexity and cost of building their own big data processing systems, cloud providers tend to deploy big data processing tools as cloud services. Typical examples include Amazon EMR, Azure HDInsight and AliCloud E-MapReduce. However, building a cost-efficient system and scaling it remain challenging. In this paper, we conduct a case study on AliCloud E-MapReduce and analyze system performance on local and remote file systems. We compare the scalability of Hadoop and Spark using scale-out and scale-up strategies, respectively. Based on the analysis results, we derive several observations and implications that can help guide performance optimization.
Keywords
scalability evaluation, cloud-based data processing, SaaS