Provenance in Context of Hadoop as a Service (HaaS) - State of the Art and Research Directions

2017 IEEE 10th International Conference on Cloud Computing (CLOUD) (2017)

Abstract
Hadoop as a service (HaaS), also known as Hadoop in the cloud, is a big data analytics framework that stores and analyzes data in the cloud using Hadoop/Spark. In this paper, we discuss the importance of providing provenance capabilities in the context of a Hadoop as a service (HaaS) framework. We first review the state of the art in provenance tracking in three settings: databases and workflow processing, the cloud, and big data analytics frameworks such as Hadoop and Spark. We next identify a number of provenance capabilities that have been developed for databases and workflow processing but for which corresponding solutions have not been developed for Hadoop or Spark. We argue that developing these solutions is important so that a comprehensive provenance-aware Hadoop as a service (HaaS) can be provided in the cloud. The paper ends by identifying some research challenges in developing these provenance capabilities.
Keywords
Hadoop,Provenance,Spark,Blockchain