Practical Cross Program Memoization with KeyChain

2018 IEEE International Conference on Big Data (Big Data), 2018

Abstract

Cross program memoization (CPM) reduces resource utilization and improves response times by enabling data processing systems to reuse previously computed results between programs. An under-explored requirement for implementing CPM in general-purpose data processing systems like Apache Spark is computing identifiers for the results of user-defined functions that remain valid between programs, without degrading system performance when sharing is not possible. In this paper we describe and evaluate a technique, called KeyChain, that computes keys for intermediate and final results of programs with user-defined functions. We use KeyChain to implement CPM in Apache Spark, and show that KeyChain's simple design means it can be easily added to relevant systems, incurs low runtime overheads, and enables heuristic detection of equivalent programs so that CPM can be added to more systems and useful results can be more widely reused.
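
The abstract does not say how KeyChain derives its keys, so the following is only an illustrative sketch of the general idea of chained result keys, not the paper's algorithm. It assumes that a result's key can be formed by hashing the key of its input together with the operator name and a fingerprint of the user-defined function, and that a bytecode hash is an acceptable fingerprint. The names `udf_fingerprint`, `chain_key`, `MemoStore`, and `run_pipeline` are hypothetical, and the example uses plain Python lists rather than Apache Spark.

```python
import hashlib


def udf_fingerprint(fn):
    """Hypothetical fingerprint: hash the function's bytecode, constants, and names.

    Assumption: this ignores closures and globals, and bytecode is not
    stable across Python versions, so it is only a heuristic.
    """
    code = fn.__code__
    h = hashlib.sha256()
    h.update(code.co_code)
    h.update(repr(code.co_consts).encode())
    h.update(repr(code.co_names).encode())
    return h.hexdigest()


def chain_key(parent_key, op_name, fn):
    """Derive a result key from the input's key, the operator, and the UDF."""
    payload = "|".join([parent_key, op_name, udf_fingerprint(fn)])
    return hashlib.sha256(payload.encode()).hexdigest()


class MemoStore:
    """In-memory stand-in for a result store shared between programs."""

    def __init__(self):
        self._results = {}
        self.hits = 0

    def get_or_compute(self, key, compute):
        if key in self._results:
            self.hits += 1
            return self._results[key]
        value = compute()
        self._results[key] = value
        return value


def run_pipeline(store, source_id, data, steps):
    """Run (op_name, fn) steps, reusing any stored result with a matching key."""
    key = hashlib.sha256(source_id.encode()).hexdigest()
    for op_name, fn in steps:
        key = chain_key(key, op_name, fn)
        if op_name == "map":
            data = store.get_or_compute(key, lambda d=data, f=fn: [f(x) for x in d])
        elif op_name == "filter":
            data = store.get_or_compute(key, lambda d=data, f=fn: [x for x in d if f(x)])
        else:
            raise ValueError(f"unsupported operator: {op_name}")
    return key, data


# Two separately written "programs" that share the same first step.
store = MemoStore()
source = [1, 2, 3, 4, 5, 6]
program_a = [("map", lambda x: x * 2), ("filter", lambda x: x > 4)]
program_b = [("map", lambda y: y * 2), ("filter", lambda y: y > 10)]

key_a, result_a = run_pipeline(store, "numbers-v1", source, program_a)
key_b, result_b = run_pipeline(store, "numbers-v1", source, program_b)
```

In this sketch the second program reuses the stored `map` result because both lambdas compile to the same bytecode, while the differing `filter` steps produce different keys. When no key matches, the only extra work is computing the hashes.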

Keywords

practical cross program memoization, CPM, resource utilization, response times, data processing systems, computed results, under-explored requirement, general purpose data, Apache Spark, user-defined functions, degrading system performance, intermediate results, final results, relevant systems, equivalent programs, useful results, KeyChain simple design
