Provenance-aware versioned dataworkspaces

TaPP (2016)

Abstract
Data preparation, curation, and analysis tasks are often exploratory in nature, with analysts incrementally designing workflows that transform, validate, and visualize their input sources. This requires frequent adjustments to data and workflows. Unfortunately, in current data management systems, even small changes can require time- and resource-heavy operations like materialization, manual version management, and re-execution. This added overhead discourages exploration. We present Provenance-aware Versioned Dataworkspaces (PVDs), our vision of a sandboxed environment in which users can apply -- and more importantly, easily undo -- changes to their data and workflows. A PVD keeps a log of the user's operations in a light-weight version graph structure. We describe a model for PVDs that admits efficient automatic refresh, merging of histories, reenactment, and automated conflict resolution. We also highlight the conceptual and technical challenges that need to be overcome to create a practical PVD.
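
To make the idea of an operation log stored in a version graph more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the model from the paper: the names `Version`, `Workspace`, `apply`, `undo`, `history`, and `reenact` are hypothetical, operations are assumed to be pure functions over a list of row dictionaries, and merging, automatic refresh, and conflict resolution are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Rows = List[dict]


@dataclass(frozen=True)
class Version:
    """One node of the version graph (hypothetical structure)."""
    vid: int
    parent: Optional[int]        # None when this is the first operation
    label: str                   # human-readable description of the change
    op: Callable[[Rows], Rows]   # assumed pure: takes rows, returns new rows


class Workspace:
    """Sandbox that logs operations instead of materializing each result."""

    def __init__(self, source: Rows):
        self.source = source
        self.versions: Dict[int, Version] = {}
        self.head: Optional[int] = None  # None means "untouched source"

    def apply(self, label: str, op: Callable[[Rows], Rows]) -> int:
        """Record an operation as a child of the current head."""
        vid = len(self.versions)
        self.versions[vid] = Version(vid, self.head, label, op)
        self.head = vid
        return vid

    def undo(self) -> None:
        """Move the head to its parent; the undone version stays in the graph."""
        if self.head is not None:
            self.head = self.versions[self.head].parent

    def _path_to_head(self) -> List[Version]:
        """Versions from the first operation up to the current head."""
        path = []
        vid = self.head
        while vid is not None:
            version = self.versions[vid]
            path.append(version)
            vid = version.parent
        return path[::-1]

    def history(self) -> List[str]:
        return [v.label for v in self._path_to_head()]

    def reenact(self) -> Rows:
        """Recompute the current state by replaying the logged operations."""
        rows = [dict(r) for r in self.source]  # never mutate the source
        for version in self._path_to_head():
            rows = version.op(rows)
        return rows


# Usage: apply two changes, undo one, then branch off with a different change.
ws = Workspace([{"x": 1}, {"x": -2}, {"x": 3}])
ws.apply("drop negatives", lambda rows: [r for r in rows if r["x"] >= 0])
ws.apply("double x", lambda rows: [{**r, "x": r["x"] * 2} for r in rows])
ws.undo()
ws.apply("negate x", lambda rows: [{**r, "x": -r["x"]} for r in rows])
print(ws.history(), ws.reenact())
```

In this sketch, undo only moves a pointer, so it costs nothing beyond the later replay, and the undone version remains in the graph as a sibling branch. Replaying from the source on every read is the simplest possible choice and would likely be too slow in practice; a real system would presumably cache or incrementally refresh intermediate results.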