Demonstration of Geyser: Provenance Extraction and Applications over Data Science Scripts

SIGMOD/PODS '23: Companion of the 2023 International Conference on Management of Data (2023)

Abstract
As enterprises have started developing and deploying complicated data science workloads at scale, the need for mechanisms that enable enterprise-grade data science (e.g., compliance or auditing) has become more pronounced. In this paper, we present Geyser, an extensible provenance system for data science workloads that can be used as a foundation for enterprise-grade data science. Our system supports both static and dynamic provenance, over a wide range of data science scripts, driven by a knowledge base of data science APIs. We demonstrate the wide applicability of the system using various industrial applications: provenance extraction, model compliance, model linting, model versioning, and poisoning detection. A video of the demonstration is available at https://aka.ms/geyserdemo.