谷歌Chrome浏览器插件
订阅小程序
在清言上使用

LINX: A Language Driven Generative System for Goal-Oriented Automated Data Exploration

Tavor Lipman, Tova Milo, Amit Somech, Tomer Wolfson, Oz Zafar

CoRR(2024)

引用 0|浏览0
暂无评分
摘要
Data exploration is a challenging process in which users examine a dataset by iteratively employing a series of queries. While in some cases the user explores a new dataset to become familiar with it, more often, the exploration process is conducted with a specific analysis goal or question in mind. To assist users in exploring a new dataset, Automated Data Exploration (ADE) systems have been devised in previous work. These systems aim to auto-generate a full exploration session, containing a sequence of queries that showcase interesting elements of the data. However, existing ADE systems are often constrained by a predefined objective function, thus always generating the same session for a given dataset. Therefore, their effectiveness in goal-oriented exploration, in which users need to answer specific questions about the data, are extremely limited. To this end, this paper presents LINX, a generative system augmented with a natural language interface for goal-oriented ADE. Given an input dataset and an analytical goal described in natural language, LINX generates a personalized exploratory session that is relevant to the user's goal. LINX utilizes a Large Language Model (LLM) to interpret the input analysis goal, and then derive a set of specifications for the desired output exploration session. These specifications are then transferred to a novel, modular ADE engine based on Constrained Deep Reinforcement Learning (CDRL), which can adapt its output according to the specified instructions. To validate LINX's effectiveness, we introduce a new benchmark dataset for goal-oriented exploration and conduct an extensive user study. Our analysis underscores LINX's superior capability in producing exploratory notebooks that are significantly more relevant and beneficial than those generated by existing solutions, including ChatGPT, goal-agnostic ADE, and commercial systems.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要