NL2Formula: Generating Spreadsheet Formulas from Natural Language Queries
Conference of the European Chapter of the Association for Computational Linguistics(2024)
Abstract
Writing formulas on spreadsheets, such as Microsoft Excel and Google Sheets,
is a widespread practice among users performing data analysis. However,
crafting formulas on spreadsheets remains a tedious and error-prone task for
many end-users, particularly when dealing with complex operations. To alleviate
the burden associated with writing spreadsheet formulas, this paper introduces
a novel benchmark task called NL2Formula, with the aim to generate executable
formulas that are grounded on a spreadsheet table, given a Natural Language
(NL) query as input. To accomplish this, we construct a comprehensive dataset
consisting of 70,799 paired NL queries and corresponding spreadsheet formulas,
covering 21,670 tables and 37 types of formula functions. We realize the
NL2Formula task by providing a sequence-to-sequence baseline implementation
called fCoder. Experimental results validate the effectiveness of fCoder,
demonstrating its superior performance compared to the baseline models.
Furthermore, we also compare fCoder with an initial GPT-3.5 model (i.e.,
text-davinci-003). Lastly, through in-depth error analysis, we identify
potential challenges in the NL2Formula task and advocate for further
investigation.
MoreTranslated text
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined