Automating the Enterprise with Foundation Models
CoRR (2024)
Abstract
Automating enterprise workflows could unlock $4 trillion/year in productivity
gains. Despite being of interest to the data management community for decades,
the ultimate vision of end-to-end workflow automation has remained elusive.
Current solutions rely on process mining and robotic process automation (RPA),
in which a bot is hard-coded to follow a set of predefined rules for completing
a workflow. Through case studies of a hospital and a large B2B enterprise, we
find that the adoption of RPA has been inhibited by high set-up costs (12-18
months), unreliable execution (60% initial accuracy), and burdensome
maintenance (requiring multiple FTEs). Multimodal foundation models (FMs) such
as GPT-4 offer a promising new approach for end-to-end workflow automation
given their generalized reasoning and planning abilities. To study these
capabilities, we propose ECLAIR, a system to automate enterprise workflows with
minimal human supervision. We conduct initial experiments showing that
multimodal FMs can address the limitations of traditional RPA with (1)
near-human-level understanding of workflows (93% accuracy on a workflow
understanding task) and (2) instant set-up with minimal technical barrier
(based solely on a natural language description of a workflow, ECLAIR achieves
end-to-end completion rates of 40%). We identify human-AI collaboration,
validation, and self-improvement as open challenges, and suggest ways they can
be solved with data management techniques. Code is available at:
https://github.com/HazyResearch/eclair-agents