AutoDroid: LLM-powered Task Automation in Android
arXiv (2023)
Abstract
Mobile task automation is an attractive technique that aims to enable
voice-based hands-free user interaction with smartphones. However, existing
approaches suffer from poor scalability due to the limited language
understanding ability and the non-trivial manual efforts required from
developers or end-users. The recent advance of large language models (LLMs) in
language understanding and reasoning inspires us to rethink the problem from a
model-centric perspective, where task preparation, comprehension, and execution
are handled by a unified language model. In this work, we introduce AutoDroid,
a mobile task automation system capable of handling arbitrary tasks on any
Android application without manual efforts. The key insight is to combine the
commonsense knowledge of LLMs and domain-specific knowledge of apps through
automated dynamic analysis. The main components include a functionality-aware
UI representation method that bridges the UI with the LLM, exploration-based
memory injection techniques that augment the app-specific domain knowledge of
LLM, and a multi-granularity query optimization module that reduces the cost of
model inference. We integrate AutoDroid with off-the-shelf LLMs including
online GPT-4/GPT-3.5 and on-device Vicuna, and evaluate its performance on a
new benchmark for memory-augmented Android task automation with 158 common
tasks. The results demonstrated that AutoDroid is able to precisely generate
actions with an accuracy of 90.9% and complete tasks with a success rate of
71.3%. The benchmark suites and source code of AutoDroid will be released at
https://autodroid-sys.github.io/.