Embodied Question Answering via Multi-LLM Systems
arXiv (2024)
Abstract
Embodied Question Answering (EQA) is an important problem in which an
agent explores an environment to answer user queries. In the existing
literature, EQA has exclusively been studied in single-agent scenarios, where
exploration can be time-consuming and costly. In this work, we consider EQA in
a multi-agent framework in which multiple large language model (LLM) based
agents independently answer queries about a household environment. To
generate one answer for each query, we use the individual responses to train a
Central Answer Model (CAM) that aggregates them into a single robust answer. Using
CAM, we observe a 50% higher EQA accuracy than with aggregation
methods for LLM ensembles, such as voting schemes and debates. CAM does not
require any form of agent communication, which avoids the associated
costs. We ablate CAM with various nonlinear (neural network, random forest,
decision tree, XGBoost) and linear (logistic regression classifier, SVM)
algorithms. Finally, we present a feature importance analysis for CAM via
permutation feature importance (PFI), quantifying CAM's reliance on each
independent agent and query context.
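
The aggregation idea in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the data is synthetic, the per-agent reliabilities in `RELIABILITY` are assumed values, and the hand-written logistic regression stands in for just one of the linear options the abstract lists. The query-context features mentioned in the abstract are omitted. The sketch trains an aggregator on binary agent answers, compares it with majority voting, and estimates permutation feature importance as the mean accuracy drop when one agent's answers are shuffled.

```python
import math
import random

random.seed(0)

# Hypothetical setup: three agents answer yes/no (1/0) queries.
# Assumed per-agent reliabilities; these are not figures from the paper.
RELIABILITY = [0.9, 0.55, 0.55]

def make_dataset(n):
    """Return (features, labels); each feature row holds one answer per agent."""
    X, y = [], []
    for _ in range(n):
        truth = random.randint(0, 1)
        row = [truth if random.random() < r else 1 - truth for r in RELIABILITY]
        X.append(row)
        y.append(truth)
    return X, y

def majority_vote(row):
    """Baseline aggregation: answer 1 if more than half the agents say 1."""
    return 1 if sum(row) * 2 > len(row) else 0

def train_logreg(X, y, epochs=200, lr=0.5):
    """Fit logistic regression by batch gradient descent on +/-1 encoded answers."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        gw = [0.0] * len(w)
        gb = 0.0
        for row, label in zip(X, y):
            z = b + sum(wi * (2 * xi - 1) for wi, xi in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - label
            for j, xi in enumerate(row):
                gw[j] += err * (2 * xi - 1)
            gb += err
        w = [wi - lr * g / len(X) for wi, g in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b

def predict(model, row):
    """Aggregate one row of agent answers into a single 0/1 answer."""
    w, b = model
    z = b + sum(wi * (2 * xi - 1) for wi, xi in zip(w, row))
    return 1 if z > 0 else 0

def accuracy(predict_fn, X, y):
    return sum(predict_fn(r) == t for r, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, feature, repeats=10):
    """Mean accuracy drop when one agent's column is shuffled across queries."""
    base = accuracy(lambda r: predict(model, r), X, y)
    drops = []
    for _ in range(repeats):
        col = [r[feature] for r in X]
        random.shuffle(col)
        Xp = [r[:feature] + [c] + r[feature + 1:] for r, c in zip(X, col)]
        drops.append(base - accuracy(lambda r: predict(model, r), Xp, y))
    return sum(drops) / repeats

X_train, y_train = make_dataset(1000)
X_test, y_test = make_dataset(1000)
model = train_logreg(X_train, y_train)

vote_acc = accuracy(majority_vote, X_test, y_test)
cam_acc = accuracy(lambda r: predict(model, r), X_test, y_test)
importances = [permutation_importance(model, X_test, y_test, j) for j in range(3)]

print("majority vote accuracy:", vote_acc)
print("learned aggregator accuracy:", cam_acc)
print("permutation importance per agent:", importances)
```

In this toy setting the learned aggregator can outperform majority voting because it learns to weight the more reliable agent more heavily, and the permutation importances reflect that reliance. No agent communication is needed: the aggregator only consumes each agent's final answer.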