Grounding Language Plans in Demonstrations Through Counter-Factual Perturbations
ICLR 2024
Abstract
Grounding the abstract knowledge captured by Large Language Models (LLMs) in
physical domains remains a pivotal yet unsolved problem. Whereas prior works
have largely focused on leveraging LLMs for generating abstract plans in symbolic
spaces, this work uses LLMs to guide the learning of structures and constraints
in robot manipulation tasks. Specifically, we borrow from manipulation plan-
ning literature the concept of mode families, defining specific types of motion
constraints among sets of objects, to serve as an intermediate layer that connects
high-level language representations with low-level physical trajectories. By lo-
cally perturbing a small set of successful human demonstrations, we augment the
dataset with additional successful executions as well as counterfactuals that fail
the task. Our explanation-based learning framework trains neural network-based
classifiers to differentiate successful task executions from failures and as a by-product
learns classifiers that ground low-level states into mode families without dense
labeling. This further enables us to learn structured policies for the target task.
Experimental validation in both 2D continuous-space and robotic manipulation
environments demonstrates the robustness of our mode-based
imitation methods under external perturbations.
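To make the abstract's pipeline concrete, here is a minimal toy sketch (not the paper's actual method) of its two key steps: locally perturbing a demonstration to generate both additional successes and counterfactual failures, then training a classifier to separate them. The straight-line demonstration, the Gaussian perturbation scheme, the goal-reaching success predicate, and the hand-picked features are all hypothetical stand-ins for the paper's robot-manipulation setup.

```python
import numpy as np

rng = np.random.default_rng(0)
GOAL = np.array([1.0, 1.0])  # hypothetical 2D task goal

def demo_trajectory(n=20):
    """A straight-line 'demonstration' from (0, 0) to the goal."""
    t = np.linspace(0.0, 1.0, n)[:, None]
    return t * GOAL

def perturb(traj, scale):
    """Locally perturb each waypoint with Gaussian noise (toy scheme)."""
    return traj + rng.normal(0.0, scale, traj.shape)

def succeeds(traj, tol=0.2):
    """Toy success predicate: the final waypoint lands near the goal."""
    return float(np.linalg.norm(traj[-1] - GOAL) < tol)

def features(traj):
    """Simple hand-picked trajectory features for the classifier."""
    return np.array([np.linalg.norm(traj[-1] - GOAL),        # endpoint error
                     np.abs(np.diff(traj, axis=0)).sum()])   # path roughness

# Augment one demonstration into successes and counterfactual failures.
scales = rng.uniform(0.01, 0.6, 200)
trajs = [perturb(demo_trajectory(), s) for s in scales]
X = np.stack([features(t) for t in trajs])
y = np.array([succeeds(t) for t in trajs])
X = (X - X.mean(0)) / X.std(0)  # standardize features

# Logistic-regression success classifier trained by gradient descent.
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    grad = p - y
    w -= 0.5 * X.T @ grad / len(y)
    b -= 0.5 * grad.mean()

acc = ((p > 0.5) == y).mean()
```

In the paper this idea is richer: the classifier is a neural network over low-level states, and its learned decision structure is what grounds states into mode families without dense labels; the sketch above only shows the success-versus-counterfactual training signal itself.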
Key words
Grounding LLM, Learning Mode Abstractions for Manipulation, Learning from Demonstration, Robotics, Task and Motion Planning