Sample-Efficient Learning for Industrial Assembly using Qgraph-bounded DDPG.

IROS (2020)

Abstract
Recent progress in deep reinforcement learning has enabled agents to autonomously learn complex control strategies from scratch. Model-free approaches like Deep Deterministic Policy Gradients (DDPG) seem promising for applications with intricate dynamics, such as contact-rich manipulation tasks. However, these methods typically require large amounts of training data or meticulous hyperparameter tuning, limiting their usefulness for real-world robotics applications. In this paper, we evaluate and benchmark our recently proposed approach for improving model-free reinforcement learning with DDPG through Qgraph-based bounds in temporal difference learning. We directly apply the algorithm to a challenging real-world industrial insertion task and assess its performance (see https://youtu.be/Z_GcNbCWE-E). Empirical results show that the insertion task can be learned despite significant frictional forces and uncertainty, even in sparse-reward settings. We present an in-depth comparison based on a large number of experiments and demonstrate the advantages of Qgraph-bounded DDPG: learning is significantly faster, more robust to poor hyperparameter choices, and requires less memory. Lastly, the presented results extend the current theoretical understanding of the link between data graph structure and soft divergence in DDPG.
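The abstract does not spell out how the Qgraph-based bounds enter the temporal difference update, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes deterministic transitions, hashable state identifiers, and that the graph value is used as a lower bound on the TD target. The names `graph_lower_bounds`, `bounded_td_target`, and `v_min` are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def graph_lower_bounds(transitions, gamma=0.99, v_min=0.0, sweeps=100):
    """Value iteration restricted to transitions that were actually observed.

    transitions: list of (state, action, reward, next_state, done) tuples.
    v_min: assumed pessimistic return, used where the graph has no
           outgoing edge for a state (hypothetical parameter).
    Returns a dict mapping (state, action) to a Q-value computed on the graph.
    """
    # Build the data graph: each state maps to its observed outgoing edges.
    out_edges = defaultdict(list)
    for s, a, r, s_next, done in transitions:
        out_edges[s].append((a, r, s_next, done))

    q = {}
    for _ in range(sweeps):
        for s, edges in out_edges.items():
            for a, r, s_next, done in edges:
                if done:
                    q[(s, a)] = r
                    continue
                # Best value reachable from s_next using observed actions only.
                successors = [
                    q.get((s_next, a2), v_min)
                    for a2, *_ in out_edges.get(s_next, [])
                ]
                q[(s, a)] = r + gamma * max(successors, default=v_min)
    return q


def bounded_td_target(r, done, q_next, lower_bound, gamma=0.99):
    """Standard one-step TD target, clipped from below by the graph value."""
    td = r if done else r + gamma * q_next
    return max(td, lower_bound)


# Toy sparse-reward chain: reward 1.0 only on the final transition.
chain = [
    (0, "right", 0.0, 1, False),
    (1, "right", 0.0, 2, False),
    (2, "right", 1.0, 3, True),
]
bounds = graph_lower_bounds(chain, gamma=0.9)
target = bounded_td_target(0.0, False, q_next=0.1,
                           lower_bound=bounds[(0, "right")], gamma=0.9)
```

In this toy chain the graph propagates the sparse reward back to the first state, so an underestimating critic (here `q_next=0.1`) has its target raised to the graph value instead of waiting for many bootstrapping steps. In an actual DDPG setup, `q_next` would come from the target critic evaluated at the target actor's action.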
Key words
deep deterministic policy gradients, intricate dynamics, contact-rich manipulation tasks, hyperparameter tuning, real-world robotics applications, model-free reinforcement, temporal difference learning, real-world industrial insertion task, significant frictional forces, uncertainty, Qgraph-bounded DDPG, learning process, sample-efficient learning, industrial assembly, deep reinforcement learning, complex control strategies, model-free approaches, data graph structure