Improving Generalization of Reinforcement Learning Using a Bilinear Policy Network.

Fen Fang, Wenyu Liang, Yan Wu, Qianli Xu, Joo-Hwee Lim

ICIP (2022)

Abstract

In deep reinforcement learning (DRL), an agent is usually trained on a set of seen environments by optimizing a policy network. The resulting policy, however, often fails to generalize to unseen environments, even when the environmental variations are minor. This is partly because the policy network cannot effectively learn representations of the subtle visual differences among highly similar states. A bilinear model, which contains two feature extractors, allows pairwise feature interactions in a translationally invariant manner, which makes it particularly useful for recognizing subtle differences among highly similar states. In this work, a bilinear policy network is therefore employed to enhance representation learning and thus improve the generalization of DRL. The proposed bilinear policy network is tested on various DRL tasks, including a control task (path planning for active object detection) and Grid World, an AI game task. The test results show that the proposed network improves the generalization of DRL.
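
The pairwise feature interaction described in the abstract can be illustrated with a minimal sketch of bilinear pooling. This is not the authors' implementation: the abstract does not specify the architecture, so the function names (`bilinear_pool`, `policy_logits`), the signed square root and L2 normalization steps (common practice in bilinear models), and the linear action head are all assumptions made for illustration. The two feature maps are assumed to come from two separate feature extractors and are given as one feature vector per spatial location.

```python
import math


def bilinear_pool(feats_a, feats_b):
    """Pool two feature maps into a single bilinear descriptor.

    feats_a, feats_b: lists of feature vectors, one per spatial location,
    with both lists covering the same locations in the same order.
    (Hypothetical helper, not taken from the paper.)
    """
    if not feats_a or len(feats_a) != len(feats_b):
        raise ValueError("feature maps must cover the same non-empty set of locations")
    ca, cb = len(feats_a[0]), len(feats_b[0])

    # Sum the outer product of the two feature vectors over all locations.
    # Summing over locations discards position, which is why the descriptor
    # does not change when both maps are shifted together.
    pooled = [0.0] * (ca * cb)
    for va, vb in zip(feats_a, feats_b):
        for i in range(ca):
            for j in range(cb):
                pooled[i * cb + j] += va[i] * vb[j]

    n = len(feats_a)
    # Assumed post-processing: signed square root, then L2 normalization.
    z = [math.copysign(math.sqrt(abs(p / n)), p) for p in pooled]
    norm = math.sqrt(sum(v * v for v in z)) or 1.0
    return [v / norm for v in z]


def policy_logits(feats_a, feats_b, weights, bias):
    """Map the bilinear descriptor to one logit per action (assumed linear head)."""
    z = bilinear_pool(feats_a, feats_b)
    return [sum(w * v for w, v in zip(row, z)) + b for row, b in zip(weights, bias)]
```

In this sketch, reordering the spatial locations of both feature maps in the same way leaves the descriptor unchanged, which is a simple way to see the position-independent behavior the abstract refers to.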

Key words

bilinear policy network, reinforcement learning, generalization
