CARL: A Synergistic Framework for Causal Reinforcement Learning

IEEE Access (2023)

Abstract
Causal Reinforcement Learning (CRL) is an emerging field that integrates two areas essential to the development of artificial intelligence. Existing work in the area has shown how causality can help mitigate some of the limitations of reinforcement learning (RL), such as data inefficiency, lack of interpretability, and long learning times. However, how to use reinforcement learning to support causal discovery (CD) has so far been less explored. In this article, we introduce CARL, a Causality-Aware Reinforcement Learning framework for simultaneously learning and using causal models to speed up policy learning in online Markov decision process (MDP) settings. In a synergistic way, our method alternates between: (i) RL for CD, where it promotes the selection of actions that yield better causal models in fewer episodes than traditional ways of collecting data in RL; (ii) CD, where a score-based algorithm is used to learn causal models; and (iii) RL using CD, where the learned models are used to select actions that speed up learning of the optimal policy by reducing the number of interactions with the environment. Experiments in simulated environments show that our method achieves better results in policy learning than traditional model-free and model-based algorithms while also learning the underlying causal models. We also demonstrate how the learned causal models can be directly transferred to a similar task of greater complexity, significantly reducing the number of episodes required to learn an optimal policy. Finally, we verify the method's scalability to high-dimensional states, where the action-value function must be represented with deep neural networks.
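The abstract describes an alternating three-phase loop (RL for CD, CD, RL using CD). The sketch below is a minimal, hypothetical Python illustration of that loop under simplifying assumptions; the environment, the exploratory policy, the score-based CD step, and all names (ToyEnv, score_based_cd, causal_guided_action, etc.) are placeholders, not the authors' implementation.

```python
# Hypothetical sketch of the alternating CARL-style loop; all components
# are toy stand-ins, not the method from the paper.
import random
from collections import defaultdict

class ToyEnv:
    """Tiny MDP stub: 3 discrete states, 2 actions, random transitions."""
    def reset(self):
        self.state = 0
        return self.state
    def step(self, action):
        next_state = random.randrange(3)
        reward = 1.0 if next_state == 2 else 0.0
        done = next_state == 2
        self.state = next_state
        return next_state, reward, done

def exploratory_action(causal_model, state):
    # (i) RL for CD: pick actions expected to be informative for the causal
    # model; uniformly random here as a stand-in.
    return random.randrange(2)

def score_based_cd(transitions):
    # (ii) CD: a score-based structure-learning step would go here; this
    # stub only counts observed (state, action) -> next_state frequencies.
    model = defaultdict(lambda: defaultdict(int))
    for s, a, s2 in transitions:
        model[(s, a)][s2] += 1
    return model

def causal_guided_action(causal_model, q_values, state, epsilon=0.1):
    # (iii) RL using CD: choose actions the causal model suggests lead toward
    # rewarding states; this stub falls back to epsilon-greedy over Q-values.
    if random.random() < epsilon or not q_values[state]:
        return random.randrange(2)
    return max(q_values[state], key=q_values[state].get)

env, transitions = ToyEnv(), []
q_values = defaultdict(lambda: defaultdict(float))
causal_model, alpha, gamma = None, 0.5, 0.95

for episode in range(200):
    s, done = env.reset(), False
    while not done:
        # Alternate phases: explore for CD early, exploit the causal model later.
        if causal_model is None or episode % 10 == 0:
            a = exploratory_action(causal_model, s)
        else:
            a = causal_guided_action(causal_model, q_values, s)
        s2, r, done = env.step(a)
        transitions.append((s, a, s2))
        # Standard Q-learning update on the observed transition.
        best_next = max(q_values[s2].values(), default=0.0)
        q_values[s][a] += alpha * (r + gamma * best_next - q_values[s][a])
        s = s2
    causal_model = score_based_cd(transitions)  # refresh the causal model each episode
```

In the sketch, the three phases share the same stream of environment interactions: exploratory episodes feed the CD step, and the resulting model then biases action selection in later episodes, which is the synergy the abstract refers to.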
Keywords
causal reinforcement learning,synergistic framework