Epistemic Side Effects: An AI Safety Problem

AAMAS '23: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems (2023)

Abstract
AI safety research has investigated the problem of negative side effects -- undesirable changes made by AI systems in pursuit of an underspecified objective. However, the focus has been on physical side effects, such as a robot breaking a vase while moving (when the objective makes no mention of the vase). In this paper we introduce the notion of epistemic side effects, which are side effects on the knowledge or beliefs of agents. Epistemic side effects are most pertinent in a (partially observable) multiagent setting. We show that we can extend an existing approach to avoiding (physical) side effects in reinforcement learning to also avoid some epistemic side effects in certain cases. Nonetheless, avoiding negative epistemic side effects remains an important challenge, and we identify some key research problems.
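The abstract mentions extending an existing reinforcement-learning approach to avoiding (physical) side effects. One well-known family of such approaches penalizes actions that change the agent's attainable value on auxiliary goals relative to doing nothing. The sketch below is purely illustrative of that penalty-based idea and is not taken from the paper; the function names, the mean-absolute-difference penalty, and the `lam` weight are all assumptions for this example.

```python
# Illustrative sketch of a penalty-based side-effect approach (an assumption,
# not the paper's method): shape the task reward by penalizing changes in the
# agent's attainable value on auxiliary goals, measured against a no-op baseline.

def side_effect_penalty(aux_q_action, aux_q_noop):
    """Mean absolute change in auxiliary attainable values caused by an action,
    compared with taking no action at all."""
    diffs = [abs(a - n) for a, n in zip(aux_q_action, aux_q_noop)]
    return sum(diffs) / len(diffs)

def shaped_reward(task_reward, aux_q_action, aux_q_noop, lam=0.1):
    """Task reward minus a weighted penalty for disturbing auxiliary goals."""
    return task_reward - lam * side_effect_penalty(aux_q_action, aux_q_noop)
```

An action that leaves the auxiliary values unchanged incurs no penalty, while one that disrupts them is discouraged in proportion to `lam`. The paper's contribution, per the abstract, is to ask whether this kind of mechanism can also cover side effects on other agents' knowledge or beliefs.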