Act As You Learn: Adaptive Decision-Making in Non-Stationary Markov Decision Processes

AAMAS '24: Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems (2024)

Abstract
A fundamental (and largely open) challenge in sequential decision-making is dealing with non-stationary environments, where exogenous environmental conditions change over time. Such problems are traditionally modeled as non-stationary Markov decision processes (NSMDPs). However, existing approaches for decision-making in NSMDPs have two major shortcomings: first, they assume that the updated environmental dynamics at the current time are known (although future dynamics can change); and second, planning is largely pessimistic, i.e., the agent acts “safely” to account for the non-stationary evolution of the environment. We argue that both these assumptions are invalid in practice – updated environmental conditions are rarely known, and as the agent interacts with the environment, it can learn about the updated dynamics and avoid being pessimistic, at least in states whose dynamics it is confident about. We present a heuristic search algorithm called Adaptive Monte Carlo Tree Search (ADA-MCTS) that addresses these challenges. We show that the agent can learn the updated dynamics of the environment over time and then act as it learns, i.e., if the agent is in a region of the state space about which it has updated knowledge, it can avoid being pessimistic. To quantify “updated knowledge,” we disintegrate the aleatoric and epistemic uncertainty in the agent's updated belief and show how the agent can use these estimates for decision-making. We compare the proposed approach with multiple state-of-the-art approaches in decision-making across multiple well-established open-source problems and empirically show that our approach is faster and highly adaptive without sacrificing safety.
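
To make the idea of separating aleatoric from epistemic uncertainty concrete, here is a minimal sketch of one common decomposition. It is an assumption for illustration and is not taken from the paper: an ensemble of learned transition models each predicts a distribution over next states, the average entropy of those predictions is treated as aleatoric uncertainty, and the extra entropy caused by disagreement between ensemble members is treated as epistemic uncertainty. The function names, the `threshold` value, and the "pessimistic"/"learned" labels are all hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def decompose_uncertainty(ensemble_predictions):
    """Split predictive uncertainty into (aleatoric, epistemic).

    ensemble_predictions: one next-state distribution per ensemble member,
    all for the same (state, action) pair. Hypothetical interface.
    """
    n = len(ensemble_predictions)
    k = len(ensemble_predictions[0])
    # Average prediction across ensemble members.
    mean = [sum(p[i] for p in ensemble_predictions) / n for i in range(k)]
    total = entropy(mean)
    # Aleatoric part: noise every member agrees is inherent to the transition.
    aleatoric = sum(entropy(p) for p in ensemble_predictions) / n
    # Epistemic part: what remains, i.e. disagreement between members.
    epistemic = max(0.0, total - aleatoric)
    return aleatoric, epistemic


def choose_mode(ensemble_predictions, threshold=0.1):
    """Plan pessimistically only where the learned model is still unsure.

    The threshold is an arbitrary illustrative value.
    """
    _, epistemic = decompose_uncertainty(ensemble_predictions)
    return "pessimistic" if epistemic > threshold else "learned"


# Members agree on a noisy transition: high aleatoric, no epistemic uncertainty.
agree = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
# Members contradict each other: most of the uncertainty is epistemic.
disagree = [[0.99, 0.01], [0.01, 0.99]]

print(choose_mode(agree))
print(choose_mode(disagree))
```

In this sketch, a state with inherently noisy but well-understood dynamics does not trigger pessimistic planning, whereas a state where the ensemble members disagree does. That matches the abstract's description of avoiding pessimism only in regions the agent has updated knowledge about.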