Learning within a Markovian Environment

msra(2008)

Abstract
We investigate the behavior of two learning rules, Stochastic Best Response (SBR) and Replicator Dynamics, in a model with aggregate and time-correlated shocks to payoffs. The main difference between the two rules is that under SBR corners are not absorbing. We study a setting with two actions and many states of nature, where the transition between states follows a Markov chain. We find that SBR converges to behavior similar to probability matching. Replicator Dynamics, on the other hand, selects the optimal action only if the average payoffs of the two actions are different enough.
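To make the contrast between probability matching and selecting the optimal action concrete, the following is a minimal sketch and not the paper's model. The transition matrix `P`, the payoff table `payoff`, and the helper `stationary_distribution` are all hypothetical, chosen only for illustration. It computes the long-run frequency of each state of a two-state Markov chain, then compares the average payoff of an agent who mixes over actions in proportion to how often each action is best with that of an agent who always plays the action with the higher average payoff.

```python
import numpy as np

def stationary_distribution(P):
    """Stationary distribution of a row-stochastic matrix P (assumed ergodic)."""
    eigvals, eigvecs = np.linalg.eig(P.T)
    idx = np.argmin(np.abs(eigvals - 1.0))   # eigenvalue closest to 1
    pi = np.real(eigvecs[:, idx])
    return pi / pi.sum()                     # normalize to a probability vector

# Hypothetical two-state chain: P[s, s'] is the probability of moving from s to s'.
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])

# Hypothetical payoffs: payoff[state, action]; action a is best in state a.
payoff = np.array([[1.0, 0.0],
                   [0.0, 1.0]])

pi = stationary_distribution(P)              # long-run state frequencies
avg_payoff = pi @ payoff                     # long-run average payoff of each action

# Probability matching: play each action as often as it is the best one,
# without conditioning on the current state.
best_in_state = payoff.argmax(axis=1)
matching = np.bincount(best_in_state, weights=pi, minlength=payoff.shape[1])

matching_value = matching @ avg_payoff       # average payoff under matching
optimal_value = avg_payoff.max()             # average payoff of the best pure action

print("state frequencies:", pi)
print("matching mix:", matching)
print("payoff under matching:", matching_value)
print("payoff of best pure action:", optimal_value)
```

With these illustrative numbers the states occur with frequencies 0.75 and 0.25, so matching earns 0.75 · 0.75 + 0.25 · 0.25 = 0.625 on average, while always playing the first action earns 0.75.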
Key words
probability matching, Markov chains, adaptive learning, stochastic best response, replicator dynamics, mixed strategy, reinforcement learning