Data-Driven Knowledge Transfer in Batch Q^* Learning
SSRN Electronic Journal (2024)
Abstract
In data-driven decision-making in marketing, healthcare, and education, it is
desirable to utilize a large amount of data from existing ventures to navigate
high-dimensional feature spaces and address data scarcity in new ventures. We
explore knowledge transfer in dynamic decision-making by concentrating on batch
stationary environments and formally defining task discrepancies through the
lens of Markov decision processes (MDPs). We propose a Transferred Fitted
Q-Iteration framework with general function approximation, which enables
direct estimation of the optimal action-value function Q^* using both
target and source data. We establish the relationship between statistical
performance and MDP task discrepancy under sieve approximation, shedding light
on the impact of source and target sample sizes and task discrepancy on the
effectiveness of knowledge transfer. We show, both theoretically and
empirically, that the final learning error of the Q^* function improves
significantly on the single-task rate.
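
For readers unfamiliar with the objects named in the abstract, the following is a sketch of the standard single-task setting only, not the paper's transferred estimator. The notation is assumed for illustration: \gamma is a discount factor, \mathcal{F} is a function class (for example a sieve space), and (s_i, a_i, r_i, s'_i), i = 1, ..., n, are batch transitions from the target task. The first line is the Bellman optimality equation that characterizes Q^*, and the second is the textbook fitted Q-iteration regression step.

```latex
% Bellman optimality equation for the optimal action-value function
Q^*(s, a) = \mathbb{E}\left[ r + \gamma \max_{a'} Q^*(s', a') \,\middle|\, s, a \right]

% Standard (single-task) fitted Q-iteration: regress the Bellman target on (s, a)
\hat{Q}^{(k+1)} = \arg\min_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n}
  \left( r_i + \gamma \max_{a'} \hat{Q}^{(k)}(s'_i, a') - f(s_i, a_i) \right)^2
```

The abstract states that the transferred variant also uses source-task data in this estimation; how the source data enters the regression is not described in the abstract and is not reproduced here.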