π2vec: Policy Representations with Successor Features
CoRR(2023)
Abstract
This paper describes π2vec, a method for representing behaviors of
black box policies as feature vectors. The policy representations capture how
the statistics of foundation model features change in response to the policy
behavior in a task agnostic way, and can be trained from offline data, allowing
them to be used in offline policy selection. This work provides a key piece of
a recipe for fusing together three modern lines of research: Offline policy
evaluation as a counterpart to offline RL, foundation models as generic and
powerful state representations, and efficient policy selection in resource
constrained environments.
MoreTranslated text
AI Read Science
Must-Reading Tree
Example
![](https://originalfileserver.aminer.cn/sys/aminer/pubs/mrt_preview.jpeg)
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined