Name: Reward Mapping for Transfer in Long-Lived Agents
Start: 2013-12-08T14:00:00-0800
End: 2013-12-08T18:00:00-0800


Reward Mapping for Transfer in Long-Lived Agents

We consider how to transfer knowledge from previous tasks to a current task in long-lived and bounded agents that must solve a sequence of MDPs over a finite lifetime. A novel aspect of our transfer approach is that we reuse reward functions. While this may seem counterintuitive, we build on the insight of recent work on the optimal rewards problem that guiding an agent's behavior with reward functions other than the task-specifying reward function can help overcome computational bounds of the agent. Specifically, we use good guidance reward functions learned on previous tasks in the sequence to incrementally train a reward mapping function that maps task-specifying reward functions into good initial guidance reward functions for subsequent tasks. We demonstrate that our approach can substantially improve the agent's performance relative to other approaches, including an approach that transfers policies.
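The incremental training loop described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: reward functions are represented as short parameter vectors, the mapping is linear, training is a stochastic gradient step on squared error, and `learned_guidance_for` is a hypothetical stand-in for whatever guidance reward the agent ends up with after working on a task.

```python
from typing import List


class LinearRewardMapping:
    """Hypothetical linear map from task-reward parameters to guidance-reward parameters."""

    def __init__(self, in_dim: int, out_dim: int, lr: float = 0.1) -> None:
        self.lr = lr
        # One weight row per guidance-reward parameter, initialised to zero.
        self.weights = [[0.0] * in_dim for _ in range(out_dim)]

    def predict(self, task_params: List[float]) -> List[float]:
        """Propose initial guidance-reward parameters for a new task."""
        return [sum(w * x for w, x in zip(row, task_params)) for row in self.weights]

    def update(self, task_params: List[float], learned_guidance: List[float]) -> None:
        """One gradient step pulling the prediction toward the guidance reward learned on this task."""
        predicted = self.predict(task_params)
        for row, pred, target in zip(self.weights, predicted, learned_guidance):
            error = pred - target
            for j, x in enumerate(task_params):
                row[j] -= self.lr * error * x


def learned_guidance_for(task_params: List[float]) -> List[float]:
    # Assumption: stand-in for the guidance reward the agent would learn on the task.
    # A fixed linear rule is used so the sketch has something recoverable to fit.
    return [2.0 * task_params[0] + 1.0 * task_params[1]]


# A toy sequence of tasks, each described by a two-parameter task-specifying reward.
task_sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]] * 100

mapping = LinearRewardMapping(in_dim=2, out_dim=1)
for task in task_sequence:
    initial_guidance = mapping.predict(task)   # initial guidance reward for this task
    final_guidance = learned_guidance_for(task)  # guidance reward after working on the task
    mapping.update(task, final_guidance)       # incremental training of the mapping

prediction_for_new_task = mapping.predict([1.0, 1.0])
```

In this toy setting the mapping's proposals for later tasks approach the guidance rewards the stand-in produces, which is the sense in which earlier tasks inform the starting point for subsequent ones.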

Speakers

posterid Sun62
location Poster# Sun62

NIPS 2013

Xiaoxiao Guo

Richard Lewis

Satinder Singh
