Transfer Reinforcement Learning across Environment Dynamics with Multiple Advisors
Host Publication: BNAIC 2019
Authors: H. Plisnier, D. Steckelmacher, D. Roijers and A. Nowé
Publication Date: Nov. 2019
Number of Pages: 16
Abstract: Sample-efficiency is crucial in reinforcement learning, especially when many similar yet distinct tasks have to be learned. For example, consider a smart wheelchair learning to exit many differently-furnished offices on a building floor. Sequentially learning each of these tasks from scratch would be highly inefficient. A step towards a satisfying solution is transfer learning: exploiting the knowledge acquired in previous (or source) tasks to tackle new (or target) tasks. Existing work mainly focuses on exploiting a single source policy as an advisor for the fresh agent, even when several expert source policies are available. However, using only one advisor requires artificial mechanisms to limit its influence in areas where the source and target tasks differ, so that the advisee is not misled. In this paper, we present a novel approach to transfer learning in which all available source policies are exploited to help learn several related new tasks. Moreover, our approach is compatible with tasks that differ in their transition functions, a setting rarely considered in the transfer reinforcement learning literature. Our in-depth empirical evaluation demonstrates that our approach significantly improves sample-efficiency.
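To give a rough sense of what "exploiting all available source policies as advisors" can look like in practice, here is a minimal policy-shaping sketch. It is an illustrative assumption, not the paper's exact combination rule: the learner's action distribution is multiplied element-wise with each advisor's distribution and renormalized. All names (shape_policy, learner_probs, advisor_probs_list) are hypothetical.

```python
import numpy as np

def shape_policy(learner_probs, advisor_probs_list, eps=1e-8):
    """Combine a learner's action distribution with advice from several
    source policies via element-wise multiplication (a generic
    policy-shaping scheme, assumed here for illustration only)."""
    combined = np.asarray(learner_probs, dtype=float)
    for advisor_probs in advisor_probs_list:
        # eps keeps an advisor's zero-probability action from being
        # permanently ruled out for the learner
        combined = combined * (np.asarray(advisor_probs, dtype=float) + eps)
    return combined / combined.sum()  # renormalize to a valid distribution

# Hypothetical example: 3 actions, one learner and two source policies
# trained on related tasks.
learner = [0.4, 0.3, 0.3]
advisors = [[0.6, 0.2, 0.2],   # source policy 1
            [0.5, 0.4, 0.1]]   # source policy 2
print(shape_policy(learner, advisors))
```

One appeal of a multiplicative scheme of this kind is that advisors bias exploration early on, while a learner whose own distribution sharpens with experience can eventually dominate the product and override misleading advice.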