Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics
Host Publication: European Conference on Machine Learning 2019
Authors: D. Steckelmacher, H. Plisnier, D. Roijers and A. Nowé
Publisher: Springer
Publication Year: 2020
Number of Pages: 16
ISBN: 978-3-030-46132-4
Abstract: Value-based reinforcement-learning algorithms provide state-of-the-art results in model-free discrete-action settings, and tend to outperform actor-critic algorithms. We argue that actor-critic algorithms are limited by their need for an on-policy critic. We propose Bootstrapped Dual Policy Iteration (BDPI), a novel model-free reinforcement-learning algorithm for continuous states and discrete actions, with an actor and several off-policy critics. Off-policy critics are compatible with experience replay, ensuring high sample-efficiency, without the need for off-policy corrections. The actor, by slowly imitating the average greedy policy of the critics, leads to high-quality and state-specific exploration, which we compare to Thompson sampling. Because the actor and critics are fully decoupled, BDPI is remarkably stable, and unusually robust to its hyper-parameters. BDPI is significantly more sample-efficient than Bootstrapped DQN, PPO, and ACKTR, on discrete, continuous and pixel-based tasks.
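
Illustration: the actor update sketched below follows the idea stated in the abstract of slowly imitating the average greedy policy of the critics. It is a minimal sketch under assumptions, not the authors' implementation; the function name, the learning rate lam, and the assumption that each critic exposes a vector of Q-values for the current state are illustrative only.

    import numpy as np

    def actor_update(pi_s, critics_q_s, lam=0.05):
        # pi_s: current actor policy at one state, shape (n_actions,)
        # critics_q_s: list of Q-value vectors, one per off-policy critic (assumed interface)
        # lam: small actor learning rate (illustrative value)
        n_actions = pi_s.shape[0]
        greedy = np.zeros(n_actions)
        for q_s in critics_q_s:
            # accumulate the one-hot greedy policy of each critic, averaged over critics
            greedy[np.argmax(q_s)] += 1.0 / len(critics_q_s)
        # move the actor a small step towards the critics' average greedy policy
        return (1.0 - lam) * pi_s + lam * greedy

    # Example usage with 4 actions and 8 critics (hypothetical values)
    pi_s = np.full(4, 0.25)
    critics_q_s = [np.random.randn(4) for _ in range(8)]
    pi_s = actor_update(pi_s, critics_q_s)

Because each critic may disagree on the greedy action, the averaged target is state-specific and gradually concentrates probability mass where the critics agree, which is what the abstract compares to Thompson sampling.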