Hypervolume-based Multi-Objective Reinforcement Learning
Host Publication: Evolutionary Multi-Criterion Optimization (EMO 2013)
Authors: K. Van Moffaert, M. Drugan and A. Nowé
Publisher: Springer
Publication Date: Mar. 2013
Number of Pages: 15
ISBN: 978-3-642-37139-4
Abstract: Indicator-based evolutionary algorithms are among the best-performing methods for solving multi-objective optimization (MOO) problems. In reinforcement learning (RL), introducing a quality indicator into an algorithm's decision logic had not been attempted before. In this paper, we propose a novel on-line multi-objective reinforcement learning (MORL) algorithm that uses the hypervolume indicator as an action selection strategy. We call this algorithm the hypervolume-based MORL algorithm, or HB-MORL, and conduct an empirical study of its performance using multiple quality assessment metrics from multi-objective optimization. We compare the hypervolume-based learning algorithm on different environments to two multi-objective algorithms that rely on scalarization techniques, namely the linear scalarization and the weighted Chebyshev function. We conclude that HB-MORL significantly outperforms the linear scalarization method and performs similarly to the Chebyshev algorithm without requiring any user-specified emphasis on particular objectives.
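To illustrate the core idea of using the hypervolume indicator for action selection, the following is a minimal sketch, not the authors' implementation: it computes the 2-D hypervolume of a set of value vectors with respect to a reference point (assuming maximization and a reference point dominated by all vectors), then greedily picks the action whose multi-objective Q-vector adds the most hypervolume to the current archive. The function names, the two-objective restriction, and the greedy selection rule are simplifying assumptions for illustration.

```python
def pareto_front(points):
    """Keep only non-dominated points (maximization in every objective).

    A point p is dominated if some other point q is at least as good
    in all objectives (strict comparison handled by q != p).
    """
    return [p for p in points
            if not any(q != p and all(q[i] >= p[i] for i in range(len(p)))
                       for q in points)]


def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded below by `ref` (2 objectives).

    Sweep the non-dominated points in decreasing order of the first
    objective and sum the rectangular slabs each point contributes.
    """
    front = sorted(pareto_front(points), key=lambda p: p[0], reverse=True)
    hv, prev_y = 0.0, ref[1]
    for x, y in front:
        hv += (x - ref[0]) * (y - prev_y)
        prev_y = y
    return hv


def hb_action_select(archive, q_vectors, ref):
    """Greedy hypervolume-based action selection (simplified sketch):
    choose the action whose Q-vector yields the largest hypervolume
    when added to the current archive of value vectors."""
    return max(range(len(q_vectors)),
               key=lambda a: hypervolume_2d(archive + [q_vectors[a]], ref))


# Example: three mutually non-dominated points dominate an area of 6.0.
print(hypervolume_2d([(3, 1), (2, 2), (1, 3)], (0, 0)))   # 6.0

# With archive [(2, 2)], the dominated vector (0.5, 0.5) is never chosen.
print(hb_action_select([(2, 2)], [(3, 1), (1, 3), (0.5, 0.5)], (0, 0)))  # 0
```

In the paper's setting this selection step would be wrapped in an exploration strategy (e.g. epsilon-greedy) inside a multi-objective Q-learning loop; the sketch only shows the indicator-driven greedy choice itself.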