Multi-Objective X-Armed Bandits
Host Publication: 2014 IEEE World Congress on Computational Intelligence (WCCI)
Authors: K. Marguerite Van Moffaert, K. Bert Van Vaerenbergh, P. Vrancx and A. Nowé
UsePubPlace: China
Publisher: IEEE
Publication Date: Jul. 2014
Number of Pages: 8
ISBN: 978-1-4799-6627-1
Abstract: Many standard optimization algorithms focus on optimizing a single, scalar feedback signal. However, real-life optimization problems often require the simultaneous optimization of more than one objective. In this paper, we propose a multi-objective extension to the standard X-armed bandit problem. As the feedback signal is now vector-valued, the goal of the agent is to sample actions in the Pareto-dominating area of the objective space. To this end, we propose the multi-objective Hierarchical Optimistic Optimization strategy, which discretizes the continuous action space in relation to the Pareto-optimal solutions obtained in the objective space. We experimentally validate the approach on two well-known multi-objective test functions and on a simulation of a real-life application, the filling phase of a wet clutch. We demonstrate that the strategy allows the agent to identify the Pareto front after just a few epochs and to sample accordingly. After learning, several multi-objective quality indicators show that the set of solutions sampled by the algorithm very closely approximates the Pareto front.
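The notion of Pareto dominance that the abstract relies on can be illustrated with a minimal Python sketch. This is not the paper's multi-objective Hierarchical Optimistic Optimization strategy; it only shows how a Pareto front is extracted from a finite set of vector-valued rewards. The function names `dominates` and `pareto_front`, the sample values, and the assumption that every objective is maximized are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if reward vector a Pareto-dominates b.

    Assumption: every objective is maximized. a dominates b when it is
    at least as good in all objectives and strictly better in one.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective rewards observed for five sampled actions.
samples = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0), (0.5, 0.5)]
front = pareto_front(samples)
print(front)
```

Here `(1.5, 3.0)` and `(0.5, 0.5)` are both dominated by `(2.0, 4.0)`, so the remaining three points form the front. The filter compares every pair of points, which is quadratic in the number of samples; that is acceptable for a small illustration but is not meant as an efficient implementation.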