|
D. M. Roijers, L. Zintgraf, P. Libin, M. Reymond, E. Bargiacchi and A. Nowé, Interactive Multi-Objective Reinforcement Learning in Multi-Armed Bandits with Gaussian Process Utility Models, in European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2020, Springer, 2020, pp. 16.
|
|