Decentralized Learning in Markov Games
This publication appears in: IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
Authors: P. Vrancx, K. Verbeeck and A. Nowé
Volume: 38
Pages: 976-981
Publication Year: 2008
Abstract: Learning automata (LA) were recently shown to be valuable tools for designing multi-agent reinforcement learning algorithms. One of the principal contributions of LA theory is that a set of decentralized, independent LA is able to control a finite Markov chain with unknown transition probabilities and rewards. In this paper, we propose to extend this algorithm to Markov games, a straightforward extension of single-agent Markov decision problems to distributed multi-agent decision problems. We show that under the same ergodicity assumptions as the original theorem, the extended algorithm converges to a pure equilibrium point between the agents' policies.
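The decentralized control result summarized above builds on the classic linear reward-inaction (L_R-I) update scheme for a single learning automaton. As a rough illustration of that scheme (not the paper's own code; the class and parameter names below are hypothetical), a minimal sketch might look like this:

```python
import random


class LearningAutomaton:
    """Minimal linear reward-inaction (L_R-I) learning automaton.

    This is an illustrative sketch of the standard L_R-I scheme that
    LA-based algorithms build on; names and defaults are assumptions,
    not taken from the paper.
    """

    def __init__(self, n_actions, learning_rate=0.1, rng=None):
        self.n = n_actions
        self.alpha = learning_rate
        # Start with a uniform action-probability vector.
        self.p = [1.0 / n_actions] * n_actions
        self.rng = rng or random.Random()

    def choose(self):
        # Sample an action according to the current probability vector.
        r, acc = self.rng.random(), 0.0
        for a, pa in enumerate(self.p):
            acc += pa
            if r < acc:
                return a
        return self.n - 1

    def update(self, action, reward):
        # L_R-I: on a reward (success), shift probability mass toward
        # the chosen action; on a penalty (failure), leave the
        # probabilities unchanged ("inaction").
        if reward:
            for a in range(self.n):
                if a == action:
                    self.p[a] += self.alpha * (1.0 - self.p[a])
                else:
                    self.p[a] *= (1.0 - self.alpha)
```

In a Markov game setting, one such automaton would be associated with each agent-state pair, each updating only on its own local feedback; with an ergodic underlying chain, the joint policy can then converge to a pure equilibrium as the paper shows.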