PAC Reinforcement Learning Algorithm for General-Sum Markov Games.

Ashkan ZehfrooshHerbert G Tanner

Published in: IEEE transactions on automatic control (2022)

This paper presents a theoretical framework for probably approximately correct (PAC) multi-agent reinforcement learning (MARL) algorithms for Markov games. Using the idea of delayed Q-learning, the paper extends the well-known Nash Q-learning algorithm to build a new PAC MARL algorithm for general-sum Markov games. In addition to guiding the design of a provably PACMARL algorithm, the framework enables checking whether an arbitrary MARL algorithm is PAC. Comparative numerical results demonstrate the algorithm's performance and robustness.

Keyphrases

machine learning
deep learning
neural network
virtual reality