Learning Stackelberg Equilibria and Applications to Economic Design Games

Abstract

We study the use of reinforcement learning to learn the optimal leader's strategy in Stackelberg games. Learning a leader's strategy has an inherent non-stationarity problem: while the leader's strategy is being optimized, the followers' strategies may shift. To circumvent this problem, we model the followers with no-regret dynamics that converge to a Bayesian Coarse-Correlated Equilibrium (B-CCE) of the game induced by the leader. We then embed the followers' no-regret dynamics in the leader's learning environment, which allows us to formulate our learning problem as a standard POMDP. We prove that the optimal policy of this POMDP achieves the same utility as the optimal leader's strategy in our Stackelberg game. We solve this POMDP using actor-critic methods, where the critic is given access to the joint information of all the agents. Finally, we show that our methods are able to learn optimal leader strategies in a variety of settings of increasing complexity, including indirect mechanisms where the leader's strategy consists of setting the mechanism's rules.
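As a rough illustration of embedding followers' no-regret dynamics inside the leader's problem, the sketch below uses a single follower running Hedge (multiplicative weights) against a leader's committed mixed strategy in a 2x2 matrix game. It is not the paper's method: the payoff matrices `A` and `B`, the function names, the step size `eta`, the round count, and the grid search that stands in for the actor-critic POMDP solver are all assumptions chosen for illustration, and the Bayesian multi-follower setting is omitted.

```python
import math


def hedge_average_strategy(payoffs, rounds=2000, eta=0.1):
    """Follower's time-averaged Hedge strategy against fixed per-action payoffs."""
    n = len(payoffs)
    cumulative = [0.0] * n  # payoff each action would have earned so far
    average = [0.0] * n
    for _ in range(rounds):
        top = max(cumulative)  # shift by the max for numerical stability
        weights = [math.exp(eta * (c - top)) for c in cumulative]
        total = sum(weights)
        for a in range(n):
            average[a] += weights[a] / total / rounds
            cumulative[a] += payoffs[a]
    return average


def leader_utility(x, A, B):
    """Leader's expected utility when the follower adapts to commitment x via Hedge."""
    n_leader, n_follower = len(A), len(A[0])
    # Follower's expected payoff for each of its actions under the commitment x.
    follower_payoffs = [
        sum(x[i] * B[i][j] for i in range(n_leader)) for j in range(n_follower)
    ]
    y = hedge_average_strategy(follower_payoffs)
    return sum(
        x[i] * y[j] * A[i][j] for i in range(n_leader) for j in range(n_follower)
    )


def best_commitment(A, B, grid=10):
    """Grid search over P(first leader action); a stand-in for a learned policy."""
    candidates = [k / grid for k in range(grid + 1)]
    scored = ((p, leader_utility([p, 1.0 - p], A, B)) for p in candidates)
    return max(scored, key=lambda pair: pair[1])


# Hypothetical payoffs: rows are leader actions, columns are follower actions.
A = [[2.0, 4.0], [1.0, 3.0]]  # leader's payoffs (assumed)
B = [[1.0, 0.0], [0.0, 1.0]]  # follower's payoffs (assumed)

best_p, best_u = best_commitment(A, B)
print(best_p, best_u)
```

In this assumed game the leader's first action is dominant, yet committing to play it always leads the follower to the column the leader likes less. The search instead favors a commitment that puts slightly less than half the probability on the first action, which steers the adapting follower to the other column and gives the leader a higher payoff.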

Author and article information

Publication date (created): 07 October 2022

Article

arXiv ID: 2210.03852

SO-VID: 6f42a5e4-ec6b-446a-83ec-68859a174855

License:

http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Custom metadata

Categories: cs.GT, cs.MA

ScienceOpen disciplines: Theoretical computer science, Artificial intelligence

