      Is Open Access

      Policy Iteration for Pareto-Optimal Policies in Stochastic Stackelberg Games

      Preprint

          Abstract

          In general-sum stochastic games, a stationary Stackelberg equilibrium (SSE), in which the leader maximizes its own return for all initial states while the follower takes the best response to the leader's policy, does not always exist. Existing methods for determining SSEs require strong assumptions to guarantee convergence and to guarantee that the limit coincides with an SSE. Moreover, our analysis suggests that when the fixed points of these methods are not SSEs, the performance at those fixed points is not reasonable. Herein, we introduce the concept of Pareto-optimality as a reasonable alternative to SSEs. We derive the policy improvement theorem for stochastic games with a best-response follower and, based on it, propose an iterative algorithm to determine the Pareto-optimal policies. We prove monotone improvement and convergence of the proposed approach, and we prove its convergence to SSEs in a special case.
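
          To make the setting in the abstract concrete, the following is a minimal sketch of the two building blocks it names: the follower's best response to a fixed leader policy, and the leader's resulting return from each initial state. It is not the paper's algorithm. The toy game, the function names (`follower_best_response`, `evaluate`), the restriction to deterministic stationary policies, the discount factor of 0.9, and the tie-breaking rule (lowest action index) are all assumptions made for illustration.

```python
# Toy 2-state game (hypothetical): the follower's action b selects the next state.
N_STATES, N_A, N_B = 2, 2, 2

# P[s][a][b] is a probability distribution over next states.
P = [[[[1.0, 0.0] if b == 0 else [0.0, 1.0] for b in range(N_B)]
      for a in range(N_A)] for s in range(N_STATES)]
# Follower is rewarded for matching the leader's action.
R_F = [[[1.0 if a == b else 0.0 for b in range(N_B)]
        for a in range(N_A)] for s in range(N_STATES)]
# Leader is rewarded when the follower plays action 1.
R_L = [[[float(b) for b in range(N_B)]
        for a in range(N_A)] for s in range(N_STATES)]


def follower_best_response(P, r_follower, leader_policy, gamma=0.9, tol=1e-10):
    """Value iteration on the MDP the follower faces once the leader's
    deterministic stationary policy is fixed. Returns one action per state."""
    n_states, n_b = len(P), len(P[0][0])
    v = [0.0] * n_states

    def q(s, b, values):
        a = leader_policy[s]
        return r_follower[s][a][b] + gamma * sum(
            p * values[t] for t, p in enumerate(P[s][a][b]))

    while True:
        new_v = [max(q(s, b, v) for b in range(n_b)) for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(new_v, v))
        v = new_v
        if delta < tol:
            break
    # Greedy policy; ties go to the lowest action index (an assumption).
    return [max(range(n_b), key=lambda b: q(s, b, v)) for s in range(n_states)]


def evaluate(P, reward, leader_policy, follower_policy, gamma=0.9, tol=1e-10):
    """Iterative policy evaluation of a discounted return under a fixed
    pair of deterministic stationary policies. Returns one value per state."""
    n_states = len(P)
    v = [0.0] * n_states
    while True:
        new_v = []
        for s in range(n_states):
            a, b = leader_policy[s], follower_policy[s]
            new_v.append(reward[s][a][b] + gamma * sum(
                p * v[t] for t, p in enumerate(P[s][a][b])))
        delta = max(abs(x - y) for x, y in zip(new_v, v))
        v = new_v
        if delta < tol:
            return v


if __name__ == "__main__":
    for leader_policy in ([0, 0], [0, 1], [1, 1]):
        br = follower_best_response(P, R_F, leader_policy)
        print(leader_policy, br, evaluate(P, R_L, leader_policy, br))
```

          In this toy game the leader policy `[1, 1]` yields a higher leader return than `[0, 1]` in state 0 and the same return in state 1, so a single policy is best from every initial state. The abstract's point is that such a policy need not exist in general, which is what motivates comparing leader policies by Pareto-optimality across initial states.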

          Author and article information

          07 May 2024
          Article
          2405.06689
          340bb6f7-fa49-4685-9626-afff7208861d

          http://arxiv.org/licenses/nonexclusive-distrib/1.0/

          Custom metadata
          21 pages
          cs.GT cs.LG cs.MA math.OC

          Numerical methods, Theoretical computer science, Artificial intelligence
