Open Access

      Hybrid Online and Offline Reinforcement Learning for Tibetan Jiu Chess



          Abstract

          In this study, hybrid state-action-reward-state-action (SARSA(λ)) and Q-learning algorithms are applied to different stages of an Upper Confidence bounds applied to Trees (UCT) search for Tibetan Jiu chess. Q-learning is also used to update all the nodes on the search path when each game ends. A learning strategy is proposed that combines SARSA(λ) and Q-learning with domain knowledge, using feedback functions for the layout and battle stages. An improved deep neural network based on ResNet18 is used for self-play training. Experimental results show that hybrid online and offline reinforcement learning with a deep neural network improves the game program's learning efficiency and its understanding of Tibetan Jiu chess.
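          To make the hybrid scheme concrete, here is a minimal tabular sketch of the two update rules the abstract names: an online SARSA(λ) step applied move by move, and an offline Q-learning backup over the whole search path once a game ends. Everything in it (the hyperparameters, the `legal_actions` helper, the toy chain at the bottom) is an illustrative assumption, not the paper's actual state encoding, feedback function, or UCT integration.

```python
from collections import defaultdict

# Assumed hyperparameters for illustration; the paper's values are not given here.
ALPHA, GAMMA, LAM = 0.1, 0.9, 0.8

Q = defaultdict(float)      # tabular action values, keyed by (state, action)
elig = defaultdict(float)   # eligibility traces for SARSA(lambda)

def sarsa_lambda_step(s, a, r, s2, a2):
    """Online SARSA(lambda) update, applied after each move within a stage."""
    delta = r + GAMMA * Q[(s2, a2)] - Q[(s, a)]
    elig[(s, a)] += 1.0
    for key in list(elig):
        Q[key] += ALPHA * delta * elig[key]
        elig[key] *= GAMMA * LAM

def q_learning_end_of_game(path, outcome, legal_actions):
    """Offline Q-learning backup over every (state, action) pair on the
    search path once the game ends, propagating the terminal outcome."""
    target = outcome  # the leaf node sees only the game result
    for s, a in reversed(path):
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        # Bootstrapped target for the predecessor node (Q-learning's max).
        target = GAMMA * max((Q[(s, b)] for b in legal_actions(s)), default=0.0)

# Toy usage on a 3-state chain; purely illustrative.
if __name__ == "__main__":
    path = [(0, "R"), (1, "R"), (2, "R")]  # (state, action) pairs, root to leaf
    for (s, a), (s2, a2) in zip(path, path[1:]):
        sarsa_lambda_step(s, a, 0.0, s2, a2)
    q_learning_end_of_game(path, outcome=1.0, legal_actions=lambda s: ["L", "R"])
    print({k: round(v, 3) for k, v in Q.items()})
```

          The split reflects the title's "hybrid online and offline" idea: SARSA(λ) is on-policy and spreads credit online through eligibility traces during play, while the end-of-game Q-learning pass is an off-policy backup of the final outcome over the entire search path.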

          Most cited references (17)


          Learning to predict by the methods of temporal differences


            Deep Blue


              Superhuman AI for heads-up no-limit poker: Libratus beats top professionals

              No-limit Texas hold’em is the most popular form of poker. Despite AI successes in perfect-information games, the private information and massive game tree have made no-limit poker difficult to tackle. We present Libratus, an AI that, in a 120,000-hand competition, defeated four top human specialist professionals in heads-up no-limit Texas hold’em, the leading benchmark and long-standing challenge problem in imperfect-information game solving. Our game-theoretic approach features application-independent techniques: an algorithm for computing a blueprint for the overall strategy, an algorithm that fleshes out the details of the strategy for subgames that are reached during play, and a self-improver algorithm that fixes potential weaknesses that opponents have identified in the blueprint strategy.
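              For context on the "blueprint" step: Brown and Sandholm report that Libratus computes its blueprint strategy with a variant of counterfactual regret minimization. The regret-matching rule at the core of that algorithm family can be sketched as follows; the toy payoff matrix and the random stand-in opponent are assumptions for illustration, not Libratus's actual implementation.

```python
import numpy as np

def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret;
    fall back to uniform if no action has positive regret."""
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    return positive / total if total > 0 else np.full(len(regrets), 1.0 / len(regrets))

# Toy self-play loop for a one-shot matching-pennies-style game (illustrative only).
payoff = np.array([[1.0, -1.0], [-1.0, 1.0]])  # row player's payoffs
regrets = np.zeros(2)
strategy_sum = np.zeros(2)

rng = np.random.default_rng(0)
for _ in range(10_000):
    strategy = regret_matching(regrets)
    strategy_sum += strategy
    opp_action = rng.integers(2)            # stand-in random opponent
    action_values = payoff[:, opp_action]   # value of each of our actions
    expected = strategy @ action_values     # value of the mixed strategy
    regrets += action_values - expected     # accumulate per-action regret

print(strategy_sum / strategy_sum.sum())    # average strategy converges near (0.5, 0.5)
```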

                Author and article information

                Journal: Complexity (Hindawi Limited)
                ISSN: 1076-2787 (print); 1099-0526 (online)
                Published: May 11, 2020
                Volume 2020, pages 1–11
                Affiliations: [1] School of Information and Engineering, Minzu University of China, Beijing 100081, China
                DOI: 10.1155/2020/4708075
                © 2020. Distributed under the Creative Commons Attribution 4.0 License (http://creativecommons.org/licenses/by/4.0/).

