
      Towards biologically plausible model-based reinforcement learning in recurrent spiking networks by dreaming new experiences

      research-article
      Scientific Reports, Nature Publishing Group UK
      Subjects: Neuroscience, Learning algorithms


          Abstract

          Humans and animals can learn new skills after practicing for a few hours, while current reinforcement learning algorithms require a large amount of data to achieve good performance. Recent model-based approaches show promising results by reducing the number of interactions with the environment needed to learn a desirable policy. However, these methods require biologically implausible ingredients, such as the detailed storage of older experiences and long periods of offline learning. The optimal way to learn and exploit world-models is still an open question. Taking inspiration from biology, we suggest that dreaming might be an efficient expedient to use an inner model. We propose a two-module (agent and model) spiking neural network in which “dreaming” (living new experiences in a model-based simulated environment) significantly boosts learning. Importantly, our model does not require the detailed storage of experiences and learns both the world-model and the policy online. Moreover, our network is composed of spiking neurons, further increasing its biological plausibility and implementability in neuromorphic hardware.
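The awake/dream alternation the abstract describes can be illustrated with a Dyna-style tabular sketch: an agent module learns a policy online, a model module learns the environment's dynamics online, and during "dreaming" the agent trains on transitions generated by the model rather than the real environment. This is not the authors' spiking implementation — all names (`WorldModel`, `Agent`, the chain task) are illustrative assumptions, and unlike the paper's network, this toy stores one-step transitions explicitly; it captures only the alternation of phases.

```python
import random

class WorldModel:
    """Tabular one-step model of the environment, learned online."""
    def __init__(self):
        self.transitions = {}  # (state, action) -> (next_state, reward, done)

    def update(self, s, a, s2, r, done):
        self.transitions[(s, a)] = (s2, r, done)

    def sample(self, rng):
        key = rng.choice(list(self.transitions))
        return key, self.transitions[key]

class Agent:
    """Epsilon-greedy Q-learner standing in for the spiking policy module."""
    def __init__(self, n_states, actions, lr=0.5, gamma=0.9, eps=0.2):
        self.q = {(s, a): 0.0 for s in range(n_states) for a in actions}
        self.actions, self.lr, self.gamma, self.eps = actions, lr, gamma, eps

    def act(self, s, rng):
        if rng.random() < self.eps:
            return rng.choice(self.actions)
        best = max(self.q[(s, a)] for a in self.actions)
        return rng.choice([a for a in self.actions if self.q[(s, a)] == best])

    def update(self, s, a, r, s2, done):
        target = r if done else r + self.gamma * max(self.q[(s2, b)] for b in self.actions)
        self.q[(s, a)] += self.lr * (target - self.q[(s, a)])

def chain_step(s, a, n=5):
    """Deterministic chain task: move left/right, reward 1 at the right end."""
    s2 = min(max(s + a, 0), n - 1)
    done = (s2 == n - 1)
    return s2, (1.0 if done else 0.0), done

def train(awake_episodes=50, dream_updates=20, seed=0):
    rng = random.Random(seed)
    agent, model = Agent(5, [-1, 1]), WorldModel()
    for _ in range(awake_episodes):
        s = 0
        for _ in range(50):                       # awake phase: real environment
            a = agent.act(s, rng)
            s2, r, done = chain_step(s, a)
            agent.update(s, a, r, s2, done)       # policy learned online
            model.update(s, a, s2, r, done)       # world-model learned online
            s = s2
            if done:
                break
        for _ in range(dream_updates):            # dream phase: model-generated experience
            (ds, da), (ds2, dr, ddone) = model.sample(rng)
            agent.update(ds, da, dr, ds2, ddone)
    return agent

agent = train()
```

The dream phase lets the agent squeeze extra policy updates out of each real interaction, which is the data-efficiency argument the abstract makes.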

          Related collections

          Most cited references (19)


          Reinforcement Learning: An Introduction


            Mastering Atari, Go, chess and shogi by planning with a learned model


              Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity.

              The persistent modification of synaptic efficacy as a function of the relative timing of pre- and postsynaptic spikes is a phenomenon known as spike-timing-dependent plasticity (STDP). Here we show that the modulation of STDP by a global reward signal leads to reinforcement learning. We first derive analytically learning rules involving reward-modulated spike-timing-dependent synaptic and intrinsic plasticity, by applying a reinforcement learning algorithm to the stochastic spike response model of spiking neurons. These rules have several features common to plasticity mechanisms experimentally found in the brain. We then demonstrate in simulations of networks of integrate-and-fire neurons the efficacy of two simple learning rules involving modulated STDP. One rule is a direct extension of the standard STDP model (modulated STDP), and the other one involves an eligibility trace stored at each synapse that keeps a decaying memory of the relationships between recent pairs of pre- and postsynaptic spikes (modulated STDP with eligibility trace). This latter rule permits learning even if the reward signal is delayed. The proposed rules are able to solve the XOR problem with both rate-coded and temporally coded input and to learn a target output firing-rate pattern. These learning rules are biologically plausible, may be used for training generic artificial spiking neural networks regardless of the neural model used, and suggest the experimental investigation in animals of the existence of reward-modulated STDP.
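The second rule described above — modulated STDP with an eligibility trace — can be sketched in a few lines: each pre/post spike pairing deposits an STDP term into a decaying per-synapse trace, and the weight changes by the product of a (possibly delayed) global reward and that trace. This is an illustrative simplification, not Florian's exact formulation; the time constants and the learning rate are arbitrary assumed values.

```python
import math

TAU_STDP = 20.0    # ms, STDP window time constant (assumed value)
TAU_TRACE = 50.0   # ms, eligibility-trace decay constant (assumed value)
A_PLUS = A_MINUS = 1.0

def stdp_kernel(dt):
    """Exponential STDP window: potentiation when post follows pre (dt > 0),
    depression when pre follows post (dt < 0)."""
    if dt >= 0:
        return A_PLUS * math.exp(-dt / TAU_STDP)
    return -A_MINUS * math.exp(dt / TAU_STDP)

def eligibility(t, spike_pairs):
    """Eligibility trace at time t: each (t_pre, t_post) pair deposits an STDP
    term at the later of its two spikes, which then decays with TAU_TRACE."""
    c = 0.0
    for t_pre, t_post in spike_pairs:
        t_dep = max(t_pre, t_post)
        if t_dep <= t:
            c += stdp_kernel(t_post - t_pre) * math.exp(-(t - t_dep) / TAU_TRACE)
    return c

def weight_update(w, reward, t_reward, spike_pairs, lr=0.1):
    """Reward-modulated update: dw = lr * reward * eligibility at reward time."""
    return w + lr * reward * eligibility(t_reward, spike_pairs)

# A causal pairing (pre at 10 ms, post at 15 ms) followed by a reward 45 ms
# later still potentiates the synapse: the trace bridges the delay, which is
# why this rule tolerates delayed reward while plain modulated STDP does not.
w_new = weight_update(0.5, reward=1.0, t_reward=60.0, spike_pairs=[(10.0, 15.0)])
```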

                Author and article information

                Contributors
                cristiano0capone@gmail.com
                Journal
                Sci Rep
                Scientific Reports
                Nature Publishing Group UK (London)
                ISSN: 2045-2322
                Published: 25 June 2024
                Volume: 14
                Article number: 14656
                Affiliations
                [1] INFN, Sezione di Roma (GRID grid.470218.8), Rome, RM 00185, Italy
                [2] Present address: National Center for Radiation Protection and Computational Physics, Istituto Superiore di Sanità ( https://ror.org/02hssy432), Rome, 00161, Italy
                Article
                Article ID: 65631
                DOI: 10.1038/s41598-024-65631-y
                PMCID: PMC11199658
                PMID: 38918553
                © The Author(s) 2024

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 5 February 2024
                Accepted: 21 June 2024
                Categories
                Article
                Custom metadata
                © Springer Nature Limited 2024

                Keywords
                neuroscience, learning algorithms
