
      Active reinforcement learning versus action bias and hysteresis: control with a mixture of experts and nonexperts

      research-article


          Abstract

          Active reinforcement learning enables dynamic prediction and control, where one should not only maximize rewards but also minimize costs such as those of inference, decisions, actions, and time. For an embodied agent such as a human, decisions are also shaped by the physical aspects of actions. Beyond the effects of reward outcomes on learning processes, to what extent can modeling of behavior in a reinforcement-learning task be complicated by other sources of variance in sequential action choices? What are the effects of action bias (for actions per se) and of action hysteresis determined by the history of previously chosen actions? The present study addressed these questions through incremental assembly of models for the sequential choice data from a task with hierarchical structure, which added complexity to learning. Through systematic comparison and falsification of computational models, human choices were tested for signatures of parallel modules representing not only an enhanced form of generalized reinforcement learning but also action bias and hysteresis. We found evidence for substantial differences in bias and hysteresis across participants, comparable in magnitude to the individual differences in learning. Individuals who did not learn well showed the greatest biases, but those who did learn accurately were also significantly biased. The direction of hysteresis varied among individuals as repetition or, more commonly, alternation biases persisting across multiple previous actions. Given that these actions were button presses with trivial motor demands, the idiosyncratic forces biasing sequences of action choices were robust enough to suggest ubiquity across individuals and across tasks requiring various actions.
In light of how bias and hysteresis function as a heuristic for efficient control that adapts to uncertainty or low motivation by minimizing the cost of effort, these phenomena broaden the consilient theory of a mixture of experts to encompass a mixture of expert and nonexpert controllers of behavior.
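The modular account sketched above, a value-learning "expert" combined with nonexpert bias and hysteresis controllers, can be illustrated with a minimal simulation. This is not the authors' actual model; the parameter names (alpha, beta, kappa) and the exponentially decaying action trace are illustrative assumptions:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()

def choice_probabilities(q, bias, trace, beta=3.0, kappa=0.5):
    """Mix expert (value-based) and nonexpert (bias/hysteresis) modules.

    q     : learned action values (the reinforcement-learning "expert")
    bias  : static per-action preferences (action bias)
    trace : decaying history of past choices; kappa > 0 yields repetition,
            kappa < 0 yields alternation (action hysteresis)
    """
    return softmax(beta * q + bias + kappa * trace)

def simulate_trial(q, bias, trace, reward_fn, alpha=0.2, decay=0.6,
                   beta=3.0, kappa=-0.8, rng=None):
    """One trial: choose an action, observe reward, update values and trace."""
    rng = rng if rng is not None else np.random.default_rng()
    p = choice_probabilities(q, bias, trace, beta, kappa)
    a = rng.choice(len(q), p=p)
    r = reward_fn(a)
    q[a] += alpha * (r - q[a])   # delta-rule value update
    trace *= decay               # older actions fade
    trace[a] += 1.0              # mark the chosen action
    return a, r
```

With kappa > 0 the trace term favors repeating recent actions; with kappa < 0 it favors alternation, matching the two directions of hysteresis reported above. Setting beta near zero while bias and kappa remain nonzero mimics a participant whose choices are driven by the nonexpert controllers rather than by learning.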

          Author summary

          Reinforcement learning unifies neuroscience and AI with a universal computational framework for motivated behavior. Humans and robots alike are active and embodied agents who physically interact with the world and learn from feedback to guide future actions while weighing costs of time and energy. Initially, the modeling here attempted to identify learning algorithms for an interactive environment structured with patterns in counterfactual information that a human brain could learn to generalize. However, behavioral analysis revealed that a wider scope was necessary to identify individual differences in not only complex learning but also action bias and hysteresis. Sequential choices in the pursuit of rewards were clearly influenced by endogenous action preferences and by persistent effects of action history causing repetition or alternation of previous actions. By modeling a modular brain as a mixture of expert and nonexpert systems for behavioral control, a distinct profile could be characterized for each individual attempting the experiment. Even for actions as simple as button pressing, effects specific to actions were as substantial as the effects of the reward outcomes that decisions were supposed to follow. Bias and hysteresis are concluded to be ubiquitous and intertwined with processes of active reinforcement learning for efficiency in behavior.


                Author and article information

                Contributors
                Roles: Conceptualization, Formal analysis, Investigation, Methodology, Visualization, Writing – original draft
                Roles: Funding acquisition, Supervision, Writing – original draft
                Roles: Funding acquisition, Supervision, Writing – original draft
                Role: Editor
                Journal
                PLOS Computational Biology (PLoS Comput Biol)
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1553-734X (print); 1553-7358 (electronic)
                Published: 29 March 2024 (March 2024)
                Volume: 20
                Issue: 3
                Article: e1011950
                Affiliations
                [1 ] Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, United States of America
                [2 ] Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, United States of America
                [3 ] Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, United States of America
                École Normale Supérieure, France
                Author notes

                The authors have declared that no competing interests exist.

                ‡ These authors are joint senior authors on this work.

                Author information
                https://orcid.org/0000-0003-1872-7614
                https://orcid.org/0000-0003-0016-3531
                https://orcid.org/0000-0003-4015-3151
                Article
                Manuscript ID: PCOMPBIOL-D-23-01464
                DOI: 10.1371/journal.pcbi.1011950
                PMCID: 10980507
                PMID: 38552190
                © 2024 Colas et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 13 September 2023
                Accepted: 26 February 2024
                Page count
                Figures: 14, Tables: 4, Pages: 70
                Funding
                Funded by: funder-id http://dx.doi.org/10.13039/100000183, Army Research Office;
                Award ID: W911NF‑19‑2‑0026
                Award Recipient :
                Funded by: funder-id http://dx.doi.org/10.13039/100000183, Army Research Office;
                Award ID: W911NF‑16‑1‑0474
                Award Recipient :
                Funded by: funder-id http://dx.doi.org/10.13039/100000026, National Institute on Drug Abuse;
                Award ID: R01 DA040011
                Award Recipient :
                Funded by: funder-id http://dx.doi.org/10.13039/100000025, National Institute of Mental Health;
                Award ID: P50 MH094258
                Award Recipient :
                STG was supported by the Institute for Collaborative Biotechnologies under Cooperative Agreement W911NF‑19‑2‑0026 and grant W911NF‑16‑1‑0474 from the Army Research Office. JPOD was supported by National Institute on Drug Abuse grant R01 DA040011 and the National Institute of Mental Health’s Caltech Conte Center for Social Decision Making (P50 MH094258). The funders had no role in study design, data collection and analysis, the decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Subject areas: Learning; Human Learning; Decision Making; Machine Learning (Artificial Intelligence); Simulation and Modeling; Imitation; Behavior; Sensory Perception
                Custom metadata
                All relevant data are within the manuscript and its Supporting Information files.

                Quantitative & Systems biology
