
      Adaptive Decision-Making for Automated Vehicles Under Roundabout Scenarios Using Optimization Embedded Reinforcement Learning



Most cited references (36)


          Continuous control with deep reinforcement learning

We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain. We present an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces. Using the same learning algorithm, network architecture, and hyper-parameters, our algorithm robustly solves more than 20 simulated physics tasks, including classic problems such as cartpole swing-up, dexterous manipulation, legged locomotion, and car driving. Our algorithm is able to find policies whose performance is competitive with those found by a planning algorithm with full access to the dynamics of the domain and its derivatives. We further demonstrate that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
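The core actor update this abstract describes, ascending the deterministic policy gradient via the critic's action gradient, can be sketched on a toy 1-D problem. Everything here is illustrative, not the paper's setup: the critic is given analytically rather than learned, and the single-weight linear actor stands in for a neural network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D task with an analytic critic: Q(s, a) = -(s + a)^2, so the
# optimal deterministic policy is a = -s. (A hypothetical stand-in for
# a learned critic network.)
def grad_q_wrt_a(s: float, a: float) -> float:
    return -2.0 * (s + a)

# Deterministic linear actor mu(s) = w * s; the deterministic policy
# gradient ascends  dJ/dw = E[ dQ/da * dmu/dw ].
w = 0.5          # actor weight; the optimum is w = -1
LR = 0.05
for _ in range(500):
    s = rng.uniform(-1.0, 1.0)        # sampled state (stand-in for a replay buffer)
    a = w * s                         # deterministic action
    w += LR * grad_q_wrt_a(s, a) * s  # chain rule: dQ/da * dmu/dw

print(round(w, 2))  # actor converges to the optimal policy a = -s
```

The chain-rule update in the loop is the key idea: the actor never sees rewards directly, only the critic's gradient with respect to the action.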
Open Access

            Asynchronous Methods for Deep Reinforcement Learning

We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers. We present asynchronous variants of four standard reinforcement learning algorithms and show that parallel actor-learners have a stabilizing effect on training, allowing all four methods to successfully train neural network controllers. The best performing method, an asynchronous variant of actor-critic, surpasses the current state of the art on the Atari domain while training for half the time on a single multi-core CPU instead of a GPU. Furthermore, we show that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.
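The parallel actor-learner idea can be illustrated with a minimal asynchronous (Hogwild-style) gradient-descent sketch: several threads update one shared parameter vector in place, without locks. The worker function and the quadratic loss are hypothetical stand-ins, not the paper's actual networks or environments.

```python
import threading

import numpy as np

# Shared parameter, updated in place by every worker without locks,
# mirroring asynchronous gradient descent on a shared network.
shared_w = np.array([5.0])
LR = 0.01

def worker(seed: int, steps: int = 2000) -> None:
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        # Local (noisy) gradient of the toy loss L(w) = w^2:  dL/dw = 2w.
        grad = 2.0 * shared_w[0] + rng.normal(0.0, 0.1)
        shared_w[0] -= LR * grad  # lock-free asynchronous update

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(abs(shared_w[0]) < 0.1)  # parallel workers jointly drove w toward 0
```

Occasional lost updates from the unsynchronized writes are tolerated by design; as in the paper, the aggregate of many noisy parallel updates still converges.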

              Deterministic policy gradient algorithms


                Author and article information

Journal: IEEE Transactions on Neural Networks and Learning Systems (IEEE Trans. Neural Netw. Learning Syst.)
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
ISSN: 2162-237X (print), 2162-2388 (electronic)
Published: December 2021
Volume 32, Issue 12, pp. 5526-5538
DOI: 10.1109/TNNLS.2020.3042981
Copyright: © 2021
License: https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037

