Multirobot Collaborative Pursuit Target Robot by Improved MADDPG


Abstract

Policy formulation is one of the main problems in multirobot systems, especially in multirobot pursuit-evasion scenarios, where sparse rewards and random environment changes make it difficult to find a good strategy. Most existing multirobot decision-making methods rely on environmental rewards alone to drive robots toward the target task, and this does not achieve good results. This paper proposes a multirobot pursuit method based on an improved multiagent deep deterministic policy gradient (MADDPG) algorithm, which addresses the problem of sparse rewards in multirobot pursuit-evasion scenarios by combining an intrinsic reward with the reward from the external environment. A state similarity module based on a threshold constraint forms part of the intrinsic reward signal output by the intrinsic curiosity module. It balances overexploration against insufficient exploration, so that the agent can use the intrinsic reward more effectively to learn better strategies. Simulation results show that the proposed method significantly improves the robots' reward value and the success rate of the pursuit task. The improvement is directly visible in the real-time distance between the pursuer and the escapee: a pursuer trained with the improved algorithm closes in on the escapee more quickly, and the average following distance also decreases.
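The abstract does not give the reward formula, so the following is only a minimal sketch of one possible reading of "intrinsic reward plus environmental reward, gated by a state similarity threshold". Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the use of cosine similarity, the threshold `sim_threshold`, the weight `eta`, and the rule that the curiosity bonus is suppressed when consecutive states are nearly identical.

```python
import math


def prediction_error(predicted_next, actual_next):
    """Curiosity signal: half the squared error of a forward-model prediction."""
    return 0.5 * sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))


def cosine_similarity(u, v):
    """Cosine similarity of two state vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm > 0 else 0.0


def intrinsic_reward(state, next_state, predicted_next, sim_threshold=0.99):
    """Hypothetical gating rule (an assumption, not the paper's definition):
    if the new state is nearly the same as the old one, give no curiosity
    bonus; otherwise reward the forward-model prediction error."""
    if cosine_similarity(state, next_state) >= sim_threshold:
        return 0.0
    return prediction_error(predicted_next, next_state)


def total_reward(extrinsic, intrinsic, eta=0.1):
    """Combine the environmental reward with a weighted intrinsic reward.
    The weight eta is an illustrative value."""
    return extrinsic + eta * intrinsic


if __name__ == "__main__":
    s, s_next, s_pred = [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]
    r_int = intrinsic_reward(s, s_next, s_pred)
    print(total_reward(0.0, r_int))
```

In this sketch the agent still receives a learning signal when the environmental reward is zero, which is the sparse-reward situation the abstract describes, while the threshold keeps the bonus from rewarding transitions that barely change the state.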


Author and article information

Contributors

Yi He:

ORCID: https://orcid.org/0000-0002-3005-6772

Journal

Journal ID (nlm-ta): Comput Intell Neurosci

Journal ID (iso-abbrev): Comput Intell Neurosci

Journal ID (publisher-id): cin

Title: Computational Intelligence and Neuroscience

Publisher: Hindawi

ISSN (Print): 1687-5265

ISSN (Electronic): 1687-5273

Publication date (Collection): 2022

Publication date (Electronic): 25 February 2022

Volume: 2022

Electronic Location Identifier: 4757394

Affiliations

¹School of Mechanical and Electronic Engineering, Wuhan University of Technology, Wuhan 430070, China

²Intelligent Transport Systems Research Center, Wuhan University of Technology, Wuhan 430063, China

Author notes

Academic Editor: Daqing Gong

Author information

Xiao Zhou https://orcid.org/0000-0002-3591-0982

Song Zhou https://orcid.org/0000-0002-6453-3805

Xingang Mou https://orcid.org/0000-0002-1265-8112

Yi He https://orcid.org/0000-0002-3005-6772

Article

DOI: 10.1155/2022/4757394

PMC ID: 8896963

PubMed ID: 35251150

SO-VID: efa0b5e4-abb5-4f74-bea1-212ec37ca3ce

License:

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 19 January 2022

Date accepted: 8 February 2022

Funding

Funded by: Research and Development

Award ID: 2021YFC3001502

Funded by: National Natural Science Foundation of China

Award ID: 52072292
