Hierarchical Multiagent Formation Control Scheme via Actor-Critic Learning

Bibliographic Details
Published in:	IEEE Transactions on Neural Networks and Learning Systems, Vol. 34, no. 11, pp. 8764-8777
Main Authors:	Mu, Chaoxu, Peng, Jiangwen, Sun, Changyin
Format:	Journal Article
Language:	English
Published:	United States: IEEE, 01-11-2023; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Adaptive control; Adaptive dynamic programming (ADP); Algorithms; Computational complexity; Convergence; Cooperative control; Dynamic programming; Games; Heuristic algorithms; hierarchical formation control (HFC); Hybrid fiber coaxial cables; Iterative methods; Microgrids; multiagent system (MAS); Multiagent systems; multistep generalized policy iteration (MsGPI); Neural networks; neural networks (NNs); Optimization; Performance indices; Scale formation; Stability analysis; Subgroups

Description
Summary:	This article presents a nearly optimal solution to the cooperative formation control problem for large-scale multiagent systems (MASs). First, although the multigroup technique is widely used to decompose large-scale problems, it does not provide consensus between different subgroups. Inspired by the hierarchical structures applied in MASs, a hierarchical leader-following formation control structure with the multigroup technique is constructed, in which two layers and three types of agents are designed. Second, the adaptive dynamic programming technique is applied to the optimal formation control problem by establishing a performance index function. Based on the traditional generalized policy iteration (PI) algorithm, the multistep generalized policy iteration (MsGPI) algorithm is developed by modifying the policy evaluation step. The novel algorithm not only inherits the high convergence speed and low computational complexity of the generalized PI algorithm but also further accelerates convergence and reduces run time. In addition, stability, convergence, and optimality analyses are given for the proposed MsGPI algorithm. Afterward, a neural network-based actor-critic structure is built to approximate the iterative control policies and value functions. Finally, a large-scale formation control problem is used to demonstrate the performance of the developed hierarchical leader-following formation control structure and the MsGPI algorithm.
ISSN:	2162-237X, 2162-2388
DOI:	10.1109/TNNLS.2022.3153028
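
Illustration:	The summary describes a generalized policy iteration scheme whose policy evaluation step is modified to use multiple steps. The record does not give the update equations, so the following is only a rough sketch of one common reading of that idea, not the authors' algorithm. It assumes a scalar linear system x_next = a*x + b*u with stage cost q*x^2 + r*u^2 and a quadratic value function V(x) = p*x^2, and it replaces the neural network actor and critic with the closed-form gain k and coefficient p. All function names and parameters (`improve_policy`, `evaluate_policy`, `multistep_gpi`, `sweeps`, `steps`) are hypothetical.

```python
def improve_policy(p, a, b, r):
    """Greedy gain k for V(x) = p*x^2, i.e. u = -k*x minimizes r*u^2 + p*(a*x + b*u)^2."""
    return a * b * p / (r + b * b * p)


def evaluate_policy(p, k, a, b, q, r, sweeps, steps):
    """Partial policy evaluation: `sweeps` updates, each rolling out `steps` stages.

    Assumed multistep form: V_new(x) = sum of `steps` stage costs + V_old(x_steps).
    """
    a_cl = a - b * k              # closed-loop dynamics under u = -k*x
    stage = q + r * k * k         # stage cost coefficient under u = -k*x
    for _ in range(sweeps):
        acc, decay = 0.0, 1.0
        for _ in range(steps):
            acc += decay * stage  # cost accumulated along the rollout
            decay *= a_cl * a_cl  # squared state shrinks or grows by a_cl^2 per stage
        p = acc + decay * p       # bootstrap from the previous value estimate
    return p


def multistep_gpi(a, b, q, r, sweeps=2, steps=3, iters=200, tol=1e-12):
    """Alternate policy improvement and partial multistep evaluation until p settles."""
    p = 0.0                       # zero initial value function
    k = 0.0
    for _ in range(iters):
        k = improve_policy(p, a, b, r)
        p_new = evaluate_policy(p, k, a, b, q, r, sweeps, steps)
        if abs(p_new - p) < tol:
            p = p_new
            break
        p = p_new
    return p, k


if __name__ == "__main__":
    p, k = multistep_gpi(a=1.2, b=1.0, q=1.0, r=1.0)
    print("value coefficient p:", p)
    print("feedback gain k:", k)
```

In this linear-quadratic toy case, a fixed point of the loop satisfies the scalar discrete-time Riccati equation p = q + a^2*p - (a*b*p)^2 / (r + b^2*p), which gives a simple way to check the result. Setting `steps=1` recovers a plain generalized policy iteration with `sweeps` evaluation updates per improvement.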