Hierarchical Multiagent Formation Control Scheme via Actor-Critic Learning
Published in: | IEEE Transactions on Neural Networks and Learning Systems, Vol. 34, no. 11, pp. 8764-8777 |
---|---|
Main Authors: | , , |
Format: | Journal Article |
Language: | English |
Published: | United States: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01-11-2023 |
Summary: | This article presents a nearly optimal solution to the cooperative formation control problem for large-scale multiagent systems (MASs). First, the multigroup technique is widely used to decompose large-scale problems, but it provides no consensus between different subgroups. Inspired by the hierarchical structure applied in MASs, a hierarchical leader-following formation control structure with the multigroup technique is constructed, in which two layers and three types of agents are designed. Second, the adaptive dynamic programming technique is applied to the optimal formation control problem by establishing a performance index function. Based on the traditional generalized policy iteration (PI) algorithm, the multistep generalized policy iteration (MsGPI) algorithm is developed by modifying the policy evaluation step. The novel algorithm not only inherits the high convergence speed and low computational complexity of the generalized PI algorithm but also further accelerates convergence and reduces run time. In addition, stability, convergence, and optimality analyses are given for the proposed multistep PI algorithm. Afterward, a neural network-based actor-critic structure is built to approximate the iterative control policies and value functions. Finally, a large-scale formation control problem is provided to demonstrate the performance of the developed hierarchical leader-following formation control structure and the MsGPI algorithm. |
---|---|
ISSN: | 2162-237X, 2162-2388 |
DOI: | 10.1109/TNNLS.2022.3153028 |
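
The summary describes generalized policy iteration with a modified, multistep policy evaluation. The record does not give the article's equations, so the following is only a minimal textbook-style sketch of the general idea, not the article's MsGPI algorithm. It assumes a hypothetical scalar linear system x_next = a*x + b*u with stage cost q*x^2 + r*u^2, a quadratic value function V(x) = p*x^2, and a linear policy u = -k*x. All function names, parameters, and numeric values are illustrative assumptions. The open-loop system is chosen stable so that a zero initial value function is a reasonable start in this toy case.

```python
def improve_policy(p, a, b, r):
    """Greedy gain k (policy u = -k*x) for the value function V(x) = p*x**2."""
    return a * b * p / (r + b * b * p)


def evaluate_policy(p, k, a, b, q, r, n_steps):
    """Apply the one-step Bellman backup for the fixed gain k, n_steps times.

    n_steps = 1 corresponds to value iteration; a very large n_steps
    approaches full policy evaluation as in classical policy iteration.
    """
    closed_loop = a - b * k          # closed-loop dynamics under u = -k*x
    stage_cost = q + r * k * k       # cost coefficient of one step under k
    for _ in range(n_steps):
        p = stage_cost + closed_loop * closed_loop * p
    return p


def generalized_policy_iteration(a, b, q, r, n_steps, tol=1e-10, max_iters=10_000):
    """Alternate policy improvement and n-step policy evaluation.

    Returns (p, k, iterations), where p is the value coefficient, k the
    gain that is greedy with respect to p, and iterations the number of
    outer loops performed.
    """
    p = 0.0  # assumed initial value function V(x) = 0
    for iteration in range(1, max_iters + 1):
        k = improve_policy(p, a, b, r)
        p_next = evaluate_policy(p, k, a, b, q, r, n_steps)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    return p, improve_policy(p, a, b, r), iteration


def riccati_residual(p, a, b, q, r):
    """Residual of the scalar discrete-time Riccati equation at p."""
    return q + a * a * p - (a * b * p) ** 2 / (r + b * b * p) - p


if __name__ == "__main__":
    # Hypothetical system and cost parameters, chosen only for illustration.
    a, b, q, r = 0.9, 0.5, 1.0, 1.0
    for n_steps in (1, 5):
        p, k, iterations = generalized_policy_iteration(a, b, q, r, n_steps)
        print(f"n_steps={n_steps}: p={p:.6f}, k={k:.6f}, outer iterations={iterations}")
```

In this toy setting, a larger `n_steps` trades more work per evaluation for fewer outer iterations. The article's actor-critic structure would replace the closed-form `p` and `k` with neural network approximations, which this sketch does not attempt.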