State-Following-Kernel-Based Online Reinforcement Learning Guidance Law Against Maneuvering Target

Bibliographic Details
Published in:	IEEE transactions on aerospace and electronic systems Vol. 58; no. 6; pp. 5784 - 5797
Main Authors:	Peng, Chi, Zhang, Hanwen, He, Yongxiang, Ma, Jianjun
Format:	Journal Article
Language:	English
Published:	New York: IEEE, 01-12-2022; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Approximation; Artificial neural networks; Disturbance observers; Extended disturbance observer (EDO); Guidance (motion); guidance law; Interception; Iterative methods; Kernel; Kernels; Learning; Maneuvering targets; Mathematical analysis; Mathematical models; Missiles; model-based reinforcement learning (RL); Neural networks; Real-time systems; Reinforcement learning; state-following kernel

Description
Summary:	In this article, a state-following-kernel-based reinforcement learning method with an extended disturbance observer is proposed and applied to a missile-target interception system. First, the missile-target engagement is formulated as a vertical planar pursuit-evasion problem. The target maneuver is then estimated in real time by an extended disturbance observer, which leads to an infinite-horizon optimal regulation problem. Next, utilizing the local state approximation ability of state-following kernels, a critic neural network (NN) and an actor NN are constructed and updated synchronously to calculate the approximate optimal guidance policy. The states and NN weights are proven to be uniformly ultimately bounded using the Lyapunov method. Finally, numerical simulations against different types of nonstationary targets are carried out, and the results highlight the role of state-following kernels in the value function and policy approximation.
ISSN:	0018-9251, 1557-9603
DOI:	10.1109/TAES.2022.3178770
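
Illustration: The summary refers to the local approximation ability of state-following kernels. As a rough sketch of that idea only, the Python below builds kernel features whose centers move with the current state and combines them linearly into a value estimate. This record does not give the article's kernel, center placement, or network structure, so the exponential kernel, the fixed offsets, and the names staf_features and value_estimate are all assumptions made for illustration, not the authors' implementation.

```python
import math


def staf_features(x, offsets):
    """Kernel features with centers that follow the state: c_i = x + d_i.

    Assumption: exponential kernel k(x, c) = exp(x . c) - 1, chosen here
    only for illustration; the offsets d_i are fixed, hypothetical values.
    """
    features = []
    for d in offsets:
        center = [xi + di for xi, di in zip(x, d)]       # center travels with x
        dot = sum(xi * ci for xi, ci in zip(x, center))  # inner product x . c_i
        features.append(math.exp(dot) - 1.0)
    return features


def value_estimate(x, weights, offsets):
    """Linear-in-weights value estimate: sum_i w_i * k(x, c_i(x))."""
    return sum(w * f for w, f in zip(weights, staf_features(x, offsets)))


# Hypothetical two-state example with three kernels placed around the state.
offsets = [(0.0, 1.0), (1.0, 0.0), (-1.0, -1.0)]
weights = [0.5, 0.2, 0.3]
print(value_estimate([0.1, -0.2], weights, offsets))
```

Because the centers stay near the current state, a handful of kernels only has to represent the value function locally rather than over the whole state space. With this assumed kernel every feature is zero at the origin, so the estimate there is zero for any weights.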