Reinforcement-Learning-Based Output-Feedback Control of Nonstrict Nonlinear Discrete-Time Systems With Application to Engine Emission Control
A novel reinforcement-learning-based output adaptive neural network (NN) controller, which is also referred to as the adaptive-critic NN controller, is developed to deliver the desired tracking performance for a class of nonlinear discrete-time systems expressed in nonstrict feedback form in the pre...
Saved in:
Published in: | IEEE transactions on systems, man and cybernetics. Part B, Cybernetics Vol. 39; no. 5; pp. 1162 - 1179 |
---|---|
Main Authors: | , , , |
Format: | Journal Article |
Language: | English |
Published: |
United States
IEEE
01-10-2009
|
Subjects: | |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | A novel reinforcement-learning-based output adaptive neural network (NN) controller, which is also referred to as the adaptive-critic NN controller, is developed to deliver the desired tracking performance for a class of nonlinear discrete-time systems expressed in nonstrict feedback form in the presence of bounded and unknown disturbances. The adaptive-critic NN controller consists of an observer, a critic, and two action NNs. The observer estimates the states and output, and the two action NNs provide virtual and actual control inputs to the nonlinear discrete-time system. The critic approximates a certain strategic utility function, and the action NNs minimize the strategic utility function and control inputs. All NN weights adapt online toward minimization of a performance index, utilizing the gradient-descent-based rule, in contrast with iteration-based adaptive-critic schemes. Lyapunov functions are used to show the stability of the closed-loop tracking error, weights, and observer estimates. Separation and certainty equivalence principles, persistency of excitation condition, and linearity in the unknown parameter assumption are not needed. Experimental results on a spark ignition (SI) engine operating lean at an equivalence ratio of 0.75 show a significant (25%) reduction in cyclic dispersion in heat release with control, while the average fuel input changes by less than 1% compared with the uncontrolled case. Consequently, oxides of nitrogen (NO x ) drop by 30%, and unburned hydrocarbons drop by 16% with control. Overall, NO x 's are reduced by over 80% compared with stoichiometric levels. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
ISSN: | 1083-4419 1941-0492 |
DOI: | 10.1109/TSMCB.2009.2013272 |