Reinforcement-Learning-Based Output-Feedback Control of Nonstrict Nonlinear Discrete-Time Systems With Application to Engine Emission Control

Bibliographic Details
Published in:	IEEE transactions on systems, man and cybernetics. Part B, Cybernetics Vol. 39; no. 5; pp. 1162 - 1179
Main Authors:	Shih, P., Kaul, B.C., Jagannathan, S., Drallmeier, J.A.
Format:	Journal Article
Language:	English
Published:	United States IEEE 01-10-2009
Subjects:	Adaptive control; Adaptive critic; Adaptive systems; Algorithms; Artificial Intelligence; Biomimetics - methods; Computer Simulation; Control systems; Cybernetics; discrete-time system; Electric Power Supplies; engine emission control; Engines; Estimates; Feedback; Models, Theoretical; Neural networks; Neurofeedback; Nonlinear control systems; Nonlinear Dynamics; Nonlinearity; nonstrict nonlinear output feedback; Observers; Output feedback; Pattern Recognition, Automated - methods; Programmable control; Reinforcement (Psychology); reinforcement learning control; Signal Processing, Computer-Assisted; Utilities; Vehicle Emissions - analysis; Vehicle Emissions - prevention & control

Description
Summary:	A novel reinforcement-learning-based output adaptive neural network (NN) controller, which is also referred to as the adaptive-critic NN controller, is developed to deliver the desired tracking performance for a class of nonlinear discrete-time systems expressed in nonstrict feedback form in the presence of bounded and unknown disturbances. The adaptive-critic NN controller consists of an observer, a critic, and two action NNs. The observer estimates the states and output, and the two action NNs provide virtual and actual control inputs to the nonlinear discrete-time system. The critic approximates a certain strategic utility function, and the action NNs minimize the strategic utility function and control inputs. All NN weights adapt online toward minimization of a performance index, utilizing the gradient-descent-based rule, in contrast with iteration-based adaptive-critic schemes. Lyapunov functions are used to show the stability of the closed-loop tracking error, weights, and observer estimates. Separation and certainty equivalence principles, persistency of excitation condition, and linearity in the unknown parameter assumption are not needed. Experimental results on a spark ignition (SI) engine operating lean at an equivalence ratio of 0.75 show a significant (25%) reduction in cyclic dispersion in heat release with control, while the average fuel input changes by less than 1% compared with the uncontrolled case. Consequently, oxides of nitrogen (NOx) drop by 30%, and unburned hydrocarbons drop by 16% with control. Overall, NOx emissions are reduced by over 80% compared with stoichiometric levels.
ISSN:	1083-4419 1941-0492
DOI:	10.1109/TSMCB.2009.2013272