Delayed reinforcement learning converges to intermittent control for human quiet stance

Bibliographic Details
Published in: Medical engineering & physics Vol. 130; p. 104197
Main Authors: Zhao, Yongkun, Hodossy, Balint K., Jing, Shibo, Todoh, Masahiro, Farina, Dario
Format: Journal Article
Language: English
Published: England, Elsevier Ltd, 01-08-2024
Description
Summary:
• Explored neural control of human quiet stance using reinforcement learning.
• Found intermittent, not continuous, neural control in optimal strategies.
• Delayed neural signals influenced feedback gains and system dynamics.
• Model emulates brain adaptability and energy efficiency during standing.
• Insights offer fresh perspective on neural feedback mechanisms of stance.

The neural control of human quiet stance remains controversial, with classic views suggesting a limited role of the brain and recent findings conversely indicating direct cortical control of muscles during upright posture. Conceptual neural feedback control models have been proposed and tested against experimental evidence. The most renowned model is the continuous impedance control model. However, when time delays are included in this model to simulate neural transmission, the continuous controller becomes unstable. Another model, the intermittent control model, assumes that the central nervous system (CNS) activates muscles intermittently, and not continuously, to counteract gravitational torque. In this study, a delayed reinforcement learning algorithm was developed to seek an optimal control policy to balance a one-segment inverted pendulum model representing the human body. In this approach, no a priori strategy was imposed on the controller; rather, the optimal strategy emerged from the reward-based learning. The simulation results indicated that the optimal neural controller exhibits intermittent, and not continuous, characteristics, in agreement with the possibility that the CNS intermittently provides neural feedback torque to maintain an upright posture.
ISSN: 1350-4533
1873-4030
DOI: 10.1016/j.medengphy.2024.104197