Minimal Gated Unit for Recurrent Neural Networks

Bibliographic Details
Published in: International Journal of Automation and Computing, Vol. 13, No. 3, pp. 226-234
Main Authors: Zhou, Guo-Bing, Wu, Jianxin, Zhang, Chen-Lin, Zhou, Zhi-Hua
Format: Journal Article
Language: English
Published: Beijing: Institute of Automation, Chinese Academy of Sciences; Springer Nature B.V., 01-06-2016
Description
Summary: Recurrent neural networks (RNN) have been very successful in handling sequence data. However, understanding RNN and finding the best practices for RNN learning is a difficult task, partly because there are many competing and complex hidden units, such as the long short-term memory (LSTM) and the gated recurrent unit (GRU). We propose a gated unit for RNN, named the minimal gated unit (MGU), since it contains only one gate, which is a minimal design among all gated hidden units. The design of MGU benefits from evaluation results on LSTM and GRU in the literature. Experiments on various sequence data show that MGU achieves accuracy comparable to GRU, but with a simpler structure, fewer parameters, and faster training. Hence, MGU is suitable for RNN applications. Its simple architecture also means that it is easier to evaluate and tune, and in principle it should be easier to study MGU's properties theoretically and empirically.
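To make the single-gate design concrete, here is a minimal NumPy sketch of one MGU step, assuming the standard formulation from the paper: a single forget gate f_t is reused both to gate the previous state inside the candidate (where GRU spends a separate reset gate) and to interpolate between the old state and the candidate. The class and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MGUCell:
    """Single-gate recurrent cell (sketch of the MGU formulation):
    f_t     = sigmoid(W_f x_t + U_f h_{t-1} + b_f)
    h~_t    = tanh(W_h x_t + U_h (f_t * h_{t-1}) + b_h)
    h_t     = (1 - f_t) * h_{t-1} + f_t * h~_t
    """

    def __init__(self, input_size, hidden_size, rng=None):
        rng = rng or np.random.default_rng(0)
        s = 1.0 / np.sqrt(hidden_size)
        # Forget-gate parameters (the only gate in MGU).
        self.W_f = rng.uniform(-s, s, (hidden_size, input_size))
        self.U_f = rng.uniform(-s, s, (hidden_size, hidden_size))
        self.b_f = np.zeros(hidden_size)
        # Candidate-state parameters.
        self.W_h = rng.uniform(-s, s, (hidden_size, input_size))
        self.U_h = rng.uniform(-s, s, (hidden_size, hidden_size))
        self.b_h = np.zeros(hidden_size)

    def step(self, x_t, h_prev):
        # Forget gate computed from the input and previous state.
        f_t = sigmoid(self.W_f @ x_t + self.U_f @ h_prev + self.b_f)
        # Candidate state: f_t also gates h_{t-1}, doing the job a
        # separate reset gate would do in GRU.
        h_tilde = np.tanh(self.W_h @ x_t
                          + self.U_h @ (f_t * h_prev) + self.b_h)
        # Convex combination of the old state and the candidate.
        return (1.0 - f_t) * h_prev + f_t * h_tilde

# Usage: run the cell over a toy sequence of 5 inputs.
cell = MGUCell(input_size=4, hidden_size=8)
h = np.zeros(8)
for x in np.random.default_rng(1).normal(size=(5, 4)):
    h = cell.step(x, h)
print(h.shape)  # (8,)
```

Note the parameter saving this implies: where GRU keeps three weight groups (reset, update, candidate), MGU keeps only two, so the recurrent parameter count drops to roughly two thirds, consistent with the simpler structure and faster training the abstract reports.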
Bibliography: 11-5350/TP
Keywords: Recurrent neural network, minimal gated unit (MGU), gated unit, gated recurrent unit (GRU), long short-term memory (LSTM), deep learning
ISSN: 1476-8186
EISSN: 1751-8520
DOI: 10.1007/s11633-016-1006-2