iNALU: Improved Neural Arithmetic Logic Unit

Bibliographic Details
Published in: Frontiers in Artificial Intelligence, Vol. 3, p. 71
Main Authors: Schlör, Daniel; Ring, Markus; Hotho, Andreas
Format: Journal Article
Language: English
Published: Switzerland: Frontiers Media S.A., 29-09-2020
Description
Summary: Neural networks have to capture mathematical relationships in order to learn various tasks. They approximate these relations implicitly and therefore often do not generalize well. The recently proposed Neural Arithmetic Logic Unit (NALU) is a novel neural architecture that explicitly represents mathematical relationships in the units of the network, allowing it to learn operations such as summation, subtraction, or multiplication. Although NALUs have been shown to perform well on various downstream tasks, an in-depth analysis reveals practical shortcomings by design, such as the inability to multiply or divide negative input values and training stability issues for deeper networks. We address these issues and propose an improved model architecture. We evaluate our model empirically in various settings, from learning basic arithmetic operations to more complex functions. Our experiments indicate that our model solves stability issues and outperforms the original NALU model in terms of arithmetic precision and convergence.
Edited by: Devendra Singh Dhami, The University of Texas at Dallas, United States
Reviewed by: Mayukh Das, Samsung, India; Alejandro Molina, Darmstadt University of Technology, Germany
This article was submitted to Machine Learning and Artificial Intelligence, a section of the journal Frontiers in Artificial Intelligence
ISSN: 2624-8212
DOI: 10.3389/frai.2020.00071
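
Illustration: the summary mentions that the original NALU cannot multiply or divide negative input values. The sketch below shows why, assuming the commonly cited log-space form of the NALU multiplicative path, exp(W · log(|x| + eps)). The function name nalu_multiplicative_path, the hand-set weights, and the eps value are hypothetical choices for illustration only; this is not the authors' code and it does not implement the improved model.

```python
import numpy as np

def nalu_multiplicative_path(x, w, eps=1e-10):
    """Log-space multiplication as used in the original NALU (sketch).

    x: input vector; w: weight vector, ideally with entries in {-1, 0, 1}.
    Taking |x| keeps the logarithm defined for negative inputs,
    but it also discards the sign of every input.
    """
    return float(np.exp(np.dot(w, np.log(np.abs(x) + eps))))

# Hand-set weights that select both inputs, i.e. compute their product.
w_mul = np.array([1.0, 1.0])

positive = nalu_multiplicative_path(np.array([2.0, 3.0]), w_mul)
negative = nalu_multiplicative_path(np.array([-2.0, 3.0]), w_mul)

print(positive)  # close to 6, as expected
print(negative)  # also close to +6, although the true product is -6
```

Both calls return approximately the same positive value because the sign of the inputs never reaches the output, which is the design limitation the summary refers to.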