Convolutional Networks with Dense Connectivity
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 44, No. 12, pp. 8704-8716
Main Authors:
Format: Journal Article
Language: English
Published: United States: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01-12-2022
Summary: Recent work has shown that convolutional networks can be substantially deeper, more accurate, and more efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with $L$ layers have $L$ connections, one between each layer and its subsequent layer, our network has $\frac{L(L+1)}{2}$ direct connections. For each layer, the feature maps of all preceding layers are used as inputs, and its own feature maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, encourage feature reuse, and substantially improve parameter efficiency. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state of the art on most of them, while requiring fewer parameters and less computation to achieve high performance.
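The connectivity pattern described in the summary can be made concrete with a short sketch. The block below is a minimal, hypothetical PyTorch dense block, not the authors' released code: the BN-ReLU-Conv layer composition and the `growth_rate` parameter are assumptions for illustration. Each layer takes the channel-wise concatenation of all preceding feature maps as input, so a block with $L$ layers has $\frac{L(L+1)}{2}$ direct connections (e.g., 10 connections for $L = 4$).

```python
# Minimal sketch of DenseNet-style dense connectivity (assumed PyTorch layers;
# BN-ReLU-Conv composition and growth_rate are illustrative, not from this record).
import torch
import torch.nn as nn


class DenseLayer(nn.Module):
    """One layer: BN -> ReLU -> 3x3 Conv, producing `growth_rate` new feature maps."""

    def __init__(self, in_channels, growth_rate):
        super().__init__()
        self.norm = nn.BatchNorm2d(in_channels)
        self.relu = nn.ReLU(inplace=True)
        self.conv = nn.Conv2d(in_channels, growth_rate,
                              kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        return self.conv(self.relu(self.norm(x)))


class DenseBlock(nn.Module):
    """Each layer receives the concatenation of all preceding feature maps."""

    def __init__(self, num_layers, in_channels, growth_rate):
        super().__init__()
        # Layer i sees the input plus the outputs of layers 0..i-1.
        self.layers = nn.ModuleList(
            DenseLayer(in_channels + i * growth_rate, growth_rate)
            for i in range(num_layers)
        )

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            # Concatenate every preceding feature map along the channel axis,
            # then append this layer's new feature maps to the running list.
            out = layer(torch.cat(features, dim=1))
            features.append(out)
        return torch.cat(features, dim=1)


# Usage: a 4-layer block on a 16-channel input with growth rate 12
# yields 16 + 4 * 12 = 64 output channels.
block = DenseBlock(num_layers=4, in_channels=16, growth_rate=12)
y = block(torch.randn(1, 16, 32, 32))
print(y.shape)  # torch.Size([1, 64, 32, 32])
```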
ISSN: 0162-8828, 1939-3539, 2160-9292
DOI: 10.1109/TPAMI.2019.2918284