Convolutional Networks with Dense Connectivity
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 44, No. 12, pp. 8704-8716
Main Authors:
Format: Journal Article
Language: English
Published: United States: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01-12-2022
Summary: Recent work has shown that convolutional networks can be substantially deeper, more accurate, and more efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with $L$ layers have $L$ connections, one between each layer and its subsequent layer, our network has $\frac{L(L+1)}{2}$ direct connections. For each layer, the feature maps of all preceding layers are used as inputs, and its own feature maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, encourage feature reuse, and substantially improve parameter efficiency. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state of the art on most of them, while requiring fewer parameters and less computation to achieve high performance.
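The connectivity pattern described in the summary can be made concrete with a short sketch. The block below is a minimal, hypothetical PyTorch dense block, not the authors' released code: the BN-ReLU-Conv layer composition and the `growth_rate` parameter are assumptions for illustration. Each layer takes the channel-wise concatenation of all preceding feature maps as input, so a block with $L$ layers has $\frac{L(L+1)}{2}$ direct connections (e.g., 10 connections for $L = 4$).

```python
# Minimal sketch of DenseNet-style dense connectivity (assumed PyTorch layers;
# BN-ReLU-Conv composition and growth_rate are illustrative, not from this record).
import torch
import torch.nn as nn


class DenseLayer(nn.Module):
    """One layer: BN -> ReLU -> 3x3 Conv, producing `growth_rate` new feature maps."""

    def __init__(self, in_channels, growth_rate):
        super().__init__()
        self.norm = nn.BatchNorm2d(in_channels)
        self.relu = nn.ReLU(inplace=True)
        self.conv = nn.Conv2d(in_channels, growth_rate,
                              kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        return self.conv(self.relu(self.norm(x)))


class DenseBlock(nn.Module):
    """Each layer receives the concatenation of all preceding feature maps."""

    def __init__(self, num_layers, in_channels, growth_rate):
        super().__init__()
        # Layer i sees the input plus the outputs of layers 0..i-1.
        self.layers = nn.ModuleList(
            DenseLayer(in_channels + i * growth_rate, growth_rate)
            for i in range(num_layers)
        )

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            # Concatenate every preceding feature map along the channel axis,
            # then append this layer's new feature maps to the running list.
            out = layer(torch.cat(features, dim=1))
            features.append(out)
        return torch.cat(features, dim=1)


# Usage: a 4-layer block on a 16-channel input with growth rate 12
# yields 16 + 4 * 12 = 64 output channels.
block = DenseBlock(num_layers=4, in_channels=16, growth_rate=12)
y = block(torch.randn(1, 16, 32, 32))
print(y.shape)  # torch.Size([1, 64, 32, 32])
```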
ISSN: 0162-8828, 1939-3539, 2160-9292
DOI: 10.1109/TPAMI.2019.2918284