Discriminative Layer Pruning for Convolutional Neural Networks

Bibliographic Details
Published in:	IEEE Journal of Selected Topics in Signal Processing, Vol. 14, no. 4, pp. 828-837
Main Authors:	Jordao, Artur; Lie, Maiko; Schwartz, William Robson
Format:	Journal Article
Language:	English
Published:	New York: IEEE, 01-05-2020; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Artificial neural networks; Computational efficiency; Computer architecture; Constraints; Convolutional neural networks; Efficiency; Electronic devices; Estimation; Floating point arithmetic; Forecasting; Internet of Things; Network compression; network pruning; Neural networks; Parameters; Pruning; Visualization

Description
Summary:	The predictive ability of convolutional neural networks (CNNs) can be improved by increasing their depth. However, increasing depth also increases computational cost significantly, in terms of both floating point operations and memory consumption, hindering applicability on resource-constrained systems such as mobile and Internet of Things (IoT) devices. Fortunately, most networks have spare capacity, that is, they require fewer parameters than they actually have to perform accurately. This motivates network compression methods, which remove or quantize parameters to improve resource-efficiency. In this work, we consider a straightforward strategy for removing entire convolutional layers to reduce network depth. Since it focuses on depth, this approach not only reduces memory usage, but also reduces prediction time significantly by mitigating the serialization overhead incurred by forwarding through consecutive layers. We show that a simple subspace projection approach can be employed to estimate the importance of network layers, enabling the pruning of CNNs to a resource-efficient depth within a given network size constraint. We estimate importance on a subspace computed using Partial Least Squares, a feature projection approach that preserves discriminative information. Consequently, this importance estimation is correlated with the contribution of the layer to the classification ability of the model. We show that cascading discriminative layer pruning with filter-oriented pruning improves the resource-efficiency of the resulting network compared to using either of them alone, and that it outperforms state-of-the-art methods. Moreover, we show that discriminative layer pruning alone, without cascading, achieves competitive resource-efficiency compared to methods that prune filters from all layers.
ISSN:	1932-4553 1941-0484
DOI:	10.1109/JSTSP.2020.2975987
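
Illustrative sketch:	The summary describes scoring layers on a Partial Least Squares subspace and pruning the least important ones. The toy example below is one way such a score might look; it is not the authors' procedure. It assumes a binary label, a single PLS component, and a score defined as the mean squared PLS weight of a layer's features. The random matrices stand in for pooled per-layer activations, and the function names `pls1_weights`, `layer_importance`, and `layers_to_prune` are hypothetical.

```python
import numpy as np


def pls1_weights(X, y):
    """First PLS weight vector: unit direction of maximal covariance with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc
    norm = np.linalg.norm(w)
    return w / norm if norm > 0 else w


def layer_importance(features_per_layer, y):
    """Score each layer by the mean squared PLS weight of its features.

    Assumption: features_per_layer[i] is an (n_samples, d_i) matrix of
    pooled activations taken from layer i for the same samples.
    """
    X = np.hstack(features_per_layer)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant features unscaled
    X = (X - X.mean(axis=0)) / std
    w = pls1_weights(X, y.astype(float))

    scores, start = [], 0
    for feats in features_per_layer:
        stop = start + feats.shape[1]
        scores.append(float(np.mean(w[start:stop] ** 2)))
        start = stop
    return scores


def layers_to_prune(scores, n_prune):
    """Indices of the n_prune lowest-scoring layers, in ascending order."""
    order = np.argsort(scores)
    return sorted(int(i) for i in order[:n_prune])


# Toy data: only the middle "layer" carries label information.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
informative = y[:, None] + 0.1 * rng.normal(size=(200, 4))
noise_a = rng.normal(size=(200, 4))
noise_b = rng.normal(size=(200, 4))

scores = layer_importance([noise_a, informative, noise_b], y)
pruned = layers_to_prune(scores, 2)
print(scores, pruned)
```

A multi-class setting would need a class-indicator matrix and possibly several PLS components, and a real pipeline would fine-tune the network after removing layers; both are outside this sketch.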