Sparsification of Deep Neural Networks via Ternary Quantization

Bibliographic Details
Published in: 2024 IEEE 34th International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1-6
Main Authors: Dordoni, Luca, Migliorati, Andrea, Fracastoro, Giulia, Fosson, Sophie, Bianchi, Tiziano, Magli, Enrico
Format: Conference Proceeding
Language: English
Published: IEEE, 22-09-2024
Description
Summary: In recent years, the demand for compact deep neural networks (DNNs) has increased consistently, driven by the necessity to deploy them in environments with limited resources such as mobile or embedded devices. Our work aims to tackle this challenge by proposing a combination of two techniques: sparsification and ternarization of network parameters. We extend plain binarization by introducing a sparsification interval centered around 0. The network parameters falling in this interval are set to 0 and effectively removed from the network topology. Despite the increased complexity required by the ternarization scheme compared to a binary quantizer, we obtain remarkable sparsity rates that yield parameter distributions that are significantly compressible, with entropy lower than 1 bit/symbol.
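The sketch below is a minimal, hypothetical illustration of ternary quantization with a sparsification interval, as described in the summary; it is not the authors' exact scheme. The threshold delta and the choice of mapping surviving weights to +/- alpha (here taken as the mean magnitude of the kept weights) are assumptions made only for illustration. The entropy helper shows why a highly sparse ternary source can have entropy below 1 bit/symbol.

import numpy as np

def ternarize(weights, delta):
    # Sparsification interval [-delta, +delta] centered around 0:
    # weights inside the interval are zeroed out (pruned from the topology),
    # the rest are mapped to +/- alpha. The scaling alpha used here (mean
    # magnitude of the surviving weights) is an assumption, not taken from
    # the record.
    mask = np.abs(weights) > delta
    alpha = np.abs(weights[mask]).mean() if mask.any() else 0.0
    return np.where(mask, alpha * np.sign(weights), 0.0)

def empirical_entropy(ternary_weights):
    # Entropy in bits/symbol of the resulting {-alpha, 0, +alpha} source.
    _, counts = np.unique(np.sign(ternary_weights), return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Toy usage: a strongly sparsified ternary distribution drops below 1 bit/symbol.
w = np.random.randn(10000) * 0.05
q = ternarize(w, delta=0.08)
print(f"sparsity: {(q == 0).mean():.2f}, entropy: {empirical_entropy(q):.3f} bits/symbol")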
ISSN:2161-0371
DOI:10.1109/MLSP58920.2024.10734714