Stripes: Bit-serial deep neural network computing

Motivated by the variance in the numerical precision requirements of Deep Neural Networks (DNNs) [1], [2], Stripes (STR), a hardware accelerator is presented whose execution time scales almost proportionally with the length of the numerical representation used. STR relies on bit-serial compute units...

Full description

Saved in:
Bibliographic Details
Published in:2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) pp. 1 - 12
Main Authors: Judd, Patrick, Albericio, Jorge, Hetherington, Tayler, Aamodt, Tor M., Moshovos, Andreas
Format: Conference Proceeding
Language:English
Published: IEEE 01-10-2016
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Motivated by the variance in the numerical precision requirements of Deep Neural Networks (DNNs) [1], [2], Stripes (STR), a hardware accelerator is presented whose execution time scales almost proportionally with the length of the numerical representation used. STR relies on bit-serial compute units and on the parallelism that is naturally present within DNNs to improve performance and energy with no accuracy loss. In addition, STR provides a new degree of adaptivity enabling on-the-fly trade-offs among accuracy, performance, and energy. Experimental measurements over a set of DNNs for image classification show that STR improves performance over a state-of-the-art accelerator [3] from 1.30x to 4.51x and by 1.92x on average with no accuracy loss. STR is 57% more energy efficient than the baseline at a cost of 32% additional area. Additionally, by enabling configurable, per-layer and per-bit precision control, STR allows the user to trade accuracy for further speedup and energy efficiency.
DOI:10.1109/MICRO.2016.7783722