Stripes: Bit-serial deep neural network computing



Bibliographic Details
Published in:	2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 1-12
Main Authors:	Judd, Patrick; Albericio, Jorge; Hetherington, Tayler; Aamodt, Tor M.; Moshovos, Andreas
Format:	Conference Proceeding
Language:	English
Published:	IEEE 01-10-2016
Subjects:	Bandwidth; Computers; Neural networks; Neurons; Parallel processing; Performance evaluation; Three-dimensional displays

Description
Summary:	Motivated by the variance in the numerical precision requirements of Deep Neural Networks (DNNs) [1], [2], this work presents Stripes (STR), a hardware accelerator whose execution time scales almost proportionally with the length of the numerical representation used. STR relies on bit-serial compute units and on the parallelism naturally present within DNNs to improve performance and energy efficiency with no loss in accuracy. In addition, STR provides a new degree of adaptivity, enabling on-the-fly trade-offs among accuracy, performance, and energy. Experimental measurements over a set of DNNs for image classification show that STR improves performance over a state-of-the-art accelerator [3] by 1.30x to 4.51x, and by 1.92x on average, with no accuracy loss. STR is 57% more energy efficient than the baseline at a cost of 32% additional area. Additionally, by enabling configurable per-layer and per-bit precision control, STR allows the user to trade accuracy for further speedup and energy efficiency.
DOI:	10.1109/MICRO.2016.7783722
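
Illustrative note: the summary's central claim is that execution time scales with the length of the numerical representation. The minimal Python sketch below shows why a bit-serial dot product has that property: it consumes one activation bit per step, so the step count equals the chosen precision. This is a software analogy under stated assumptions, not the hardware design from the paper: the function name `bit_serial_dot`, the unsigned-activation restriction, and the least-significant-bit-first order are all hypothetical choices made for illustration.

```python
def bit_serial_dot(activations, weights, precision):
    """Dot product computed one activation bit per step (illustrative only).

    Assumptions (not taken from the paper): activations are unsigned
    integers that fit in `precision` bits, weights are ordinary integers,
    and bits are processed least-significant first.
    Returns (result, steps), where steps == precision.
    """
    for a in activations:
        if not 0 <= a < (1 << precision):
            raise ValueError(f"activation {a} does not fit in {precision} bits")

    total = 0
    steps = 0
    for bit in range(precision):
        # Add up the weights whose activation has this bit set.
        partial = sum(w for a, w in zip(activations, weights) if (a >> bit) & 1)
        # Scale the partial sum by the bit's positional weight.
        total += partial << bit
        steps += 1
    return total, steps


if __name__ == "__main__":
    acts, wts = [3, 5, 2], [2, -1, 4]
    print(bit_serial_dot(acts, wts, precision=3))
    print(bit_serial_dot(acts, wts, precision=8))
```

Both calls return the same dot product, but the second takes more steps because it processes bits the data does not need. In this sketch, lowering the precision to what the values actually require shortens the loop proportionally, which mirrors the precision-versus-time trade-off the summary describes.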