Invited: Accelerator design for deep learning training
Published in: 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC), pp. 1-2
Main Authors: , , , , , , , ,
Format: Conference Proceeding
Language: English
Published: IEEE, 01-06-2017
Summary: Deep Neural Networks (DNNs) have emerged as a powerful and versatile set of techniques, showing success on challenging artificial intelligence (AI) problems. Applications in domains such as image/video processing, autonomous cars, natural language processing, speech synthesis and recognition, genomics, and many others have embraced deep learning as their foundation. DNNs achieve superior accuracy for these applications at high computational complexity, using very large models that require hundreds of megabytes of data storage, exaops of computation, and high bandwidth for data movement. In spite of these impressive advances, it still takes days to weeks to train state-of-the-art deep networks on large datasets, which directly limits the pace of innovation and adoption. In this paper, we present a multi-pronged approach to address the challenges in meeting both the throughput and the energy-efficiency goals for DNN training.
DOI: 10.1145/3061639.3072944