Partial Sum Quantization for Computing-In-Memory-Based Neural Network Accelerator

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems II: Express Briefs, Vol. 70, No. 8, pp. 3049-3053
Main Authors: Bai, Jinyu; Xue, Wenlu; Fan, Yunqian; Sun, Sifan; Kang, Wang
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01-08-2023
Description
Summary: Computing-in-memory (CIM) has emerged as an ideal hardware platform for improving the performance and efficiency of convolutional neural networks (CNNs). However, owing to the limited size of a memory array, the input and weight matrices of a convolution operation must be split into sub-matrices, which introduces partial sums. High-resolution analog-to-digital converters (ADCs) are generally used to read out these partial sums and maintain computing precision, but at the cost of high area and energy. Partial sum quantization (PSQ), which can be exploited to significantly reduce the required ADC resolution, remains an open question in this field. This brief proposes a novel PSQ approach for CIM using post-training quantization at a newly defined array-wise granularity. Moreover, because the non-linearity of the ADC's transfer function severely degrades accuracy, a gradient estimation method based on smooth approximation is proposed to address this problem. Experiments on various CNNs show that the required ADC resolution can be reduced from 11 bits to as low as 3 bits with a slight accuracy loss (~1.63%), while energy efficiency is improved by up to 224%.
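As a rough illustration of the technique summarized above, the following Python/NumPy sketch (not the authors' implementation; the tile size, bit width, clipping range, and tanh smoothness parameter are illustrative assumptions) shows how a matrix-vector product can be tiled across memory arrays, each tile's partial sum quantized as a low-resolution ADC would, and how a smooth tanh-based surrogate for the staircase ADC transfer function can stand in when gradients must be estimated, in the spirit of the smooth-approximation method described in the summary.

import numpy as np

def quantize_partial_sum(ps, bits=3, clip=8.0):
    # Uniform symmetric quantizer emulating a low-resolution ADC:
    # 2**bits levels over [-clip, clip). Values are illustrative.
    n = 2 ** (bits - 1)
    step = clip / n
    return np.clip(np.round(ps / step), -n, n - 1) * step

def cim_matvec(W, x, array_rows=64, bits=3, clip=8.0):
    # Tile the matrix-vector product into sub-arrays of `array_rows`
    # inputs (one memory array each), quantize each tile's analog
    # partial sum as the ADC would, then accumulate digitally.
    out = np.zeros(W.shape[0])
    for s in range(0, W.shape[1], array_rows):
        ps = W[:, s:s + array_rows] @ x[s:s + array_rows]
        out += quantize_partial_sum(ps, bits, clip)
    return out

def smooth_quantizer(ps, bits=3, clip=8.0, tau=0.1):
    # Differentiable surrogate for the staircase ADC transfer
    # function: round() is replaced by floor() plus a tanh step of
    # width tau, so gradients can be estimated through the quantizer.
    n = 2 ** (bits - 1)
    step = clip / n
    u = np.clip(ps, -clip, clip - step) / step
    frac = u - np.floor(u)
    return (np.floor(u) + 0.5 * (1.0 + np.tanh((frac - 0.5) / tau))) * step

A quick check compares the tiled, partial-sum-quantized product against the full-precision one:

rng = np.random.default_rng(0)
W = rng.normal(size=(128, 256)) * 0.1
x = rng.normal(size=256)
print(np.linalg.norm(W @ x - cim_matvec(W, x)))  # accumulated quantization error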
ISSN: 1549-7747, 1558-3791
DOI: 10.1109/TCSII.2023.3246562