Impact of Phase‐Change Memory Flicker Noise and Weight Drift on Analog Hardware Inference for Large‐Scale Deep Learning Networks

Bibliographic Details
Published in: Advanced Intelligent Systems, Vol. 4, no. 5
Main Authors: Han, Jin-Ping; Rasch, Malte J.; Liu, Zuoguang; Solomon, Paul; Brew, Kevin; Cheng, Kangguo; Ok, Injo; Chan, Victor; Longstreet, Michael; Kim, Wanki; Bruce, Robert L.; Cheng, Cheng-Wei; Saulnier, Nicole; Narayanan, Vijay
Format: Journal Article
Language: English
Published: Weinheim: John Wiley & Sons, Inc, 01-05-2022
Wiley
Description
Summary: The analog AI core concept is appealing for deep‐learning (DL) because it combines computation and memory functions in a single device. Yet significant challenges such as noise and weight drift will impact large‐scale analog in‐memory computing. Here, the effects of flicker noise and drift on large DL systems are explored using a new flicker‐noise model with memory, which preserves temporal correlations, together with a flicker‐noise figure of merit (FOM), $A_r$, that quantifies the impact on system performance. Flicker noise is characterized for Ge₂Sb₂Te₅ (GST)‐based phase‐change memory (PCM) cells, revealing a read‐noise asymmetry tied to the shape asymmetry of mushroom cells. This experimental read‐polarity dependence is consistent with Pirovano's trap activation and defect annihilation model in an asymmetric GST cell. The impact of flicker noise and resistance drift of analog PCM synaptic devices on deep‐learning hardware is assessed for six large‐scale deep neural networks (DNNs) used for image classification, finding that the inference top‐1 accuracy degrades with the accumulated device flicker noise and drift as $\propto A_r \times t_{\mathrm{wait}}$ and $\propto t_{\mathrm{wait}}^{-\nu}$, respectively, where $\nu$ is the drift coefficient. These negative impacts could be mitigated with a new hardware‐aware (HWA) (pre)‐training of the DNNs, which is applied before programming to the analog arrays.

Flicker noise with memory is included in simulations of large image classification neural networks. Figures of merit (FOM) are derived and compared with flicker noise and drift measurements on phase‐change memory (PCM) cells. A new noise asymmetry is found, correlated with the cell's structural asymmetry. New hardware‐aware training algorithms are explored to mitigate noise impacts.
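To make the drift scaling in the summary concrete, here is a minimal sketch of how a single PCM conductance might decay over time. It assumes the power-law drift form commonly used for PCM, G(t) = G0 · (t / t0)^(−ν). The summary itself only states that accuracy scales as t_wait^(−ν), so this device-level form, the reading of `t_wait` as time elapsed since programming, the function name `drifted_conductance`, the reference time `t0`, and all numeric values are illustrative assumptions, not taken from the article.

```python
def drifted_conductance(g0, t_wait, nu, t0=1.0):
    """Assumed power-law drift: G(t_wait) = g0 * (t_wait / t0) ** (-nu).

    g0     -- conductance measured at reference time t0 (hypothetical units)
    t_wait -- time since programming, same units as t0 (assumed meaning)
    nu     -- drift coefficient (dimensionless)
    """
    if t0 <= 0 or t_wait < t0:
        raise ValueError("requires 0 < t0 <= t_wait")
    return g0 * (t_wait / t0) ** (-nu)


if __name__ == "__main__":
    g0 = 10.0   # hypothetical initial conductance
    nu = 0.05   # hypothetical drift coefficient
    # One second, one hour, one day, and thirty days after programming.
    for t_wait in (1.0, 3600.0, 86400.0, 30 * 86400.0):
        g = drifted_conductance(g0, t_wait, nu)
        print(f"t_wait = {t_wait:>10.0f} s  ->  G = {g:.3f}")
```

Under this assumed form, a larger ν or a longer wait both shrink the stored conductance, which is one way to picture why accumulated drift would degrade inference accuracy over time.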
ISSN: 2640-4567
DOI: 10.1002/aisy.202100179