Impact of Phase‐Change Memory Flicker Noise and Weight Drift on Analog Hardware Inference for Large‐Scale Deep Learning Networks
Published in: | Advanced intelligent systems Vol. 4; no. 5 |
---|---|
Format: | Journal Article |
Language: | English |
Published: | Weinheim; John Wiley & Sons, Inc; 01-05-2022; Wiley |
Summary: | The analog AI core concept is appealing for deep‐learning (DL) because it combines computation and memory functions in a single device. Yet significant challenges, such as noise and weight drift, will impact large‐scale analog in‐memory computing. Here, the effects of flicker noise and drift on large DL systems are explored using a new flicker‐noise model with memory, which preserves temporal correlations, together with a flicker‐noise figure of merit (FOM) $A_r$ that quantifies the impact on system performance. Flicker noise is characterized for Ge$_2$Sb$_2$Te$_5$ (GST)‐based phase‐change memory (PCM) cells, revealing a read‐noise asymmetry tied to the shape asymmetry of mushroom cells. This experimental read‐polarity dependence is consistent with Pirovano's trap activation and defect annihilation model in an asymmetric GST cell. The impact of flicker noise and resistance drift of analog PCM synaptic devices on deep‐learning hardware is assessed for six large‐scale deep neural networks (DNNs) used for image classification. The inference top‐1 accuracy degrades with accumulated device flicker noise and drift as $\propto A_r \times t_{\mathrm{wait}}$ and $\propto t_{\mathrm{wait}}^{-\nu}$, respectively, where $\nu$ is the drift coefficient. These negative impacts could be mitigated with a new hardware‐aware (HWA) (pre)‐training of the DNNs, which is applied before programming to the analog arrays.
Flicker noise with memory is included in simulations of large image classification neural networks. Figures of merit (FOM) are derived and compared with flicker noise and drift measurements on phase‐change memory (PCM) cells. A new noise asymmetry is found, correlated with the cell's structural asymmetry. New hardware‐aware training algorithms are explored to mitigate noise impacts. |
---|---|
ISSN: | 2640-4567 |
DOI: | 10.1002/aisy.202100179 |