Decoupling GPGPU voltage-frequency scaling for deep-learning applications



Bibliographic Details
Published in: Journal of Parallel and Distributed Computing, Vol. 165, pp. 32-51
Main Authors: Mendes, Francisco; Tomás, Pedro; Roma, Nuno
Format: Journal Article
Language: English
Published: Elsevier Inc., 01-07-2022
Description
Summary:
•GPUs may be safely undervolted, allowing for non-conventional DVFS configurations.
•A benchmark suite characterizes GPU components regarding their undervolting limits.
•The ALU and the DRAM-cache controller are the components most sensitive to voltage drops.
•High energy-efficiency gains are obtained with decoupled frequency-voltage pairs.
•The accuracy of Deep Neural Network (DNN) applications is not compromised.

The use of GPUs to accelerate DNN training and inference is already widely adopted, allowing for a significant performance increase. However, this performance usually comes at the cost of a corresponding increase in energy consumption. While several solutions have been proposed to perform voltage-frequency scaling on GPUs, they remain one-dimensional: they adjust only the frequency while relying on the default voltage settings. To overcome this limitation, this paper introduces a new methodology to fully characterize the impact of non-conventional DVFS on GPUs. The proposed approach was evaluated on two devices, an AMD Vega 10 Frontier Edition and an AMD Radeon 5700XT. When applying this non-conventional DVFS scheme to DNN training, the obtained results show that it is possible to safely decrease the GPU voltage, allowing for a significant reduction of the energy consumption (up to 38%) and of the energy-delay product (EDP, up to 41%) of the training procedure of CNN models, with no degradation of the networks' accuracy.
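For context on what a "decoupled frequency-voltage pair" looks like in practice, the sketch below uses the Linux amdgpu driver's pp_od_clk_voltage sysfs interface, which is the standard way to request custom (frequency, voltage) points on AMD GPUs such as the Vega 10 and Radeon 5700XT evaluated in the paper. This is not the authors' tool, and the accepted command grammar differs between GPU generations; the card path, state index, and clock/voltage values are illustrative placeholders only.

```python
"""Minimal sketch of non-conventional DVFS (undervolting at a fixed
frequency) on an AMD GPU via the amdgpu sysfs overdrive interface.

Assumptions: Linux with the amdgpu driver, overdrive enabled via the
amdgpu.ppfeaturemask kernel parameter, root privileges, and a
Vega-style 's <state> <MHz> <mV>' table format. All numeric values
below are placeholders, not figures from the paper.
"""
from pathlib import Path

# Hypothetical device path; the actual card index is system-dependent.
OD_TABLE = Path("/sys/class/drm/card0/device/pp_od_clk_voltage")

def set_sclk_point(state: int, mhz: int, mv: int) -> None:
    """Stage a custom (frequency, voltage) pair for one sclk state.

    Out-of-range values are rejected by the driver, which is what makes
    it possible to probe the board's allowed undervolting envelope.
    """
    OD_TABLE.write_text(f"s {state} {mhz} {mv}\n")

def commit() -> None:
    """Apply the staged table ('c' commits; 'r' resets to defaults)."""
    OD_TABLE.write_text("c\n")

if __name__ == "__main__":
    # Decoupled pair: keep the top state's default frequency but request
    # a voltage below the default V-f curve (placeholder values).
    set_sclk_point(state=7, mhz=1600, mv=1000)
    commit()
```

A characterization flow in the paper's spirit would sweep the voltage of such a point downward while running component-stressing benchmarks, recording the lowest voltage at which each GPU component still computes correctly.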
ISSN: 0743-7315
eISSN: 1096-0848
DOI: 10.1016/j.jpdc.2022.03.004