Model-Distributed DNN Training for Memory-Constrained Edge Computing Devices
| Published in: | 2021 IEEE International Symposium on Local and Metropolitan Area Networks (LANMAN), pp. 1-6 |
|---|---|
| Main Authors: | , , , |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 12-07-2021 |
| Summary: | We consider a model-distributed learning framework in which the layers of a deep learning model are distributed across multiple workers. To achieve consistent gradient updates during the training phase, model-distributed learning requires storing multiple versions of the layer parameters at every worker. In this paper, we design mcPipe to reduce the memory cost of model-distributed learning, which is crucial for memory-constrained edge computing devices. mcPipe uses an on-demand weight-updating policy, which reduces the number of weight versions that must be stored at each worker. We analyze the memory cost of mcPipe and demonstrate its superior performance compared to existing model-distributed learning mechanisms. We implement mcPipe in a real testbed and show that it reduces memory cost without hurting the convergence rate or computation cost. |
| ISSN: | 1944-0375 |
| DOI: | 10.1109/LANMAN52105.2021.9478829 |
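
The memory cost described in the summary comes from how many weight versions a pipeline stage must keep alive so that each backward pass sees the weights its forward pass used. The toy Python model below is a purely illustrative sketch of that bookkeeping, not the paper's mcPipe algorithm: it counts peak weight versions at a single stage under two hypothetical update schedules, and the function name `peak_weight_versions` and all its parameters are assumptions made for illustration.

```python
"""Toy model of weight-version memory at one pipeline stage.

Pipeline-parallel (model-distributed) training keeps a weight version alive
until every in-flight micro-batch that read it has finished its backward pass.
This script only counts distinct weight versions; it performs no real training,
and the update schedules below are illustrative, not the mcPipe policy.
"""

def peak_weight_versions(num_microbatches, in_flight, update_every):
    """Peak number of weight versions the stage must hold at once.

    Versions are modeled as integers. A forward pass records the live version;
    once the pipeline is full, the oldest in-flight micro-batch runs backward,
    and an optimizer update (creating a new version) is applied after every
    `update_every`-th backward pass.
    """
    live = 0          # current (live) weight version
    in_use = []       # version seen by each in-flight micro-batch, oldest first
    done = 0          # completed backward passes
    peak = 0
    for _ in range(num_microbatches):
        in_use.append(live)             # forward pass reads the live weights
        if len(in_use) > in_flight:     # pipeline is full: drain the oldest
            in_use.pop(0)               # its stashed version may now be freed
            done += 1
            if done % update_every == 0:
                live += 1               # optimizer step creates a new version
        peak = max(peak, len(set(in_use) | {live}))
    return peak


if __name__ == "__main__":
    in_flight = 4  # micro-batches simultaneously in flight at this stage
    eager = peak_weight_versions(64, in_flight, update_every=1)
    lazy = peak_weight_versions(64, in_flight, update_every=in_flight)
    # eager updating holds roughly one version per in-flight micro-batch
    print(f"update after every backward pass     : {eager} weight versions")
    print(f"update once per {in_flight} backward passes     : {lazy} weight versions")
```

Under these toy assumptions, updating after every backward pass forces the stage to hold roughly one weight version per in-flight micro-batch, while deferring updates lets in-flight micro-batches share versions; the paper's actual on-demand policy and its memory analysis are given in the full text.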