Model-Distributed DNN Training for Memory-Constrained Edge Computing Devices
| Published in: | 2021 IEEE International Symposium on Local and Metropolitan Area Networks (LANMAN), pp. 1-6 |
|---|---|
| Main Authors: | , , , |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 12-07-2021 |
| Summary: | We consider a model-distributed learning framework in which the layers of a deep learning model are distributed across multiple workers. To achieve consistent gradient updates during the training phase, model-distributed learning requires storing multiple versions of the layer parameters at every worker. In this paper, we design mcPipe to reduce the memory cost of model-distributed learning, which is crucial for memory-constrained edge computing devices. mcPipe uses an on-demand weight-updating policy, which reduces the number of weight versions that must be stored at each worker. We analyze the memory cost of mcPipe and demonstrate its superior performance compared to existing model-distributed learning mechanisms. We implement mcPipe in a real testbed and show that it reduces memory cost without hurting the convergence rate or computation cost. |
| ISSN: | 1944-0375 |
| DOI: | 10.1109/LANMAN52105.2021.9478829 |
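
The memory cost described in the summary comes from how many weight versions a pipeline stage must keep alive so that each backward pass sees the weights its forward pass used. The toy Python model below is a purely illustrative sketch of that bookkeeping, not the paper's mcPipe algorithm: it counts peak weight versions at a single stage under two hypothetical update schedules, and the function name `peak_weight_versions` and all its parameters are assumptions made for illustration.

```python
"""Toy model of weight-version memory at one pipeline stage.

Pipeline-parallel (model-distributed) training keeps a weight version alive
until every in-flight micro-batch that read it has finished its backward pass.
This script only counts distinct weight versions; it performs no real training,
and the update schedules below are illustrative, not the mcPipe policy.
"""

def peak_weight_versions(num_microbatches, in_flight, update_every):
    """Peak number of weight versions the stage must hold at once.

    Versions are modeled as integers. A forward pass records the live version;
    once the pipeline is full, the oldest in-flight micro-batch runs backward,
    and an optimizer update (creating a new version) is applied after every
    `update_every`-th backward pass.
    """
    live = 0          # current (live) weight version
    in_use = []       # version seen by each in-flight micro-batch, oldest first
    done = 0          # completed backward passes
    peak = 0
    for _ in range(num_microbatches):
        in_use.append(live)             # forward pass reads the live weights
        if len(in_use) > in_flight:     # pipeline is full: drain the oldest
            in_use.pop(0)               # its stashed version may now be freed
            done += 1
            if done % update_every == 0:
                live += 1               # optimizer step creates a new version
        peak = max(peak, len(set(in_use) | {live}))
    return peak


if __name__ == "__main__":
    in_flight = 4  # micro-batches simultaneously in flight at this stage
    eager = peak_weight_versions(64, in_flight, update_every=1)
    lazy = peak_weight_versions(64, in_flight, update_every=in_flight)
    # eager updating holds roughly one version per in-flight micro-batch
    print(f"update after every backward pass     : {eager} weight versions")
    print(f"update once per {in_flight} backward passes     : {lazy} weight versions")
```

Under these toy assumptions, updating after every backward pass forces the stage to hold roughly one weight version per in-flight micro-batch, while deferring updates lets in-flight micro-batches share versions; the paper's actual on-demand policy and its memory analysis are given in the full text.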