Deep learning acceleration at the resource-constrained tactical edge

Bibliographic Details
Published in: 2023 IEEE International Conference on Big Data (BigData), pp. 3857-3862
Main Authors: Geerhart, Billy; Dasari, Venkat R.; Wang, Peng; Rapp, Brian
Format: Conference Proceeding
Language: English
Published: IEEE 15-12-2023
Description
Summary: This paper outlines how we modified the torch2trt library to build a recursive framework that can quantize previously unsupported PyTorch models. The framework partitions a PyTorch model into supported and unsupported modules, then rebuilds the model by replacing the supported PyTorch modules with faster TensorRT modules. This allows us to optimize and deploy more advanced deep neural network algorithms that are not natively supported by torch2trt.
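
The partition-and-rebuild idea described in the summary can be pictured with a short Python sketch. This is not the authors' implementation: SUPPORTED_TYPES, is_supported, and rebuild are illustrative names, the set of supported layer types is an assumption, and reusing one example input for every submodule is a simplification of how engine inputs would really be traced.

```python
# Minimal sketch (not the authors' code) of recursively partitioning a PyTorch
# model into torch2trt-convertible and non-convertible parts, then rebuilding it
# with TensorRT engines in place of the convertible submodules.
import torch
import torch.nn as nn
from torch2trt import torch2trt

# Assumed subset of layer types that torch2trt can convert.
SUPPORTED_TYPES = (nn.Conv2d, nn.BatchNorm2d, nn.ReLU, nn.Linear)

def is_supported(module: nn.Module) -> bool:
    """Treat a module as convertible if every leaf under it has an assumed-supported type."""
    children = list(module.children())
    if not children:
        return isinstance(module, SUPPORTED_TYPES)
    return all(is_supported(child) for child in children)

def rebuild(module: nn.Module, example_input: torch.Tensor) -> nn.Module:
    """Recursively replace convertible submodules with TensorRT engines."""
    if is_supported(module):
        # torch2trt returns a TRTModule that behaves like an nn.Module.
        return torch2trt(module.eval().cuda(), [example_input.cuda()])
    for name, child in module.named_children():
        # Recurse into unsupported containers. Reusing one example input for
        # every child is a simplification; real code would trace per-child shapes.
        setattr(module, name, rebuild(child, example_input))
    return module

# Example usage (assumes a CUDA device and torch2trt installed):
# fast_model = rebuild(my_model, torch.randn(1, 3, 224, 224))
```

The sketch only illustrates the control flow; building usable engines also requires representative input shapes for each converted submodule and a GPU at conversion time.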
DOI: 10.1109/BigData59044.2023.10386886