Developing a Compressed Object Detection Model based on YOLOv4 for Deployment on Embedded GPU Platform of Autonomous System
Main Authors: | , , , , , |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 01-08-2021 |
Subjects: | |
Summary: | Recent CNN-based object detection models are highly accurate but require a
high-performance GPU to run in real time. They remain too heavy in memory
footprint and too slow for an embedded system with limited memory. Because
object detection for an autonomous system runs on an embedded processor, the
detection network should be compressed as much as possible while preserving
detection accuracy. Several popular lightweight detection models exist, but
their accuracy is too low for safe driving applications. This paper therefore
proposes a new object detection model, referred to as YOffleNet, which is
compressed at a high ratio while minimizing accuracy loss, for real-time and
safe driving applications on an autonomous system. The backbone network
architecture is based on YOLOv4, but the network is compressed substantially by
replacing the computationally heavy CSP DenseNet with the lighter modules of
ShuffleNet. Experiments on the KITTI dataset showed that the proposed YOffleNet
is 4.7 times smaller than YOLOv4-s and runs at up to 46 FPS on an embedded GPU
system (NVIDIA Jetson AGX Xavier). Relative to the high compression ratio, the
accuracy drops only slightly, to 85.8% mAP, which is only 2.6% lower than
YOLOv4-s. The proposed network thus shows high potential for deployment on the
embedded system of an autonomous system for real-time, accurate object
detection. |
---|---|
DOI: | 10.48550/arxiv.2108.00392 |