A Review of Deep Learning-Based Visual Multi-Object Tracking Algorithms for Autonomous Driving

Bibliographic Details
Published in:	Applied sciences Vol. 12; no. 21; p. 10741
Main Authors:	Guo, Shuman; Wang, Shichang; Yang, Zhenzhong; Wang, Lijun; Zhang, Huawei; Guo, Pengyan; Gao, Yuguo; Guo, Junkai
Format:	Journal Article
Language:	English
Published:	Basel MDPI AG 01-11-2022
Subjects:	Algorithms; autonomous driving; Cameras; Computer vision; Deep learning; Systematic review; transformer; Visual discrimination learning; Visual fields; visual multi-object tracking

Description
Summary:	Multi-target tracking, a high-level task in computer vision, is crucial to understanding the surroundings of an autonomous vehicle. Driven by the strong performance of deep learning in visual object tracking, many high-performing multi-object tracking algorithms have emerged in recent years. Several reviews address individual sub-problems, but none covers the challenges, datasets, and algorithms associated with visual multi-object tracking in autonomous driving scenarios. In this paper, we present a comprehensive study of visual multi-object tracking algorithms from the last ten years, based on a systematic review approach. The algorithms are divided into three groups according to their structure: tracking by detection (TBD), joint detection and tracking (JDT), and Transformer-based tracking. The review finds that TBD algorithms have a straightforward structure, but the connection between their individual sub-modules is not very strong. JDT methods track multiple objects by combining multi-module joint learning with a deep network framework. Transformer-based algorithms have been explored over the past two years; they perform well on numerous evaluation metrics and have considerable research potential in multi-object tracking. This paper provides theoretical support for algorithmic research in adjacent disciplines. Additionally, the approaches we discuss, which use only monocular cameras rather than sophisticated sensor fusion, are expected to pave the way for the rapid development of safe and affordable autonomous driving systems.
ISSN:	2076-3417
DOI:	10.3390/app122110741