Evaluation of Traditional and Deep Learning Human Detection Techniques Applied to Surveillance: A Performance Comparison at Distinct Object Sizes
Published in: | 2021 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), pp. 1-5 |
---|---|
Main Authors: | , , , , |
Format: | Conference Proceeding |
Language: | English |
Published: | IEEE, 17-08-2021 |
Summary: | Making computers capable of identifying and localizing people in images and videos is a topic that has been attracting the attention of many researchers in recent years. Several applications, including surveillance systems, can benefit from this capacity. There is no study that provides an unbiased comparison between the most representative types of methods, traditional and recent ones, focusing on human detection and specifically within a context of surveillance, where the size of the objects is mostly small relative to the image size. This paper aims to compare the performance of a set of representative human detection techniques applied to surveillance systems in terms of Average Precision (AP) and speed. Two main types of human detection methods are analyzed: the traditional ones, represented by HOG and Haar Cascades, and the deep learning ones, represented by Faster R-CNN, YOLO, and Mobile SSD. The comparison was performed using the VIRAT Ground Dataset, a large-scale real-world surveillance video dataset. When humans were small relative to the overall image size, the Haar Cascades method had an AP of 9.09, a value around six times higher than the other ones. The results indicate that detecting humans when they are far from the camera in videos and images is still a challenge to be overcome, and that, in some aspects, the traditional approaches outperform the recent deep learning ones. |
---|---|
DOI: | 10.1109/ICSPCC52875.2021.9564442 |