EfficientFace: an efficient deep network with feature enhancement for accurate face detection

Bibliographic Details
Published in:	Multimedia systems Vol. 29; no. 5; pp. 2825 - 2839
Main Authors:	Wang, Guangtao, Li, Jun, Wu, Zhijian, Xu, Jianhua, Shen, Jifeng, Yang, Wankou
Format:	Journal Article
Language:	English
Published:	Berlin/Heidelberg: Springer Berlin Heidelberg; Springer Nature B.V., 01-10-2023
Subjects:	Accuracy; Artificial neural networks; Aspect ratio; Computer Communication Networks; Computer Graphics; Computer Science; Cryptology; Data Storage Representation; Detectors; Face recognition; Lightweight; Modules; Multimedia Information Systems; Occlusion; Operating Systems; Regular Paper; Task complexity; Feature enhancement; Attention mechanism; Face detection; Cross-scale feature fusion; Receptive Field Enhancement

Description
Summary:	In recent years, deep convolutional neural networks (CNNs) have significantly advanced face detection. In particular, lightweight CNN-based architectures have achieved great success because their low-complexity structure facilitates real-time detection. However, current lightweight CNN-based face detectors trade accuracy for efficiency and cope poorly with insufficient feature representation, faces with unbalanced aspect ratios, and occlusion. Consequently, their performance lags far behind that of deep, heavyweight detectors. To achieve efficient face detection without sacrificing accuracy, we design an efficient deep face detector, termed EfficientFace, which contains three modules for feature enhancement. First, we design a novel cross-scale feature fusion strategy to facilitate bottom-up information propagation, which strengthens the fusion of low-level and high-level features. This helps in estimating face locations and enhances the descriptive power of face features. Second, we introduce a Receptive Field Enhancement module to handle faces with various aspect ratios. Third, we add an Attention Mechanism module to improve the representational capability for occluded faces. We evaluate EfficientFace on four public benchmarks, and the experimental results demonstrate the appealing performance of our method. In particular, our model achieves 95.1% (Easy), 94.0% (Medium) and 90.1% (Hard) on the validation set of the WIDER Face dataset, which is competitive with heavyweight models at only 1/15 of the computational cost of the state-of-the-art MogFace detector.
ISSN:	0942-4962 1432-1882
DOI:	10.1007/s00530-023-01134-6