Triple loss for hard face detection

Fang, Zhenyu and Ren, Jinchang and Marshall, Stephen and Zhao, Huimin and Wang, Zheng and Huang, Kaizhu and Xiao, Bing (2020) Triple loss for hard face detection. Neurocomputing, 398. pp. 20-30. ISSN 0925-2312

Accepted Author Manuscript
Restricted to Repository staff only until 21 February 2021.
License: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0


    Abstract

    Although face detection has been well addressed in recent decades, effective detection of small, blurred and partially occluded faces in the wild remains a challenging task. Meanwhile, the trade-off between computational cost and accuracy is also an open research problem in this context. To tackle these challenges, this paper proposes a novel context-enhanced approach that combines structural optimization with loss function optimization. For loss function optimization, we introduce a hierarchical loss, referred to as "triple loss" in this paper, to optimize a feature pyramid network (FPN) (Lin et al., 2017) based face detector. Additional layers are applied only during the training process; as a result, the computational cost at inference is the same as that of FPN. For structural optimization, we propose a context-sensitive structure that increases the capacity of the prediction network and thereby improves the accuracy of the output. In detail, a feature fusion module based on a three-branch inception subnet (Szegedy et al., 2015) is employed to refine the original FPN without significantly increasing the computational cost, further improving the low-level semantic information, which is originally extracted from a single convolutional layer in the backward pathway of FPN. The proposed approach is evaluated on two publicly available face detection benchmarks, FDDB and WIDER FACE. Using a VGG-16 based detector, experimental results indicate that the proposed method achieves a good balance between accuracy and computational cost in face detection.
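    The abstract's claim that extra layers add no inference cost can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not say where the three loss terms are attached or how they are weighted, so `ToyDetector`, `aux_heads`, `aux_weights` and the squared-error stand-in for a detection loss are all hypothetical names and choices made for illustration. The sketch only shows the general pattern of supervising auxiliary branches during training and discarding them at prediction time.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# A "head" maps a feature vector to a single face score (toy abstraction).
Head = Callable[[Sequence[float]], float]


def squared_error(pred: float, target: float) -> float:
    """Toy stand-in for a real detection loss (assumption, not from the paper)."""
    return (pred - target) ** 2


@dataclass
class ToyDetector:
    main_head: Head            # the only branch kept at inference time
    aux_heads: List[Head]      # hypothetical training-only branches
    aux_weights: List[float]   # hypothetical weights for the auxiliary terms

    def training_loss(self, features: Sequence[float], target: float) -> float:
        # One term for the main head plus one weighted term per auxiliary
        # branch; with two auxiliary branches this gives three loss terms.
        loss = squared_error(self.main_head(features), target)
        for head, weight in zip(self.aux_heads, self.aux_weights):
            loss += weight * squared_error(head(features), target)
        return loss

    def predict(self, features: Sequence[float]) -> float:
        # Auxiliary branches are never evaluated here, so the inference
        # cost depends only on the main head.
        return self.main_head(features)


detector = ToyDetector(
    main_head=lambda f: sum(f) / len(f),
    aux_heads=[lambda f: max(f), lambda f: min(f)],
    aux_weights=[0.5, 0.5],
)
features = [0.0, 0.5, 1.0]
train_loss = detector.training_loss(features, target=1.0)
score = detector.predict(features)
```

    In this toy setup the auxiliary branches influence only `training_loss`; `predict` would return the same value if `aux_heads` were empty, which is the property the abstract attributes to the training-only layers.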