Unsupervised image saliency detection with Gestalt-laws guided optimization and visual attention based refinement

Yan, Yijun and Ren, Jinchang and Sun, Genyun and Zhao, Huimin and Han, Junwei and Li, Xuelong and Marshall, Stephen and Zhan, Jin (2018) Unsupervised image saliency detection with Gestalt-laws guided optimization and visual attention based refinement. Pattern Recognition, 79. pp. 65-78. ISSN 0031-3203 (https://doi.org/10.1016/j.patcog.2018.02.004)

Text. Filename: Yan_etal_PR_2018_Unsupervised_image_saliency_detection_with_Gestalt_laws_guided.pdf
Accepted Author Manuscript
License: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0


Abstract

Visual attention is a fundamental cognitive capability that allows human beings to focus on regions of interest (ROIs) in complex natural environments. Which ROIs we attend to depends mainly on two distinct types of attentional mechanisms. The bottom-up mechanism guides our detection of salient objects and regions through externally driven factors, i.e. color and location, whilst the top-down mechanism biases our attention based on prior knowledge and cognitive strategies provided by the visual cortex. However, how to practically use and fuse both attentional mechanisms for salient object detection has not been sufficiently explored. To this end, we propose in this paper an integrated framework consisting of bottom-up and top-down attention mechanisms that enables attention to be computed at the level of salient objects and/or regions. Within our framework, the model of the bottom-up mechanism is guided by the Gestalt laws of perception. We interpret the Gestalt laws of homogeneity, similarity, proximity, and figure and ground in connection with color and spatial contrast at the level of regions and objects to produce a feature contrast map. The model of the top-down mechanism uses a formal computational model to describe the background connectivity of attention and produce a priority map. Integrating both mechanisms and applying them to salient object detection, we demonstrate that the proposed method consistently outperforms a number of existing unsupervised approaches on five challenging and complicated datasets, achieving higher precision and recall rates, AP (average precision) and AUC (area under curve) values.
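
The abstract reports precision, recall and AP but does not say how they are computed. The following is a minimal sketch of one common way to score a saliency map against a binary ground-truth mask; it is not the paper's evaluation protocol. The function names `precision_recall_curve` and `average_precision`, the 256-level threshold sweep, and the trapezoidal integration anchored at (recall 0, precision 1) are all illustrative assumptions.

```python
import numpy as np


def precision_recall_curve(saliency, mask, num_thresholds=256):
    """Binarize a saliency map at evenly spaced thresholds and score each
    binarization against a ground-truth mask (hypothetical helper)."""
    saliency = np.asarray(saliency, dtype=float)
    mask = np.asarray(mask, dtype=bool)

    # Rescale the map to [0, 1] so the thresholds are comparable across images.
    lo, hi = saliency.min(), saliency.max()
    if hi > lo:
        saliency = (saliency - lo) / (hi - lo)
    else:
        saliency = np.zeros_like(saliency)

    thresholds = np.linspace(0.0, 1.0, num_thresholds)
    precision = np.empty(num_thresholds)
    recall = np.empty(num_thresholds)
    n_pos = mask.sum()

    for i, t in enumerate(thresholds):
        pred = saliency >= t
        tp = np.logical_and(pred, mask).sum()
        n_pred = pred.sum()
        # Assumed convention: an empty prediction counts as precision 1.
        precision[i] = tp / n_pred if n_pred > 0 else 1.0
        recall[i] = tp / n_pos if n_pos > 0 else 0.0

    return thresholds, precision, recall


def average_precision(precision, recall):
    """Trapezoidal area under the precision-recall curve.

    Expects arrays ordered by ascending threshold, as returned by
    precision_recall_curve, so recall is non-increasing on input.
    """
    # Reverse so recall is non-decreasing, then anchor the curve at
    # (recall=0, precision=1); this anchor is an assumed convention.
    r = np.concatenate(([0.0], recall[::-1]))
    p = np.concatenate(([1.0], precision[::-1]))
    return float(np.sum(np.diff(r) * (p[1:] + p[:-1]) / 2.0))
```

Under these assumptions, a saliency map identical to its ground-truth mask scores an average precision of 1, and per-image scores would typically be averaged over a dataset before methods are compared.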