A new cognitive temporal-spatial visual attention model for video saliency detection

Yang, Erfu and Tu, Zhengzheng and Luo, Bin and Hussain, Amir (2015) A new cognitive temporal-spatial visual attention model for video saliency detection. In: Sixth China-Scotland SIPRA Workshop on Recent Advances in Signal and Image Processing, 2015-05-31 - 2015-06-01, University of Stirling. (Unpublished)



Human vision has a natural cognitive ability to focus on salient objects or areas when watching static or dynamic scenes. Whilst image saliency has long been a popular research topic, the more challenging area of video saliency has attracted increasing interest recently as autonomous and cognitive vision techniques have continued to develop. In this talk, a new cognitive temporal-spatial visual attention model is presented for video saliency detection. It extends the popular graph-based visual saliency (GBVS) model, which adopts a ‘bottom-up’ visual attention mechanism. The new model produces a salient motion map that can be combined with the static feature maps of the GBVS model. The proposed model is inspired, firstly, by the observation that independent components of optical flow are recognized for motion understanding in the human brain; accordingly, we employ robust independent component analysis (robust ICA) to separate salient foreground optical flow from the relatively static background. A second key feature of the model is that the motion saliency map is calculated from the foreground optical flow vector field and mean shift segmentation. Finally, the salient motion map is normalized and fused with the static maps through a linear combination. Preliminary experiments demonstrate that the spatio-temporal saliency map detected by the new cognitive visual attention model highlights salient foreground moving objects effectively, even in complex outdoor scenes with a dynamic background or bad weather. The proposed model could be further exploited for autonomous robotic applications.
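The final step of the abstract (normalizing the motion map and fusing it with the static maps through a linear combination) can be illustrated with a minimal sketch. The abstract does not state the normalization scheme or the fusion weights, so the min-max scaling, the equal weighting of static maps, and the names `normalize`, `fuse` and `motion_weight` are all assumptions made here for illustration, not the authors' implementation.

```python
def normalize(sal_map):
    """Min-max scale a 2-D map (list of rows) to [0, 1].

    Assumption: min-max scaling; the abstract only says "normalized".
    A constant map is returned as all zeros.
    """
    flat = [v for row in sal_map for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in sal_map]
    return [[(v - lo) / (hi - lo) for v in row] for row in sal_map]


def fuse(motion_map, static_maps, motion_weight=0.5):
    """Linearly combine a motion map with one or more static maps.

    Assumption: the static maps share the remaining weight equally,
    so all weights sum to 1. All maps must have the same shape.
    """
    if not static_maps:
        raise ValueError("at least one static map is required")
    static_weight = (1.0 - motion_weight) / len(static_maps)
    motion = normalize(motion_map)
    statics = [normalize(m) for m in static_maps]
    fused = []
    for r, row in enumerate(motion):
        fused.append([
            motion_weight * v + static_weight * sum(s[r][c] for s in statics)
            for c, v in enumerate(row)
        ])
    return fused


# Hypothetical 2x2 maps, for illustration only.
motion = [[0, 2], [4, 8]]
intensity = [[1, 1], [1, 3]]
fused = fuse(motion, [intensity], motion_weight=0.5)
```

Because every input is scaled to [0, 1] and the weights sum to 1, the fused map also stays in [0, 1], which keeps the motion and static channels comparable regardless of their original ranges.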
Acknowledgements: This research is supported by The Royal Society of Edinburgh (RSE) and The National Natural Science Foundation of China (NNSFC) under the RSE-NNSFC joint project (2012-2014) [grant number 61211130309] with Anhui University, China, and the “Sino-UK Higher Education Research Partnership for PhD Studies” joint project (2013-2015) funded by the British Council China and The China Scholarship Council (CSC). Amir Hussain and Erfu Yang are also funded, in part, by the UK Engineering and Physical Sciences Research Council (EPSRC) [grant number EP/I009310/1], and the RSE-NNSFC joint project (2012-2014) [grant number 61211130210] with Beihang University, China.


Yang, Erfu (ORCID: https://orcid.org/0000-0003-1813-5950), Tu, Zhengzheng, Luo, Bin and Hussain, Amir