Indoor home scene recognition using capsule neural networks

Basu, Amlan and Petropoulakis, Lykourgos and Di Caterina, Gaetano and Soraghan, John (2019) Indoor home scene recognition using capsule neural networks. Procedia Computer Science. ISSN 1877-0509 (In Press)

Text. Filename: Basu_etal_PCS_2019_Indoor_home_scene_recognition_using_capsule_neural_networks.pdf
Final Published Version
License: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0

Abstract

This paper presents the use of a class of Deep Neural Networks for recognizing indoor home scenes, with the aim of helping Intelligent Assistive Systems (IAS) provide indoor services to elderly or infirm people. Identifying the exact indoor location is important so that objects associated with particular tasks can be located quickly and efficiently, irrespective of position or orientation. IAS developed to provide such services can then accomplish their designated tasks more efficiently. Many Convolutional Neural Networks (CNNs) have been developed for outdoor scene classification and for interior (not necessarily indoor home) scene classification. However, to date, no CNN has been trained, validated and tested on indoor home scene datasets, apparently because sufficiently large databases of home scenes do not exist. Nonetheless, systems meant to operate within home environments should be trained on relevant data. To address this problem, a different type of network is proposed: one which is not very deep (i.e., does not have many layers) but which can attain sufficiently high classification accuracy using smaller training datasets. A type of neural network likely to help achieve this is the Capsule Neural Network (CapsNet). In this paper, 20,000 indoor home scene images were used to train the CapsNet, and 5,000 images were used to test it. The validation accuracy achieved is 71% and the testing accuracy achieved is 70%.
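
For readers unfamiliar with CapsNets, the following is a minimal sketch of the "squash" nonlinearity commonly used in capsule networks, which rescales a capsule's output vector so that its length lies between 0 and 1 and can be read as a presence probability. The abstract does not describe the network's internals, so this is an illustrative assumption about a typical CapsNet, not the authors' implementation; the names squash, length and eps are hypothetical.

```python
import math


def squash(s, eps=1e-9):
    """Rescale vector s to length |s|^2 / (1 + |s|^2), keeping its direction.

    Assumption: this is the standard capsule "squash" function; the paper's
    abstract does not state which nonlinearity the authors used.
    """
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    # eps avoids division by zero for an all-zero input vector.
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]


def length(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


# Short vectors shrink towards zero; long vectors approach unit length.
short_vec = squash([0.1, 0.0])
long_vec = squash([30.0, 40.0])
print(length(short_vec), length(long_vec))
```

Because the output length is bounded below 1, the capsule with the longest output vector can be taken as the predicted scene class in a sketch like this.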