Modified Capsule Neural Network (Mod-CapsNet) for indoor home scene recognition

Basu, Amlan and Kaewrak, Keerati and Petropoulakis, Lykourgos and Di Caterina, Gaetano and Soraghan, John J.; (2020) Modified Capsule Neural Network (Mod-CapsNet) for indoor home scene recognition. In: 2020 International Joint Conference on Neural Networks (IJCNN). IEEE, GBR. ISBN 9781728169279

Accepted Author Manuscript


    In this paper, a Modified Capsule Neural Network (Mod-CapsNet) with a pooling layer but without the squash function is used to recognise indoor home scenes represented in grayscale. The Mod-CapsNet achieved an accuracy of 70%, compared with the 17.2% accuracy of a standard CapsNet. Since larger datasets of indoor home scenes are lacking, obtaining better accuracy with smaller datasets is also an important aim of the paper. The numbers of images used for training and testing are 20,000 and 5,000 respectively, all of size 128×128. The analysis shows that, in the indoor home scene recognition task, capsules without a squash function combined with max-pooling layers work better than capsules with convolutional layers. Indoor home scenes were chosen specifically to analyse the performance of capsules on datasets whose images have similarities but are, nonetheless, quite different. For example, tables may be present in both living rooms and dining rooms, even though these are quite different rooms.
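
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the squash formula is the one commonly used in standard CapsNets, the 2×2 pooling window with stride 2 is an assumption (the abstract does not state the pooling size), and the helper names `squash`, `capsule_output`, `max_pool_2x2` and `length` are hypothetical.

```python
import math


def squash(s, eps=1e-9):
    """Squash non-linearity as commonly defined for standard CapsNets.

    Keeps the direction of s and rescales its length to
    |s|^2 / (1 + |s|^2), which always lies in [0, 1).
    """
    norm_sq = sum(x * x for x in s)
    norm = math.sqrt(norm_sq)
    scale = norm_sq / (1.0 + norm_sq) / (norm + eps)
    return [scale * x for x in s]


def capsule_output(s, use_squash):
    """Return the capsule vector with or without the squash step.

    use_squash=False mimics the idea of dropping the squash function;
    the vector is then passed through unchanged (an assumption).
    """
    return squash(s) if use_squash else list(s)


def max_pool_2x2(image):
    """2x2 max pooling with stride 2 on a 2D list.

    Assumes the height and width are even, as with 128x128 inputs.
    """
    h, w = len(image), len(image[0])
    return [
        [
            max(image[r][c], image[r][c + 1],
                image[r + 1][c], image[r + 1][c + 1])
            for c in range(0, w, 2)
        ]
        for r in range(0, h, 2)
    ]


def length(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))
```

With the squash step, a capsule vector such as `[3.0, 4.0]` (length 5) is compressed to a length just below 1; without it, the length stays at 5. Under the assumed 2×2 window, one pooling pass halves each side of a 128×128 grayscale image to 64×64.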
