Visual speech recognition with lightweight psychologically motivated Gabor features
Zhang, Xuejie and Xu, Yan and Abel, Andrew K. and Smith, Leslie S. and Watt, Roger and Hussain, Amir and Gao, Chengxiang (2020) Visual speech recognition with lightweight psychologically motivated Gabor features. Entropy, 22 (12). 1367. ISSN 1099-4300 (https://doi.org/10.3390/e22121367)
Text.
Filename: Zhang_etal_Entropy_2022_Visual_speech_recognition_with_lightweight_psychologically.pdf
Final Published Version (2MB)
Abstract
Extraction of relevant lip features is of continuing interest in the visual speech domain. End-to-end feature extraction can produce good results, but those results are difficult for humans to comprehend and relate to. We present a new, lightweight feature extraction approach, motivated by human-centric glimpse-based psychological research into facial barcodes, and demonstrate that these simple, easy-to-extract 3D geometric features (produced using Gabor-based image patches) can successfully be used for speech recognition with LSTM-based machine learning. This approach can extract low-dimensionality lip parameters with a minimum of processing. One key difference between these Gabor-based features and other features, such as traditional DCT or the currently popular CNN features, is that they are human-centric and can be visualised and analysed by humans, which makes the results easier to explain. They can also be used for reliable speech recognition, as demonstrated using the Grid corpus. Results for overlapping speakers using our lightweight system gave a recognition rate of over 82%, which compares well to less explainable features in the literature.
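As a rough illustration of the kind of Gabor-based image filter the abstract refers to, the Python sketch below builds a single 2D Gabor kernel with NumPy. It is not the authors' implementation: the function name `gabor_kernel` and every parameter value (kernel size, wavelength, orientation, envelope width, aspect ratio) are assumptions chosen only for demonstration.

```python
import numpy as np


def gabor_kernel(size=21, wavelength=8.0, theta=0.0, sigma=4.0, gamma=0.5, psi=0.0):
    """Return a real-valued 2D Gabor kernel of shape (size, size).

    Hypothetical helper for illustration; `size` is assumed to be odd so the
    kernel has a well-defined centre pixel.
    """
    half = size // 2
    # Pixel coordinates centred on the middle of the kernel.
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)

    # Rotate the coordinate frame by the filter orientation theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)

    # Gaussian envelope (gamma sets its aspect ratio) times a cosine carrier.
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / wavelength + psi)
    return envelope * carrier


# Filter response of one image patch: sum of the elementwise product.
kernel = gabor_kernel()
patch = np.ones_like(kernel)  # placeholder patch, not real lip-region data
response = float(np.sum(kernel * patch))
```

In a pipeline of the sort described, responses like this would be computed over a mouth region and reduced to a small set of parameters per frame before being passed to a sequence model; those later steps are not shown here.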
ORCID iDs
Zhang, Xuejie, Xu, Yan, Abel, Andrew K. ORCID: https://orcid.org/0000-0002-3631-8753, Smith, Leslie S., Watt, Roger, Hussain, Amir and Gao, Chengxiang
Item type: Article
ID code: 86160
Dates: 3 December 2020 (Published); 23 November 2020 (Accepted)
Subjects: Science > Mathematics > Electronic computers. Computer science > Other topics, A-Z > Human-computer interaction
Department: Faculty of Science > Computer and Information Sciences
Depositing user: Pure Administrator
Date deposited: 18 Jul 2023 13:20
Last modified: 11 Nov 2024 14:00
URI: https://strathprints.strath.ac.uk/id/eprint/86160