Polynomial eigenvalue decomposition-based target speaker voice activity detection in the presence of competing talkers
Neo, Vincent W. and Weiss, Stephan and McKnight, Simon W. and Hogg, Aidan O. T. and Naylor, Patrick A. (2022) Polynomial eigenvalue decomposition-based target speaker voice activity detection in the presence of competing talkers. In: 17th International Workshop on Acoustic Signal Enhancement, 2022-09-05 - 2022-09-08. (https://doi.org/10.1109/IWAENC53105.2022.9914796)
Text: Neo_etal_IWAENC2022_Polynomial_eigenvalue_decomposition_based_target_speaker_voice_activity_detection.pdf
Accepted Author Manuscript. License: Strathprints license 1.0. Size: 855kB.
Abstract
Voice activity detection (VAD) algorithms are essential for many speech processing applications, such as speaker diarization, automatic speech recognition, speech enhancement, and speech coding. With a good VAD algorithm, non-speech segments can be excluded to improve the performance and reduce the computational load of these applications. In this paper, we propose a polynomial eigenvalue decomposition-based target-speaker VAD algorithm to detect unseen target speakers in the presence of competing talkers. The proposed approach uses frame-based processing across multiple microphones to compute the syndrome energy, which is used to test for the presence or absence of a target speaker. The proposed approach is consistently among the best in F1 and balanced accuracy scores over the investigated range of signal-to-interference ratio (SIR) from -10 dB to 20 dB.
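The two scores named in the abstract have standard definitions, and a small sketch may help when reading the reported results. The example below is not taken from the paper: the function name `vad_scores`, the 0/1 frame-label convention (1 = target speaker active), and the toy label sequences are assumptions made purely for illustration.

```python
def vad_scores(reference, predicted):
    """Return (F1, balanced accuracy) for binary frame labels.

    Assumed convention (not from the paper): 1 = target speaker active,
    0 = target speaker absent, one label per frame.
    """
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # active, detected
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)  # absent, rejected
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # false alarm
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # missed detection

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Balanced accuracy averages the true-positive and true-negative rates.
    balanced_accuracy = 0.5 * (recall + specificity)
    return f1, balanced_accuracy


# Toy labels for eight frames (hypothetical data).
reference = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
f1, balanced_accuracy = vad_scores(reference, predicted)
```

Balanced accuracy weights the target-active and target-absent classes equally, so it stays informative when one class dominates the frames, whereas F1 ignores true negatives altogether.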
ORCID iDs
Neo, Vincent W., Weiss, Stephan ORCID: https://orcid.org/0000-0002-3486-7206, McKnight, Simon W., Hogg, Aidan O. T. and Naylor, Patrick A.
Item type: Conference or Workshop Item (Paper)
ID code: 81420
Dates:
17 October 2022: Published
8 September 2022: Published Online
1 July 2022: Accepted
Notes: © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Subjects: Technology > Electrical engineering. Electronics Nuclear engineering
Department: Faculty of Engineering > Electronic and Electrical Engineering
Technology and Innovation Centre > Sensors and Asset Management
Depositing user: Pure Administrator
Date deposited: 08 Jul 2022 12:55
Last modified: 27 Nov 2024 18:11
URI: https://strathprints.strath.ac.uk/id/eprint/81420