A polynomial subspace projection approach for the detection of weak voice activity

Neo, Vincent W. and Weiss, Stephan and Naylor, Patrick A. (2022) A polynomial subspace projection approach for the detection of weak voice activity. In: 11th International Conference in Sensor Signal Processing for Defence, 2022-09-13 - 2022-09-14. (https://doi.org/10.1109/SSPD54131.2022.9896222)

Text. Filename: Neo_etal_SSPD2022_A_polynomial_subspace_projection_approach_for_the_detection_of_weak_voice_activity.pdf
Accepted Author Manuscript
License: Strathprints license 1.0

A voice activity detection (VAD) algorithm identifies whether or not each time frame contains speech. It is essential for many military and commercial speech processing applications, including speech enhancement, speech coding, speaker identification, and automatic speech recognition. In this work, we build on earlier work on detecting weak transient signals and propose a polynomial subspace projection pre-processor to improve an existing VAD algorithm. The proposed multi-channel pre-processor projects the microphone signals onto a lower-dimensional subspace, which attempts to remove the interferer components and thus eases the detection of the target speech. Compared to applying the same VAD directly to the unprocessed microphone signal, the proposed approach almost always improves the F1 and balanced accuracy scores, even in adverse environments with a signal-to-interference ratio (SIR) as low as -30 dB, which may be typical of operations involving noisy machinery or signal jamming.
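
For readers unfamiliar with the two evaluation scores named in the abstract, the following is a minimal sketch of how F1 and balanced accuracy can be computed from frame-level VAD decisions. It uses the standard textbook definitions. The function name `vad_scores` and the toy labels are hypothetical and chosen only for illustration; the paper's exact evaluation protocol (frame length, ground-truth labelling) is not described on this page, and the sketch makes no claim to reproduce it.

```python
def vad_scores(reference, predicted):
    """Return (F1, balanced accuracy) for binary frame-level VAD decisions.

    `reference` and `predicted` are equal-length sequences in which a truthy
    value marks a speech frame and a falsy value marks a non-speech frame.
    The name and interface are illustrative assumptions, not the paper's code.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)          # speech detected as speech
    tn = sum(1 for r, p in pairs if not r and not p)  # non-speech kept as non-speech
    fp = sum(1 for r, p in pairs if not r and p)      # false alarm
    fn = sum(1 for r, p in pairs if r and not p)      # missed speech

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Balanced accuracy averages the hit rates of the two classes.
    balanced_accuracy = 0.5 * (recall + specificity)
    return f1, balanced_accuracy


# Toy example: 3 speech frames and 5 non-speech frames (made-up labels).
reference = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
f1, bal_acc = vad_scores(reference, predicted)
print(f"F1 = {f1:.3f}, balanced accuracy = {bal_acc:.3f}")
```

Balanced accuracy weights the speech and non-speech classes equally, so it remains informative when speech frames are much rarer than non-speech frames, a situation in which plain accuracy can look good for a detector that rarely reports speech.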