Maximum Gaussianality training for deep speaker vector normalization
Cai, Yunqi and Li, Lantian and Abel, Andrew and Zhu, Xiaoyan and Wang, Dong (2024) Maximum Gaussianality training for deep speaker vector normalization. Pattern Recognition, 145. 109977. ISSN 0031-3203 (https://doi.org/10.1016/j.patcog.2023.109977)
Filename: Cai_etal_PR_2023_Maximum_Gaussianality_training_for_deep_speaker_vector_normalization.pdf
Accepted Author Manuscript (1MB)
Abstract
Automatic Speaker Verification (ASV) is a critical task in pattern recognition and has been applied to various security-sensitive scenarios. The current state-of-the-art technique for ASV is based on deep embedding. However, a significant challenge with this approach is that the resulting deep speaker vectors tend to be irregularly distributed. To address this issue, this paper proposes a novel training method called Maximum Gaussianality (MG), which regulates the distribution of the speaker vectors. Compared to the conventional normalization approach based on maximum likelihood (ML), the new approach directly maximizes the Gaussianality of the latent codes, and therefore can both normalize the between-class and within-class distributions in a controlled and reliable way and eliminate the unbound likelihood problem associated with the conventional ML approach. Our experiments on several datasets demonstrate that our MG-based normalization can deliver much better performance than the baseline systems without normalization and outperform discriminative normalization flow (DNF), an ML-based normalization method, particularly when the training data is limited. In theory, the MG criterion can be applied to any task in any research domain where Gaussian distributions are needed, making the MG training a versatile tool.
ORCID iDs
Cai, Yunqi; Li, Lantian; Abel, Andrew (ORCID: https://orcid.org/0000-0002-3631-8753); Zhu, Xiaoyan; Wang, Dong
Item type: Article
ID code: 86776
Dates:
  31 January 2024: Published
  18 September 2023: Published Online
  13 September 2023: Accepted
Subjects: Science > Mathematics > Electronic computers. Computer science > Other topics, A-Z
Department: Faculty of Science > Computer and Information Sciences
Depositing user: Pure Administrator
Date deposited: 27 Sep 2023 11:32
Last modified: 11 Nov 2024 14:06
URI: https://strathprints.strath.ac.uk/id/eprint/86776