Towards trustworthy rotating machinery fault diagnosis via attention uncertainty in transformer

Xiao, Yiming and Shao, Haidong and Feng, Minjie and Han, Te and Wan, Jiafu and Liu, Bin (2023) Towards trustworthy rotating machinery fault diagnosis via attention uncertainty in transformer. Journal of Manufacturing Systems, 70. pp. 186-201. ISSN 0278-6125

Text. Filename: Xiao_etal_JMS_2023_Towards_trustworthy_rotating_machinery_fault_diagnosis_via_attention_uncertainty_in_transformer.pdf
Accepted Author Manuscript
Restricted to Repository staff only until 29 July 2024.
License: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0


To enable researchers to fully trust the decisions made by deep diagnostic models, interpretable rotating machinery fault diagnosis (RMFD) research has emerged. Existing interpretable RMFD research focuses either on embedding interpretable modules in deep models to assign physical meaning to results, or on inferring the model's decision logic from its results. However, there is limited work on how to quantify the uncertainty in results and explain its sources and composition. Uncertainty quantification and decomposition not only provide the confidence of the results but also identify the sources of unknown factors in the data, and thereby guide improvements to the interpretability and trustworthiness of models. Therefore, this paper uses Bayesian variational learning to introduce uncertainty into the attention weights of the Transformer, constructing a probabilistic Bayesian Transformer for trustworthy RMFD. A probabilistic attention mechanism is designed and the corresponding optimization objective is defined, from which the prior and variational posterior distributions of the attention weights can be inferred, enabling the model to perceive uncertainty. An uncertainty quantification and decomposition scheme is developed to characterize the confidence of results and to separate epistemic from aleatoric uncertainty. The effectiveness of the proposed method is verified in three out-of-distribution generalization scenarios.
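
The separation of epistemic and aleatoric uncertainty mentioned in the abstract can be illustrated with a common entropy-based decomposition over repeated stochastic forward passes. This is a minimal sketch, not the scheme defined in the paper: it assumes the model yields one class-probability vector per sampled set of attention weights, and the names `entropy` and `decompose_uncertainty` are hypothetical. Total uncertainty is taken as the entropy of the averaged prediction, aleatoric uncertainty as the average entropy of the individual predictions, and epistemic uncertainty as their difference.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def decompose_uncertainty(mc_probs):
    """Split predictive uncertainty using Monte Carlo class probabilities.

    mc_probs: list of T probability vectors, one per stochastic forward
    pass (assumed here: one per sample of the attention weights).
    Illustrative entropy-based decomposition, not the paper's own scheme.
    """
    if not mc_probs:
        raise ValueError("mc_probs must contain at least one sample")
    t = len(mc_probs)
    n_classes = len(mc_probs[0])

    # Predictive distribution: average over the T sampled passes.
    mean_p = [sum(p[c] for p in mc_probs) / t for c in range(n_classes)]

    total = entropy(mean_p)                           # entropy of the mean
    aleatoric = sum(entropy(p) for p in mc_probs) / t  # mean of the entropies
    # The difference is non-negative in theory; clamp rounding error.
    epistemic = max(total - aleatoric, 0.0)
    return {"total": total, "aleatoric": aleatoric, "epistemic": epistemic}


# Passes agree on an ambiguous prediction: uncertainty is all aleatoric.
noisy = decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]])

# Passes are individually confident but disagree: uncertainty is all epistemic.
conflicting = decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]])
```

In this sketch, a high epistemic term suggests the input lies outside what the model has learned (for example, an out-of-distribution operating condition), while a high aleatoric term points to noise or ambiguity in the signal itself.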