Integration transformer for ground-based cloud image segmentation
Liu, Shuang and Zhang, Jiafeng and Zhang, Zhong and Cao, Xiaozhong and Durrani, Tariq S. (2023) Integration transformer for ground-based cloud image segmentation. IEEE Transactions on Geoscience and Remote Sensing, 61. 5606712. ISSN 0196-2892 (https://doi.org/10.1109/tgrs.2023.3265384)
Filename: Liu_etal_IEEE_2023_Integration_transformer_fr_ground_based_cloud.pdf
Accepted Author Manuscript
License: Strathprints license 1.0
Abstract
Convolutional neural networks (CNNs) have recently dominated the ground-based cloud image segmentation task, but they neglect long-range dependencies because of the limited size of their filters. Transformer-based methods can overcome this limitation, but they learn long-range dependencies only at a single scale and therefore fail to capture the multiscale information in cloud images. Multiscale information benefits ground-based cloud image segmentation because features from small scales tend to capture detailed information, while features from large scales can capture global information. In this article, we propose a novel deep network named Integration Transformer (InTransformer), which builds long-range dependencies at different scales. To this end, we propose the hybrid multihead transformer block (HMTB) to learn multiscale long-range dependencies, and we hybridize CNNs and HMTB as the encoder at different scales. The resulting encoder extracts multiscale representations that contain both local information and long-range dependencies at different scales. Meanwhile, to fuse patch tokens of different scales, we propose a mutual cross-attention module (MCAM) for the decoder of InTransformer, which lets multiscale patch tokens interact adequately in a bidirectional way. We conducted a series of experiments on the large ground-based cloud detection database TJNU Large Scale Cloud Detection Database (TLCDD) and on the Singapore Whole sky IMaging SEGmentation Database (SWIMSEG). The experimental results show that our method outperforms other methods, demonstrating the effectiveness of the proposed InTransformer.
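The abstract describes the bidirectional fusion of multiscale patch tokens only in words. As a rough illustration, the sketch below shows generic two-way cross-attention between two token sets, in plain Python. It is not the paper's MCAM: the function names (`cross_attention`, `mutual_cross_attention`), the single attention head, and the absence of learned projections are all simplifying assumptions made here, since the abstract gives no architectural details.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Each query token attends over all context tokens.

    Assumption: single head, with the tokens themselves used as
    queries, keys and values (no learned projection matrices).
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    attended = []
    for q in queries:
        # Scaled dot-product similarity between this query and every context token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in context]
        weights = softmax(scores)
        # Weighted average of the context tokens.
        attended.append(
            [sum(w * v[d] for w, v in zip(weights, context)) for d in range(dim)]
        )
    return attended


def mutual_cross_attention(fine_tokens, coarse_tokens):
    """Hypothetical two-way exchange: each scale queries the other."""
    fine_out = cross_attention(fine_tokens, coarse_tokens)
    coarse_out = cross_attention(coarse_tokens, fine_tokens)
    return fine_out, coarse_out


# Four small-scale tokens and two large-scale tokens, each of dimension 2.
fine = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
coarse = [[1.0, 0.0], [0.0, 1.0]]
fine_fused, coarse_fused = mutual_cross_attention(fine, coarse)
```

Each output keeps the token count of its own scale (four and two here), while every token becomes a weighted mixture of tokens from the other scale. A real decoder would add learned query, key and value projections, multiple heads, and normalisation layers.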
Item type: Article
ID code: 85278
Dates:
  20 April 2023: Published
  7 April 2023: Published Online
  14 March 2023: Accepted
  7 February 2023: Submitted
Subjects: Technology > Electrical engineering. Electronics Nuclear engineering > Electrical apparatus and materials > Electric networks
Department: Faculty of Engineering > Electronic and Electrical Engineering
Depositing user: Pure Administrator
Date deposited: 25 Apr 2023 13:03
Last modified: 11 Nov 2024 13:55
URI: https://strathprints.strath.ac.uk/id/eprint/85278