
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer

Zilong Huang, Youcheng Ben, Guozhong Luo, Pei Cheng, Gang Yu, Bin Fu

Published: 2021-06-07
Tasks: Image Classification, Segmentation, Semantic Segmentation, Object Detection

Abstract

Very recently, window-based Transformers, which compute self-attention within non-overlapping local windows, have demonstrated promising results on image classification, semantic segmentation, and object detection. However, little study has been devoted to cross-window connections, which are the key element for improving representation ability. In this work, we revisit the spatial shuffle as an efficient way to build connections among windows. As a result, we propose a new vision transformer, named Shuffle Transformer, which is highly efficient and easy to implement by modifying two lines of code. Furthermore, depth-wise convolution is introduced to complement the spatial shuffle by enhancing neighbor-window connections. The proposed architectures achieve excellent performance on a wide range of visual tasks, including image-level classification, object detection, and semantic segmentation. Code will be released for reproduction.
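
To make the idea of a spatial shuffle concrete, here is a minimal sketch of how tokens might be assigned to attention windows with and without shuffling. It is an illustration written for this page, not the authors' released code: the function name `window_indices` and its parameters are hypothetical, and the strided grouping used for the shuffled case is an assumption about how a spatial shuffle can be realized. The sketch only computes which (row, col) positions land in each window; attention itself is omitted.

```python
def window_indices(height, width, ws, shuffle=False):
    """Return, for each window, the (row, col) positions of its tokens.

    height, width: size of the feature map in tokens.
    ws: window side length, so each window holds ws * ws tokens.
    shuffle: if False, windows are contiguous ws x ws blocks;
             if True, each window gathers tokens at a fixed stride,
             so it mixes tokens from several contiguous blocks
             (an assumed form of spatial shuffle, for illustration).
    """
    if height % ws != 0 or width % ws != 0:
        raise ValueError("height and width must be divisible by ws")
    nh, nw = height // ws, width // ws  # number of windows per axis

    windows = []
    for a in range(nh):          # window row index
        for b in range(nw):      # window column index
            tokens = []
            for i in range(ws):      # token row inside the window
                for j in range(ws):  # token column inside the window
                    if shuffle:
                        # Strided gather: step nh (or nw) between tokens.
                        r, c = i * nh + a, j * nw + b
                    else:
                        # Contiguous block starting at (a * ws, b * ws).
                        r, c = a * ws + i, b * ws + j
                    tokens.append((r, c))
            windows.append(tokens)
    return windows


regular = window_indices(4, 4, 2)
shuffled = window_indices(4, 4, 2, shuffle=True)
print(regular[0])   # [(0, 0), (0, 1), (1, 0), (1, 1)]
print(shuffled[0])  # [(0, 0), (0, 2), (2, 0), (2, 2)]
```

In this 4x4 example with 2x2 windows, the first shuffled window takes one token from each of the four contiguous windows, so attention inside it connects regions that a regular window would keep apart. The two branches differ only in how the row and column indices are computed, which is consistent with the abstract's remark that the change is small, though the actual two-line modification in the authors' code may look different.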

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | ADE20K val | mIoU | 50.5 | UperNet Shuffle-B |
| Semantic Segmentation | ADE20K val | mIoU | 49.6 | UperNet Shuffle-S |
| Semantic Segmentation | ADE20K val | mIoU | 47.6 | UperNet Shuffle-T |
| Semantic Segmentation | ADE20K | Validation mIoU | 50.5 | UperNet Shuffle-B |
| Semantic Segmentation | ADE20K | Validation mIoU | 47.6 | UperNet Shuffle-T |
| 10-shot image generation | ADE20K val | mIoU | 50.5 | UperNet Shuffle-B |
| 10-shot image generation | ADE20K val | mIoU | 49.6 | UperNet Shuffle-S |
| 10-shot image generation | ADE20K val | mIoU | 47.6 | UperNet Shuffle-T |
| 10-shot image generation | ADE20K | Validation mIoU | 50.5 | UperNet Shuffle-B |
| 10-shot image generation | ADE20K | Validation mIoU | 47.6 | UperNet Shuffle-T |

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
Federated Learning for Commercial Image Sources (2025-07-17)
MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)