
HRFormer: High-Resolution Transformer for Dense Prediction

Yuhui Yuan, Rao Fu, Lang Huang, WeiHong Lin, Chao Zhang, Xilin Chen, Jingdong Wang

2021-10-18
Tasks: Image Classification, Vocal Bursts Intensity Prediction, Semantic Segmentation, Prediction, Pose Estimation, Multi-Person Pose Estimation

Abstract

We present a High-Resolution Transformer (HRFormer) that learns high-resolution representations for dense prediction tasks, in contrast to the original Vision Transformer that produces low-resolution representations and has high memory and computational cost. We take advantage of the multi-resolution parallel design introduced in high-resolution convolutional networks (HRNet), along with local-window self-attention that performs self-attention over small non-overlapping image windows, for improving the memory and computation efficiency. In addition, we introduce a convolution into the FFN to exchange information across the disconnected image windows. We demonstrate the effectiveness of the High-Resolution Transformer on both human pose estimation and semantic segmentation tasks, e.g., HRFormer outperforms Swin transformer by $1.3$ AP on COCO pose estimation with $50\%$ fewer parameters and $30\%$ fewer FLOPs. Code is available at: https://github.com/HRNet/HRFormer.
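
The two efficiency ideas in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names, the 56 x 56 grid, the 7 x 7 window, and the 3 x 3 kernel are all assumptions chosen for illustration. The first part counts the query-key pairs scored by global attention versus attention restricted to non-overlapping windows; the second shows that a small convolution centred near a window border covers tokens from neighbouring windows, which is how a convolution in the FFN can pass information between otherwise disconnected windows.

```python
def window_partition(height, width, window):
    """Group the (row, col) positions of a height x width token grid into
    non-overlapping window x window blocks (requires exact divisibility)."""
    if height % window or width % window:
        raise ValueError("grid size must be a multiple of the window size")
    windows = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            windows.append([(r, c)
                            for r in range(top, top + window)
                            for c in range(left, left + window)])
    return windows


def attention_pairs(height, width, window=None):
    """Count the query-key pairs scored by self-attention.

    window=None models global attention (every token attends to every token);
    otherwise each token attends only to the tokens inside its own window.
    """
    if window is None:
        tokens = height * width
        return tokens * tokens
    return sum(len(w) ** 2 for w in window_partition(height, width, window))


def windows_touched_by_conv(row, col, height, width, window, kernel=3):
    """Indices of the windows whose tokens fall under a kernel x kernel
    convolution centred at (row, col). More than one index means the
    convolution mixes information across window borders."""
    half = kernel // 2
    per_row = width // window  # number of windows in each row of windows
    touched = set()
    for r in range(max(0, row - half), min(height, row + half + 1)):
        for c in range(max(0, col - half), min(width, col + half + 1)):
            touched.add((r // window) * per_row + (c // window))
    return sorted(touched)


if __name__ == "__main__":
    h, w, k = 56, 56, 7  # illustrative sizes, not taken from the paper
    print("global pairs:  ", attention_pairs(h, w))
    print("windowed pairs:", attention_pairs(h, w, k))
    # A token in the corner of a window: its 3 x 3 neighbourhood spans 4 windows.
    print("windows under conv at (6, 6):", windows_touched_by_conv(6, 6, h, w, k))
```

With these assumed sizes the windowed variant scores 64 times fewer pairs than the global one, because each of the 64 windows only compares its own 49 tokens with each other.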

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Pose Estimation | AIC | AP | 34.4 | HRFormer-B |
| Pose Estimation | AIC | AP50 | 78.3 | HRFormer-B |
| Pose Estimation | AIC | AP75 | 24.8 | HRFormer-B |
| Pose Estimation | AIC | AR | 38.7 | HRFormer-B |
| Pose Estimation | AIC | AR50 | 80.9 | HRFormer-B |
| Pose Estimation | AIC | AP | 31.6 | HRFormer-S |
| Pose Estimation | AIC | AP75 | 20.9 | HRFormer-S |
| Pose Estimation | AIC | AR | 35.8 | HRFormer-S |
| Pose Estimation | AIC | AR50 | 78 | HRFormer-S |
| Pose Estimation | COCO test-dev | AP | 76.2 | HRFormer-B |
| Pose Estimation | COCO test-dev | AP50 | 92.7 | HRFormer-B |
| Pose Estimation | COCO test-dev | AP75 | 83.8 | HRFormer-B |
| Pose Estimation | COCO test-dev | APL | 82.3 | HRFormer-B |
| Pose Estimation | COCO test-dev | APM | 72.5 | HRFormer-B |
| Pose Estimation | COCO test-dev | AR | 81.2 | HRFormer-B |
| Pose Estimation | CrowdPose | AP Easy | 80 | HRFormer-B |
| Pose Estimation | CrowdPose | AP Hard | 62.4 | HRFormer-B |
| Pose Estimation | CrowdPose | AP Medium | 73.5 | HRFormer-B |
| Pose Estimation | CrowdPose | mAP @0.5:0.95 | 72.4 | HRFormer-B |
| Pose Estimation | OCHuman | AP50 | 81.4 | HRFormer-B |
| Pose Estimation | OCHuman | AP75 | 67.1 | HRFormer-B |
| Pose Estimation | OCHuman | Validation AP | 62.1 | HRFormer-B |
| Image Classification | ImageNet | GFLOPs | 13.7 | HRFormer-B |
| Image Classification | ImageNet | GFLOPs | 1.8 | HRFormer-T |
| Multi-Person Pose Estimation | CrowdPose | AP Easy | 80 | HRFormer-B |
| Multi-Person Pose Estimation | CrowdPose | AP Hard | 62.4 | HRFormer-B |
| Multi-Person Pose Estimation | CrowdPose | AP Medium | 73.5 | HRFormer-B |
| Multi-Person Pose Estimation | CrowdPose | mAP @0.5:0.95 | 72.4 | HRFormer-B |
| Multi-Person Pose Estimation | OCHuman | AP50 | 81.4 | HRFormer-B |
| Multi-Person Pose Estimation | OCHuman | AP75 | 67.1 | HRFormer-B |
| Multi-Person Pose Estimation | OCHuman | Validation AP | 62.1 | HRFormer-B |
