FaPN: Feature-aligned Pyramid Network for Dense Image Prediction

Shihua Huang, Zhichao Lu, Ran Cheng, Cheng He

Published: 2021-08-16 (ICCV 2021)
Tasks: feature selection, Semantic Segmentation, Prediction

Abstract

Recent advances in deep neural networks have brought remarkable progress in dense image prediction. However, most existing approaches neglect feature alignment for the sake of simplicity. Directly adding upsampled features to local features pixel by pixel yields feature maps with misaligned contexts, which in turn leads to misclassifications in prediction, especially on object boundaries. In this paper, we propose a feature alignment module that learns transformation offsets of pixels to contextually align upsampled higher-level features, and a feature selection module that emphasizes the lower-level features with rich spatial details. We then integrate these two modules in a top-down pyramidal architecture and present the Feature-aligned Pyramid Network (FaPN). Extensive experimental evaluations on four dense prediction tasks and four datasets demonstrate the efficacy of FaPN, yielding an overall improvement of 1.2 to 2.6 points in AP / mIoU over FPN when paired with Faster / Mask R-CNN. In particular, FaPN achieves a state-of-the-art 56.7% mIoU on ADE20K when integrated within MaskFormer. The code is available from https://github.com/EMI-Group/FaPN.
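
The two modules named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (the repository linked in the abstract is the reference for that). It assumes nearest-neighbour upsampling, bilinear resampling driven by per-pixel offsets, and a sigmoid channel gate standing in for feature selection. The function names (upsample_nearest, align, select, fuse) and the gate weights w_gate are hypothetical, and the offsets are passed in as an argument here, whereas the paper learns them.

```python
import numpy as np

def upsample_nearest(x, factor=2):
    """Upsample a (C, H, W) feature map by repeating each pixel."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def align(x, offsets):
    """Bilinearly resample x (C, H, W) at (row + offsets[0], col + offsets[1]).

    offsets has shape (2, H, W). In the paper the offsets are learned;
    here they are simply an input (assumption for illustration).
    """
    _, h, w = x.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Clamp sampling positions to the feature map borders.
    r = np.clip(rows + offsets[0], 0, h - 1)
    c = np.clip(cols + offsets[1], 0, w - 1)
    r0, c0 = np.floor(r).astype(int), np.floor(c).astype(int)
    r1, c1 = np.minimum(r0 + 1, h - 1), np.minimum(c0 + 1, w - 1)
    wr, wc = r - r0, c - c0
    # Interpolate along columns, then along rows.
    top = x[:, r0, c0] * (1 - wc) + x[:, r0, c1] * wc
    bottom = x[:, r1, c0] * (1 - wc) + x[:, r1, c1] * wc
    return top * (1 - wr) + bottom * wr

def select(x, w_gate):
    """Reweight channels with a sigmoid gate computed from global average pooling.

    w_gate is a hypothetical (C, C) weight matrix standing in for a learned layer.
    """
    pooled = x.mean(axis=(1, 2))
    gate = 1.0 / (1.0 + np.exp(-(w_gate @ pooled)))
    return x * gate[:, None, None]

def fuse(lower, higher, offsets, w_gate):
    """One top-down step: align the upsampled higher-level map, then add."""
    aligned = align(upsample_nearest(higher), offsets)
    return select(lower, w_gate) + aligned

rng = np.random.default_rng(0)
lower = rng.normal(size=(4, 8, 8))    # lower-level features, fine resolution
higher = rng.normal(size=(4, 4, 4))   # higher-level features, coarse resolution
offsets = np.zeros((2, 8, 8))         # zero offsets reduce to plain FPN-style addition
w_gate = np.eye(4)
fused = fuse(lower, higher, offsets, w_gate)
print(fused.shape)  # (4, 8, 8)
```

With all offsets set to zero, align returns its input unchanged, so the step collapses to the direct pixel addition that the abstract identifies as the source of misalignment. Non-zero offsets shift where each output pixel samples the upsampled map.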

Results

Task | Dataset | Metric | Value | Model
Semantic Segmentation | ADE20K val | mIoU | 56.7 | FaPN (MaskFormer, Swin-L, ImageNet-22k pretrain)
Semantic Segmentation | ADE20K | Validation mIoU | 56.7 | FaPN (MaskFormer, Swin-L, ImageNet-22k pretrain)
10-shot image generation | ADE20K val | mIoU | 56.7 | FaPN (MaskFormer, Swin-L, ImageNet-22k pretrain)
10-shot image generation | ADE20K | Validation mIoU | 56.7 | FaPN (MaskFormer, Swin-L, ImageNet-22k pretrain)
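
For readers unfamiliar with the metric reported above, the following is a minimal sketch of mean intersection-over-union (mIoU) computed from a confusion matrix. The function name mean_iou is hypothetical, and benchmark-specific details such as ignored labels are left out, so it should not be read as the exact ADE20K evaluation protocol.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that occur in the prediction or the ground truth."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    inter = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    valid = union > 0  # skip classes absent from both maps
    return float((inter[valid] / union[valid]).mean())

# Two classes: IoU is 1/2 for class 0 and 2/3 for class 1.
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

The reported values are percentages, so a score of 56.7 corresponds to a return value of 0.567 from a function like this one.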

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Multi-Strategy Improved Snake Optimizer Accelerated CNN-LSTM-Attention-Adaboost for Trajectory Prediction (2025-07-21)
mNARX+: A surrogate model for complex dynamical systems using manifold-NARX and automatic feature selection (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
SAMST: A Transformer framework based on SAM pseudo label filtering for remote sensing semi-supervised semantic segmentation (2025-07-16)