Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Mix-FFN

General | Introduced 2000 | 47 papers

Description

Mix-FFN is the feed-forward layer used in the SegFormer architecture. ViT uses positional encoding (PE) to introduce location information. However, the resolution of the PE is fixed, so when the test resolution differs from the training resolution, the positional code must be interpolated, which often reduces accuracy. To alleviate this problem, CPVT uses a $3 \times 3$ Conv together with the PE to implement a data-driven PE. The authors of Mix-FFN argue that positional encoding is actually not necessary for semantic segmentation. Instead, they introduce Mix-FFN, which takes advantage of the location information leaked by zero padding by using a $3 \times 3$ Conv directly in the feed-forward network (FFN). Mix-FFN can be formulated as:

$$\mathbf{x}_{out} = \operatorname{MLP}\left(\operatorname{GELU}\left(\operatorname{Conv}_{3 \times 3}\left(\operatorname{MLP}\left(\mathbf{x}_{in}\right)\right)\right)\right) + \mathbf{x}_{in}$$

where $\mathbf{x}_{in}$ is the feature from the self-attention module. Mix-FFN mixes a $3 \times 3$ convolution and an MLP into each FFN.
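
The formula can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the SegFormer implementation: the helper names (`gelu`, `depthwise_conv3x3`, `mix_ffn`), the tensor sizes, the random weights, the depthwise form of the $3 \times 3$ Conv, and the tanh approximation of GELU are all choices made here for the example. The last lines feed a constant feature map through the layer to show how zero padding leaks location: interior positions produce identical outputs, while a corner position, whose $3 \times 3$ window overlaps the padding, does not.

```python
import numpy as np


def gelu(x):
    # Tanh approximation of GELU (an assumption; exact GELU uses erf).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def depthwise_conv3x3(x, kernels):
    """3x3 depthwise convolution with zero padding.

    x:       feature map of shape (H, W, C)
    kernels: one 3x3 filter per channel, shape (3, 3, C)
    """
    h, w, _ = x.shape
    # np.pad defaults to constant zeros, i.e. zero padding of one pixel.
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w, :] * kernels[dy, dx, :]
    return out


def mix_ffn(x_in, w1, b1, kernels, w2, b2):
    """x_out = MLP(GELU(Conv3x3(MLP(x_in)))) + x_in for x_in of shape (H, W, C)."""
    hidden = x_in @ w1 + b1                      # first MLP: C -> C_hidden per position
    hidden = depthwise_conv3x3(hidden, kernels)  # 3x3 Conv mixes neighbouring positions
    hidden = gelu(hidden)
    out = hidden @ w2 + b2                       # second MLP: C_hidden -> C
    return out + x_in                            # residual connection


# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
H, W, C, C_hidden = 6, 6, 4, 16
w1 = rng.normal(size=(C, C_hidden))
b1 = np.zeros(C_hidden)
kernels = rng.normal(size=(3, 3, C_hidden))
w2 = rng.normal(size=(C_hidden, C))
b2 = np.zeros(C)

# A constant input carries no positional signal of its own.
x_const = np.ones((H, W, C))
y = mix_ffn(x_const, w1, b1, kernels, w2, b2)

interior_same = np.allclose(y[2, 2], y[3, 3])        # both windows lie fully inside
corner_differs = not np.allclose(y[0, 0], y[2, 2])   # corner window overlaps the padding
print(y.shape, interior_same, corner_differs)
```

Because the two MLPs act on each position independently, the $3 \times 3$ Conv is the only step in this sketch where a position sees its neighbours and the zero border.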

Papers Using This Method

- From Pixels to Damage Severity: Estimating Earthquake Impacts Using Semantic Segmentation of Social Media Images (2025-07-03)
- Leveraging Modified Ex Situ Tomography Data for Segmentation of In Situ Synchrotron X-Ray Computed Tomography (2025-04-27)
- SAR Object Detection with Self-Supervised Pretraining and Curriculum-Aware Sampling (2025-04-17)
- Improving underwater semantic segmentation with underwater image quality attention and muti-scale aggregation attention (2025-03-30)
- Comprehensive Evaluation of OCT-based Automated Segmentation of Retinal Layer, Fluid and Hyper-Reflective Foci: Impact on Diabetic Retinopathy Severity Assessment (2025-03-03)
- Cross-Model Transferability of Adversarial Patches in Real-time Segmentation for Autonomous Driving (2025-02-22)
- Global Average Feature Augmentation for Robust Semantic Segmentation with Transformers (2024-12-02)
- SynDiff-AD: Improving Semantic Segmentation and End-to-End Autonomous Driving with Synthetic Data from Latent Diffusion Models (2024-11-25)
- Habaek: High-performance water segmentation through dataset expansion and inductive bias optimization (2024-10-21)
- Risk Assessment for Autonomous Landing in Urban Environments using Semantic Segmentation (2024-10-16)
- Adapting Segment Anything Model to Melanoma Segmentation in Microscopy Slide Images (2024-10-03)
- Semantic Segmentation of Unmanned Aerial Vehicle Remote Sensing Images using SegFormer (2024-10-01)
- AMBER -- Advanced SegFormer for Multi-Band Image Segmentation: an application to Hyperspectral Imaging (2024-09-14)
- SHARP-Net: A Refined Pyramid Network for Deficiency Segmentation in Culverts and Sewer Pipes (2024-08-02)
- MagicBathyNet: A Multimodal Remote Sensing Dataset for Bathymetry Prediction and Pixel-based Classification in Shallow Waters (2024-05-24)
- Segformer++: Efficient Token-Merging Strategies for High-Resolution Semantic Segmentation (2024-05-23)
- wmh_seg: Transformer based U-Net for Robust and Automatic White Matter Hyperintensity Segmentation across 1.5T, 3T and 7T (2024-02-20)
- Kitchen Food Waste Image Segmentation and Classification for Compost Nutrients Estimation (2024-01-26)
- CaBuAr: California Burned Areas dataset for delineation (2024-01-21)
- U-MixFormer: UNet-like Transformer with Mix-Attention for Efficient Semantic Segmentation (2023-12-11)