Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

BiSeNet V2: Bilateral Network with Guided Aggregation for Real-time Semantic Segmentation

Changqian Yu, Changxin Gao, Jingbo Wang, Gang Yu, Chunhua Shen, Nong Sang

2020-04-05 | Tasks: Real-Time Semantic Segmentation, Segmentation, Semantic Segmentation

Abstract

The low-level details and high-level semantics are both essential to the semantic segmentation task. However, to speed up the model inference, current approaches almost always sacrifice the low-level details, which leads to a considerable accuracy decrease. We propose to treat these spatial details and categorical semantics separately to achieve high accuracy and high efficiency for real-time semantic segmentation. To this end, we propose an efficient and effective architecture with a good trade-off between speed and accuracy, termed Bilateral Segmentation Network (BiSeNet V2). This architecture involves: (i) a Detail Branch, with wide channels and shallow layers to capture low-level details and generate high-resolution feature representation; (ii) a Semantic Branch, with narrow channels and deep layers to obtain high-level semantic context. The Semantic Branch is lightweight due to reducing the channel capacity and a fast-downsampling strategy. Furthermore, we design a Guided Aggregation Layer to enhance mutual connections and fuse both types of feature representation. Besides, a booster training strategy is designed to improve the segmentation performance without any extra inference cost. Extensive quantitative and qualitative evaluations demonstrate that the proposed architecture performs favourably against a few state-of-the-art real-time semantic segmentation approaches. Specifically, for a 2,048×1,024 input, we achieve 72.6% Mean IoU on the Cityscapes test set with a speed of 156 FPS on one NVIDIA GeForce GTX 1080 Ti card, which is significantly faster than existing methods, yet we achieve better segmentation accuracy.
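The abstract reports accuracy as Mean IoU (mean intersection-over-union). As a reminder of what that metric measures, here is a minimal pure-Python sketch on a toy example. It is not the authors' evaluation code: the function name `mean_iou`, the flat label lists, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration. Benchmark tooling typically accumulates intersections and unions over the whole dataset before averaging.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in pred or target (toy version)."""
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: ignore classes absent from both
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical per-pixel class labels for a 6-pixel image.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(round(mean_iou(pred, target, 3), 4))  # prints 0.5
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so their mean is 0.5.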

Results

Task | Dataset | Metric | Value | Model
--- | --- | --- | --- | ---
Semantic Segmentation | Cityscapes test | Frame (fps) | 47.3 | BiSeNet V2-Large
Semantic Segmentation | Cityscapes test | Time (ms) | 21.1 | BiSeNet V2-Large
Semantic Segmentation | Cityscapes test | Frame (fps) | 156 | BiSeNet V2
Semantic Segmentation | Cityscapes test | Time (ms) | 6.4 | BiSeNet V2
Semantic Segmentation | CamVid | Frame (fps) | 32.7 | BiSeNet V2-Large (Cityscapes-Pretrained)
Semantic Segmentation | CamVid | Time (ms) | 30.6 | BiSeNet V2-Large (Cityscapes-Pretrained)
Semantic Segmentation | CamVid | mIoU | 78.5 | BiSeNet V2-Large (Cityscapes-Pretrained)
Semantic Segmentation | CamVid | Frame (fps) | 124.5 | BiSeNet V2 (Cityscapes-Pretrained)
Semantic Segmentation | CamVid | Time (ms) | 8 | BiSeNet V2 (Cityscapes-Pretrained)
Semantic Segmentation | CamVid | mIoU | 76.7 | BiSeNet V2 (Cityscapes-Pretrained)
Semantic Segmentation | CamVid | Frame (fps) | 32.7 | BiSeNet V2-Large
Semantic Segmentation | CamVid | Time (ms) | 30.6 | BiSeNet V2-Large
Semantic Segmentation | CamVid | mIoU | 73.2 | BiSeNet V2-Large
Semantic Segmentation | CamVid | Frame (fps) | 124.5 | BiSeNet V2
Semantic Segmentation | CamVid | Time (ms) | 8 | BiSeNet V2
Semantic Segmentation | CamVid | mIoU | 72.4 | BiSeNet V2
Semantic Segmentation | COCO-Stuff | mIoU | 28.7 | BiSeNet V2-Large
Semantic Segmentation | COCO-Stuff | mIoU | 25.2 | BiSeNet V2
Semantic Segmentation | Cityscapes val | Frame (fps) | 47.3 | BiseNetV2-L
Semantic Segmentation | Cityscapes val | Frame (fps) | 156 | BiseNetV2
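
The results list both a frame rate (fps) and a time (ms) for several models. These two columns appear to be two views of the same measurement, related by fps ≈ 1000 / latency in ms. The sketch below checks that relationship against the listed values. It assumes the times are per-image latencies at batch size 1, which this page does not state, and the helper name `fps_from_latency_ms` is illustrative.

```python
def fps_from_latency_ms(latency_ms):
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


# (label, listed time in ms, listed fps), copied from the results above.
rows = [
    ("BiSeNet V2, Cityscapes test", 6.4, 156.0),
    ("BiSeNet V2-Large, Cityscapes test", 21.1, 47.3),
    ("BiSeNet V2, CamVid", 8.0, 124.5),
    ("BiSeNet V2-Large, CamVid", 30.6, 32.7),
]

for label, ms, listed_fps in rows:
    derived = fps_from_latency_ms(ms)
    # Small gaps are expected because the listed values are rounded.
    print(f"{label}: {ms} ms -> {derived:.1f} fps (listed: {listed_fps})")
```

The derived and listed frame rates agree to within rounding; the largest gap is on CamVid, where the listed time of 8 ms gives 125 fps against a listed 124.5 fps.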
