TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/In Defense of Pre-trained ImageNet Architectures for Real-...

In Defense of Pre-trained ImageNet Architectures for Real-time Semantic Segmentation of Road-driving Images

Marin Oršić, Ivan Krešo, Petra Bevandić, Siniša Šegvić

2019-03-20Real-Time Semantic SegmentationSemantic Segmentation
PaperPDFCode(official)CodeCodeCodeCodeCode

Abstract

Recent success of semantic segmentation approaches on demanding road driving datasets has spurred interest in many related application fields. Many of these applications involve real-time prediction on mobile platforms such as cars, drones and various kinds of robots. Real-time setup is challenging due to extraordinary computational complexity involved. Many previous works address the challenge with custom lightweight architectures which decrease computational complexity by reducing depth, width and layer capacity with respect to general purpose architectures. We propose an alternative approach which achieves a significantly better performance across a wide range of computing budgets. First, we rely on a light-weight general purpose architecture as the main recognition engine. Then, we leverage light-weight upsampling with lateral connections as the most cost-effective solution to restore the prediction resolution. Finally, we propose to enlarge the receptive field by fusing shared features at multiple resolutions in a novel fashion. Experiments on several road driving datasets show a substantial advantage of the proposed approach, either with ImageNet pre-trained parameters or when we learn from scratch. Our Cityscapes test submission entitled SwiftNetRN-18 delivers 75.5% MIoU and achieves 39.9 Hz on 1024x2048 images on GTX1080Ti.

Results

TaskDatasetMetricValueModel
Semantic SegmentationZJU-RGB-PmIoU80.3SwiftNet (RGB)
Semantic SegmentationEventScapemIoU36.67SwiftNet
Semantic SegmentationDADA-segmIoU20.5SwiftNet (ResNet-18)
10-shot image generationZJU-RGB-PmIoU80.3SwiftNet (RGB)
10-shot image generationEventScapemIoU36.67SwiftNet
10-shot image generationDADA-segmIoU20.5SwiftNet (ResNet-18)

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction2025-07-21DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model2025-07-17SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation2025-07-17Unified Medical Image Segmentation with State Space Modeling Snake2025-07-17A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique2025-07-17SAMST: A Transformer framework based on SAM pseudo label filtering for remote sensing semi-supervised semantic segmentation2025-07-16Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping2025-07-15U-RWKV: Lightweight medical image segmentation with direction-adaptive RWKV2025-07-15