Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results

Antti Tarvainen, Harri Valpola

2017-03-06 | NeurIPS 2017 12
Tasks: Semi-Supervised Semantic Segmentation, Semi-Supervised RGBD Semantic Segmentation, Source Free Object Detection, Semi-Supervised Image Classification

Abstract

The recently proposed Temporal Ensembling has achieved state-of-the-art results in several semi-supervised learning benchmarks. It maintains an exponential moving average of label predictions on each training example, and penalizes predictions that are inconsistent with this target. However, because the targets change only once per epoch, Temporal Ensembling becomes unwieldy when learning large datasets. To overcome this problem, we propose Mean Teacher, a method that averages model weights instead of label predictions. As an additional benefit, Mean Teacher improves test accuracy and enables training with fewer labels than Temporal Ensembling. Without changing the network architecture, Mean Teacher achieves an error rate of 4.35% on SVHN with 250 labels, outperforming Temporal Ensembling trained with 1000 labels. We also show that a good network architecture is crucial to performance. Combining Mean Teacher and Residual Networks, we improve the state of the art on CIFAR-10 with 4000 labels from 10.55% to 6.28%, and on ImageNet 2012 with 10% of the labels from 35.24% to 9.11%.
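The weight-averaging idea in the abstract can be illustrated with a minimal sketch. It assumes the teacher is an exponential moving average of the student's weights and that consistency is measured as a mean squared error between softmax outputs. The flat weight lists, the `alpha` value, and the function names are hypothetical choices for illustration, not the authors' implementation.

```python
import math

def ema_update(teacher, student, alpha=0.99):
    """Move each teacher weight toward the matching student weight.

    alpha is the smoothing coefficient (hypothetical default); a larger
    alpha makes the teacher change more slowly.
    """
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def consistency_cost(student_logits, teacher_logits):
    """Mean squared difference between student and teacher predictions."""
    p = softmax(student_logits)
    q = softmax(teacher_logits)
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)

# Toy run: the teacher starts as a copy of the student and is refreshed
# after every training step, not once per epoch.
student = [1.0, -2.0, 0.5]
teacher = list(student)
for step in range(3):
    student = [w - 0.1 for w in student]  # stand-in for a gradient step
    teacher = ema_update(teacher, student, alpha=0.9)

print("student:", student)
print("teacher:", teacher)
print("consistency:", consistency_cost(student, teacher))
```

In a real training loop the consistency term would be added to the supervised loss on labeled examples, and only the student would receive gradient updates.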

Results

Task | Dataset | Metric | Value | Model
Domain Adaptation | Cityscapes to Foggy Cityscapes | AP50 | 34.3 | MT
Semantic Segmentation | ScribbleKITTI | mIoU (1% Labels) | 41 | MeanTeacher (Voxel)
Semantic Segmentation | ScribbleKITTI | mIoU (10% Labels) | 50.1 | MeanTeacher (Voxel)
Semantic Segmentation | ScribbleKITTI | mIoU (20% Labels) | 52.8 | MeanTeacher (Voxel)
Semantic Segmentation | ScribbleKITTI | mIoU (50% Labels) | 53.9 | MeanTeacher (Voxel)
Semantic Segmentation | ScribbleKITTI | mIoU (1% Labels) | 34.2 | MeanTeacher (Range View)
Semantic Segmentation | ScribbleKITTI | mIoU (10% Labels) | 49.8 | MeanTeacher (Range View)
Semantic Segmentation | ScribbleKITTI | mIoU (20% Labels) | 51.6 | MeanTeacher (Range View)
Semantic Segmentation | ScribbleKITTI | mIoU (50% Labels) | 53.3 | MeanTeacher (Range View)
Semantic Segmentation | SemanticKITTI | mIoU (1% Labels) | 45.4 | MeanTeacher (Voxel)
Semantic Segmentation | SemanticKITTI | mIoU (10% Labels) | 57.1 | MeanTeacher (Voxel)
Semantic Segmentation | SemanticKITTI | mIoU (20% Labels) | 59.2 | MeanTeacher (Voxel)
Semantic Segmentation | SemanticKITTI | mIoU (50% Labels) | 60 | MeanTeacher (Voxel)
Semantic Segmentation | SemanticKITTI | mIoU (1% Labels) | 37.5 | MeanTeacher (Range View)
Semantic Segmentation | SemanticKITTI | mIoU (10% Labels) | 53.1 | MeanTeacher (Range View)
Semantic Segmentation | SemanticKITTI | mIoU (20% Labels) | 56.1 | MeanTeacher (Range View)
Semantic Segmentation | SemanticKITTI | mIoU (50% Labels) | 57.4 | MeanTeacher (Range View)
Semantic Segmentation | nuScenes | mIoU (1% Labels) | 51.6 | MeanTeacher (Voxel)
Semantic Segmentation | nuScenes | mIoU (10% Labels) | 66 | MeanTeacher (Voxel)
Semantic Segmentation | nuScenes | mIoU (20% Labels) | 67.1 | MeanTeacher (Voxel)
Semantic Segmentation | nuScenes | mIoU (50% Labels) | 71.7 | MeanTeacher (Voxel)
Semantic Segmentation | nuScenes | mIoU (1% Labels) | 42.1 | MeanTeacher (Range View)
Semantic Segmentation | nuScenes | mIoU (10% Labels) | 60.4 | MeanTeacher (Range View)
Semantic Segmentation | nuScenes | mIoU (20% Labels) | 65.4 | MeanTeacher (Range View)
Semantic Segmentation | nuScenes | mIoU (50% Labels) | 69.4 | MeanTeacher (Range View)
Image Classification | CIFAR-10, 4000 Labels | Percentage error | 6.28 | Mean Teacher
Image Classification | SVHN, 1000 labels | Accuracy | 96.05 | Mean Teacher
Image Classification | SVHN, 250 Labels | Accuracy | 93.55 | MeanTeacher
Image Classification | CIFAR-10, 250 Labels | Percentage error | 47.32 | MeanTeacher
Semi-Supervised Image Classification | CIFAR-10, 4000 Labels | Percentage error | 6.28 | Mean Teacher
Semi-Supervised Image Classification | SVHN, 1000 labels | Accuracy | 96.05 | Mean Teacher
Semi-Supervised Image Classification | SVHN, 250 Labels | Accuracy | 93.55 | MeanTeacher
Semi-Supervised Image Classification | CIFAR-10, 250 Labels | Percentage error | 47.32 | MeanTeacher
10-shot image generation | ScribbleKITTI | mIoU (1% Labels) | 41 | MeanTeacher (Voxel)
10-shot image generation | ScribbleKITTI | mIoU (10% Labels) | 50.1 | MeanTeacher (Voxel)
10-shot image generation | ScribbleKITTI | mIoU (20% Labels) | 52.8 | MeanTeacher (Voxel)
10-shot image generation | ScribbleKITTI | mIoU (50% Labels) | 53.9 | MeanTeacher (Voxel)
10-shot image generation | ScribbleKITTI | mIoU (1% Labels) | 34.2 | MeanTeacher (Range View)
10-shot image generation | ScribbleKITTI | mIoU (10% Labels) | 49.8 | MeanTeacher (Range View)
10-shot image generation | ScribbleKITTI | mIoU (20% Labels) | 51.6 | MeanTeacher (Range View)
10-shot image generation | ScribbleKITTI | mIoU (50% Labels) | 53.3 | MeanTeacher (Range View)
10-shot image generation | SemanticKITTI | mIoU (1% Labels) | 45.4 | MeanTeacher (Voxel)
10-shot image generation | SemanticKITTI | mIoU (10% Labels) | 57.1 | MeanTeacher (Voxel)
10-shot image generation | SemanticKITTI | mIoU (20% Labels) | 59.2 | MeanTeacher (Voxel)
10-shot image generation | SemanticKITTI | mIoU (50% Labels) | 60 | MeanTeacher (Voxel)
10-shot image generation | SemanticKITTI | mIoU (1% Labels) | 37.5 | MeanTeacher (Range View)
10-shot image generation | SemanticKITTI | mIoU (10% Labels) | 53.1 | MeanTeacher (Range View)
10-shot image generation | SemanticKITTI | mIoU (20% Labels) | 56.1 | MeanTeacher (Range View)
10-shot image generation | SemanticKITTI | mIoU (50% Labels) | 57.4 | MeanTeacher (Range View)
10-shot image generation | nuScenes | mIoU (1% Labels) | 51.6 | MeanTeacher (Voxel)
10-shot image generation | nuScenes | mIoU (10% Labels) | 66 | MeanTeacher (Voxel)
10-shot image generation | nuScenes | mIoU (20% Labels) | 67.1 | MeanTeacher (Voxel)
10-shot image generation | nuScenes | mIoU (50% Labels) | 71.7 | MeanTeacher (Voxel)
10-shot image generation | nuScenes | mIoU (1% Labels) | 42.1 | MeanTeacher (Range View)
10-shot image generation | nuScenes | mIoU (10% Labels) | 60.4 | MeanTeacher (Range View)
10-shot image generation | nuScenes | mIoU (20% Labels) | 65.4 | MeanTeacher (Range View)
10-shot image generation | nuScenes | mIoU (50% Labels) | 69.4 | MeanTeacher (Range View)
Source-Free Domain Adaptation | Cityscapes to Foggy Cityscapes | AP50 | 34.3 | MT
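
The image classification rows above mix two metrics, Accuracy and Percentage error. The short sketch below converts the accuracy entries to percentage error so they can be read on the same scale as the others. It assumes both metrics are percentages over the same test set, and the helper name `accuracy_to_error` is hypothetical.

```python
def accuracy_to_error(accuracy_pct):
    """Convert an accuracy percentage to a percentage error."""
    return round(100.0 - accuracy_pct, 2)

# Accuracy entries copied from the SVHN rows of the results table.
svhn_accuracy = {
    "SVHN, 1000 labels": 96.05,
    "SVHN, 250 Labels": 93.55,
}

for dataset, acc in svhn_accuracy.items():
    print(dataset, "percentage error:", accuracy_to_error(acc))
```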

Related Papers

SAMST: A Transformer framework based on SAM pseudo label filtering for remote sensing semi-supervised semantic segmentation (2025-07-16)
DEARLi: Decoupled Enhancement of Recognition and Localization for Semi-supervised Panoptic Segmentation (2025-07-14)
Leveraging Out-of-Distribution Unlabeled Images: Semi-Supervised Semantic Segmentation with an Open-Vocabulary Model (2025-07-04)
HierVL: Semi-Supervised Segmentation leveraging Hierarchical Vision-Language Synergy with Dynamic Text-Spatial Query Alignment (2025-06-16)
FARCLUSS: Fuzzy Adaptive Rebalancing and Contrastive Uncertainty Learning for Semi-Supervised Semantic Segmentation (2025-06-11)
RS-MTDF: Multi-Teacher Distillation and Fusion for Remote Sensing Semi-Supervised Semantic Segmentation (2025-06-10)
ViTSGMM: A Robust Semi-Supervised Image Recognition Network Using Sparse Labels (2025-06-04)
Adaptive Spatial Augmentation for Semi-supervised Semantic Segmentation (2025-05-29)