Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results

Antti Tarvainen, Harri Valpola

2017-03-06 | NeurIPS 2017 12

Tasks: Semi-Supervised Semantic Segmentation, Semi-Supervised RGBD Semantic Segmentation, Source Free Object Detection, Semi-Supervised Image Classification


Abstract

The recently proposed Temporal Ensembling has achieved state-of-the-art results in several semi-supervised learning benchmarks. It maintains an exponential moving average of label predictions on each training example, and penalizes predictions that are inconsistent with this target. However, because the targets change only once per epoch, Temporal Ensembling becomes unwieldy when learning large datasets. To overcome this problem, we propose Mean Teacher, a method that averages model weights instead of label predictions. As an additional benefit, Mean Teacher improves test accuracy and enables training with fewer labels than Temporal Ensembling. Without changing the network architecture, Mean Teacher achieves an error rate of 4.35% on SVHN with 250 labels, outperforming Temporal Ensembling trained with 1000 labels. We also show that a good network architecture is crucial to performance. Combining Mean Teacher and Residual Networks, we improve the state of the art on CIFAR-10 with 4000 labels from 10.55% to 6.28%, and on ImageNet 2012 with 10% of the labels from 35.24% to 9.11%.
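The core idea in the abstract, averaging model weights rather than label predictions, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `ema_update` and `consistency_cost`, the dict-of-floats weight layout, the decay values, and the choice of mean squared error as the consistency penalty are all assumptions made for illustration.

```python
def ema_update(teacher, student, alpha=0.99):
    """Blend student weights into the teacher: t <- alpha * t + (1 - alpha) * s."""
    for name, value in student.items():
        teacher[name] = alpha * teacher[name] + (1.0 - alpha) * value
    return teacher


def consistency_cost(student_probs, teacher_probs):
    """Mean squared difference between student and teacher class probabilities.

    Assumption: squared error is used here only as one plausible penalty.
    """
    diffs = [(s - t) ** 2 for s, t in zip(student_probs, teacher_probs)]
    return sum(diffs) / len(diffs)


# Toy usage: weights are plain floats keyed by name (hypothetical layout).
student = {"w": 1.0, "b": -2.0}
teacher = dict(student)                  # teacher starts as a copy of the student
student["w"] = 2.0                       # pretend one training step moved the student
ema_update(teacher, student, alpha=0.9)  # teacher["w"] moves a tenth of the way to 2.0
```

Because the teacher can be refreshed after every training step rather than once per epoch, its targets stay current even on large datasets, which is the limitation of Temporal Ensembling that the abstract points to.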

Results

Task	Dataset	Metric	Value	Model
Domain Adaptation	Cityscapes to Foggy Cityscapes	AP50	34.3	MT
Semantic Segmentation	ScribbleKITTI	mIoU (1% Labels)	41	MeanTeacher (Voxel)
Semantic Segmentation	ScribbleKITTI	mIoU (10% Labels)	50.1	MeanTeacher (Voxel)
Semantic Segmentation	ScribbleKITTI	mIoU (20% Labels)	52.8	MeanTeacher (Voxel)
Semantic Segmentation	ScribbleKITTI	mIoU (50% Labels)	53.9	MeanTeacher (Voxel)
Semantic Segmentation	ScribbleKITTI	mIoU (1% Labels)	34.2	MeanTeacher (Range View)
Semantic Segmentation	ScribbleKITTI	mIoU (10% Labels)	49.8	MeanTeacher (Range View)
Semantic Segmentation	ScribbleKITTI	mIoU (20% Labels)	51.6	MeanTeacher (Range View)
Semantic Segmentation	ScribbleKITTI	mIoU (50% Labels)	53.3	MeanTeacher (Range View)
Semantic Segmentation	SemanticKITTI	mIoU (1% Labels)	45.4	MeanTeacher (Voxel)
Semantic Segmentation	SemanticKITTI	mIoU (10% Labels)	57.1	MeanTeacher (Voxel)
Semantic Segmentation	SemanticKITTI	mIoU (20% Labels)	59.2	MeanTeacher (Voxel)
Semantic Segmentation	SemanticKITTI	mIoU (50% Labels)	60	MeanTeacher (Voxel)
Semantic Segmentation	SemanticKITTI	mIoU (1% Labels)	37.5	MeanTeacher (Range View)
Semantic Segmentation	SemanticKITTI	mIoU (10% Labels)	53.1	MeanTeacher (Range View)
Semantic Segmentation	SemanticKITTI	mIoU (20% Labels)	56.1	MeanTeacher (Range View)
Semantic Segmentation	SemanticKITTI	mIoU (50% Labels)	57.4	MeanTeacher (Range View)
Semantic Segmentation	nuScenes	mIoU (1% Labels)	51.6	MeanTeacher (Voxel)
Semantic Segmentation	nuScenes	mIoU (10% Labels)	66	MeanTeacher (Voxel)
Semantic Segmentation	nuScenes	mIoU (20% Labels)	67.1	MeanTeacher (Voxel)
Semantic Segmentation	nuScenes	mIoU (50% Labels)	71.7	MeanTeacher (Voxel)
Semantic Segmentation	nuScenes	mIoU (1% Labels)	42.1	MeanTeacher (Range View)
Semantic Segmentation	nuScenes	mIoU (10% Labels)	60.4	MeanTeacher (Range View)
Semantic Segmentation	nuScenes	mIoU (20% Labels)	65.4	MeanTeacher (Range View)
Semantic Segmentation	nuScenes	mIoU (50% Labels)	69.4	MeanTeacher (Range View)
Image Classification	CIFAR-10, 4000 Labels	Percentage error	6.28	Mean Teacher
Image Classification	SVHN, 1000 labels	Accuracy	96.05	Mean Teacher
Image Classification	SVHN, 250 Labels	Accuracy	93.55	MeanTeacher
Image Classification	CIFAR-10, 250 Labels	Percentage error	47.32	MeanTeacher
Semi-Supervised Image Classification	CIFAR-10, 4000 Labels	Percentage error	6.28	Mean Teacher
Semi-Supervised Image Classification	SVHN, 1000 labels	Accuracy	96.05	Mean Teacher
Semi-Supervised Image Classification	SVHN, 250 Labels	Accuracy	93.55	MeanTeacher
Semi-Supervised Image Classification	CIFAR-10, 250 Labels	Percentage error	47.32	MeanTeacher
Source-Free Domain Adaptation	Cityscapes to Foggy Cityscapes	AP50	34.3	MT
