Learning from Spatio-temporal Correlation for Semi-Supervised LiDAR Semantic Segmentation

Seungho Lee, Hwijeong Lee, Hyunjung Shim

2024-10-09
Semi-Supervised Semantic Segmentation, Semantic Segmentation, LIDAR Semantic Segmentation

Abstract

We address the challenges of semi-supervised LiDAR segmentation (SSLS), particularly in low-budget scenarios. The two main issues in low-budget SSLS are poor-quality pseudo-labels for unlabeled data and the performance drop caused by the significant imbalance between ground-truth labels and pseudo-labels. This imbalance leads to a vicious training cycle. To overcome these challenges, we leverage a spatio-temporal prior: temporally adjacent LiDAR scans overlap substantially. We propose proximity-based label estimation, which generates highly accurate pseudo-labels for unlabeled data by exploiting semantic consistency with adjacent labeled data. We further enhance this method by progressively expanding the pseudo-labels outward from the nearest unlabeled scans, which significantly reduces errors linked to dynamic classes. In addition, we employ a dual-branch structure to mitigate the performance degradation caused by data imbalance. Experimental results demonstrate remarkable performance in low-budget settings (i.e., <= 5%) and meaningful improvements in normal-budget settings (i.e., 5-50%). Our method achieves new state-of-the-art results on SemanticKITTI and nuScenes in semi-supervised LiDAR segmentation. With only 5% labeled data, it is competitive with fully-supervised counterparts. Moreover, on nuScenes it surpasses the previous state of the art trained with 100% labeled data (75.2%) while using only 20% of the labeled data (76.0%). The code is available at https://github.com/halbielee/PLE.
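The idea of proximity-based label estimation with progressive expansion can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it is a minimal nearest-neighbour label transfer written under several assumptions: all scans are already registered in a common world frame, a point inherits the label of its nearest labeled neighbour only within a hypothetical distance threshold `max_dist`, and the names `estimate_labels`, `propagate`, and `IGNORE` are invented for illustration.

```python
import numpy as np

IGNORE = -1  # hypothetical marker for "no pseudo-label assigned"


def estimate_labels(src_pts, src_labels, dst_pts, max_dist=0.2):
    """Give each destination point the label of its nearest source point.

    Destination points with no labeled source point within max_dist
    keep the IGNORE marker. Brute-force search, for toy inputs only.
    """
    out = np.full(len(dst_pts), IGNORE, dtype=np.int64)
    valid = src_labels != IGNORE
    if not valid.any():
        return out
    src_pts, src_labels = src_pts[valid], src_labels[valid]
    # Pairwise distances, shape (num_dst, num_src).
    dist = np.linalg.norm(dst_pts[:, None, :] - src_pts[None, :, :], axis=2)
    nearest = dist.argmin(axis=1)
    close = dist[np.arange(len(dst_pts)), nearest] <= max_dist
    out[close] = src_labels[nearest[close]]
    return out


def propagate(scans, labels, labeled_idx, max_dist=0.2):
    """Expand labels outward from one labeled scan, nearest scans first.

    Each scan takes its pseudo-labels from the neighbouring scan that is
    one step closer to the labeled scan, so labels spread progressively.
    """
    result = {labeled_idx: labels}
    for step in range(1, len(scans)):
        for i in (labeled_idx - step, labeled_idx + step):
            if 0 <= i < len(scans):
                prev = i + 1 if i < labeled_idx else i - 1
                result[i] = estimate_labels(
                    scans[prev], result[prev], scans[i], max_dist
                )
    return [result[i] for i in range(len(scans))]


# Toy sequence of three scans; only scan 0 has ground-truth labels.
scans = [
    np.array([[0.00, 0.0, 0.0], [5.00, 0.0, 0.0]]),
    np.array([[0.05, 0.0, 0.0], [5.05, 0.0, 0.0]]),
    np.array([[0.10, 0.0, 0.0], [9.00, 0.0, 0.0]]),  # second point moved far away
]
pseudo = propagate(scans, np.array([1, 2]), labeled_idx=0)
```

In this toy sequence the point that moved far away in the last scan receives no pseudo-label, which mirrors, in a very simplified way, why stepping through nearby scans one at a time can limit errors on moving objects.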

Results

Task	Dataset	Metric	Value	Model
Semantic Segmentation	SemanticKITTI	mIoU (0.5% Labels)	52.2	PLE (Voxel)
Semantic Segmentation	SemanticKITTI	mIoU (1% Labels)	61.1	PLE (Voxel)
Semantic Segmentation	SemanticKITTI	mIoU (2% Labels)	62.9	PLE (Voxel)
Semantic Segmentation	SemanticKITTI	mIoU (5% Labels)	62.8	PLE (Voxel)
Semantic Segmentation	SemanticKITTI	mIoU (10% Labels)	63.1	PLE (Voxel)
Semantic Segmentation	SemanticKITTI	mIoU (20% Labels)	64.1	PLE (Voxel)
Semantic Segmentation	SemanticKITTI	mIoU (50% Labels)	64.3	PLE (Voxel)
Semantic Segmentation	SemanticKITTI	mIoU (0.5% Labels)	47.3	LaserMix (Voxel)
Semantic Segmentation	SemanticKITTI	mIoU (2% Labels)	59.2	LaserMix (Voxel)
Semantic Segmentation	SemanticKITTI	mIoU (5% Labels)	61.7	LaserMix (Voxel)
Semantic Segmentation	SemanticKITTI	mIoU (0.5% Labels)	46.2	PLE (CENet, Range view)
Semantic Segmentation	SemanticKITTI	mIoU (1% Labels)	51.5	PLE (CENet, Range view)
Semantic Segmentation	SemanticKITTI	mIoU (2% Labels)	54.3	PLE (CENet, Range view)
Semantic Segmentation	SemanticKITTI	mIoU (5% Labels)	58.1	PLE (CENet, Range view)
Semantic Segmentation	nuScenes	mIoU (0.5% Labels)	58.0	PLE (Voxel)
Semantic Segmentation	nuScenes	mIoU (1% Labels)	62.9	PLE (Voxel)
Semantic Segmentation	nuScenes	mIoU (2% Labels)	67.2	PLE (Voxel)
Semantic Segmentation	nuScenes	mIoU (5% Labels)	72.8	PLE (Voxel)
Semantic Segmentation	nuScenes	mIoU (10% Labels)	74.3	PLE (Voxel)
Semantic Segmentation	nuScenes	mIoU (20% Labels)	76.0	PLE (Voxel)
Semantic Segmentation	nuScenes	mIoU (50% Labels)	76.1	PLE (Voxel)
Semantic Segmentation	nuScenes	mIoU (0.5% Labels)	51.4	LaserMix (Voxel)
Semantic Segmentation	nuScenes	mIoU (2% Labels)	63.9	LaserMix (Voxel)
Semantic Segmentation	nuScenes	mIoU (5% Labels)	69.7	LaserMix (Voxel)
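
The values in the table above are mean intersection-over-union (mIoU) scores, reported in percent. As a reminder of how this metric is commonly computed, here is a minimal sketch based on a confusion matrix. The function name `mean_iou` and the `ignore_index` convention are assumptions for illustration; the official SemanticKITTI and nuScenes evaluation tools accumulate statistics over the whole validation set and apply their own class mappings, which this sketch does not reproduce.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index=-1):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are 1-D integer arrays of per-point class ids;
    points whose target equals ignore_index are excluded.
    """
    keep = target != ignore_index
    pred, target = pred[keep], target[keep]
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    tp = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0  # skip classes absent from both arrays
    return float((tp[present] / union[present]).mean())


target = np.array([0, 0, 1, 1, 2])
pred = np.array([0, 1, 1, 1, 2])
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {100 * score:.1f}%")
```

For the toy arrays, the per-class IoUs are 1/2, 2/3, and 1, so the mean is 13/18, or about 72.2%.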
