Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Unsupervised Selective Labeling for More Effective Semi-Supervised Learning

Xudong Wang, Long Lian, Stella X. Yu

2021-10-06 | Active Learning | Semi-Supervised Image Classification (Cold Start)

Abstract

Given an unlabeled dataset and an annotation budget, we study how to selectively label a fixed number of instances so that semi-supervised learning (SSL) on such a partially labeled dataset is most effective. We focus on selecting the right data to label, in addition to the usual SSL step of propagating labels from labeled data to the remaining unlabeled data. This instance selection task is challenging, as without any labeled data we do not know what the objective of learning should be. Intuitively, no matter what the downstream task is, instances to be labeled must be representative and diverse: the former would facilitate label propagation to unlabeled data, whereas the latter would ensure coverage of the entire dataset. We capture this idea by selecting cluster prototypes, either in a pretrained feature space, or along with feature optimization, both without labels. Our unsupervised selective labeling consistently improves SSL methods over state-of-the-art active learning given labeled data, by 8 to 25 times in label efficiency. For example, it boosts FixMatch by 10% (14%) in accuracy on CIFAR-10 (ImageNet-1K) with 0.08% (0.2%) labeled data, demonstrating that small computation spent on selecting what data to label brings significant gains, especially under a low annotation budget. Our work sets a new standard for practical and efficient SSL.
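
The idea of selecting cluster prototypes in a fixed feature space can be illustrated with a minimal sketch. This is not the authors' USL or USL-T procedure: it swaps in plain k-means with a farthest-first initialization and then labels the sample nearest each centroid. The function names (`pairwise_sq_dists`, `farthest_first_init`, `select_prototypes`) and the toy data are hypothetical, and the code assumes the features have already been extracted by some pretrained model and that the budget does not exceed the number of samples.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between rows of a (n, d) and rows of b (k, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return (diff ** 2).sum(axis=2)


def farthest_first_init(features, k):
    """Deterministic, spread-out initial centroids via farthest-first traversal."""
    mean = features.mean(axis=0, keepdims=True)
    first = int(pairwise_sq_dists(features, mean)[:, 0].argmin())
    chosen = [first]
    # Distance from every sample to its closest already-chosen sample.
    min_d2 = pairwise_sq_dists(features, features[[first]])[:, 0]
    while len(chosen) < k:
        nxt = int(min_d2.argmax())
        chosen.append(nxt)
        d2_new = pairwise_sq_dists(features, features[[nxt]])[:, 0]
        min_d2 = np.minimum(min_d2, d2_new)
    return features[chosen].astype(float)


def select_prototypes(features, budget, n_iter=50):
    """Return up to `budget` sample indices, one per k-means cluster.

    Diversity: at most one sample is taken from each cluster.
    Representativeness: the sample taken is the one closest to the centroid.
    """
    features = np.asarray(features, dtype=float)
    centroids = farthest_first_init(features, budget)
    for _ in range(n_iter):
        assign = pairwise_sq_dists(features, centroids).argmin(axis=1)
        new_centroids = centroids.copy()
        for k in range(budget):
            members = features[assign == k]
            if len(members):  # keep the old centroid if a cluster is empty
                new_centroids[k] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    d2 = pairwise_sq_dists(features, centroids)
    assign = d2.argmin(axis=1)
    selected = []
    for k in range(budget):
        idx = np.flatnonzero(assign == k)
        if idx.size:
            selected.append(int(idx[d2[idx, k].argmin()]))
    return selected


if __name__ == "__main__":
    # Toy stand-in for pretrained features: three well-separated 2-D blobs.
    rng = np.random.default_rng(0)
    centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
    feats = np.concatenate(
        [c + rng.uniform(-0.5, 0.5, size=(20, 2)) for c in centers]
    )
    print(select_prototypes(feats, budget=3))
```

The dense distance matrix keeps the sketch short but uses memory proportional to samples × clusters × feature dimension, so it is only suitable for small demonstrations.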

Results

Task                                  Dataset               Metric            Value  Model
Image Classification                  CIFAR-10, 100 Labels  Percentage error  6.8    FixMatch-USL
Image Classification                  CIFAR-10, 40 Labels   Percentage error  6.5    FixMatch-USL-T
Semi-Supervised Image Classification  CIFAR-10, 100 Labels  Percentage error  6.8    FixMatch-USL
Semi-Supervised Image Classification  CIFAR-10, 40 Labels   Percentage error  6.5    FixMatch-USL-T
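
To relate the label counts in these results to the percentages quoted in the abstract, the arithmetic can be sketched as follows. The helper names are hypothetical, and the sketch assumes the labeled fraction is measured against the standard 50,000-image CIFAR-10 training split, under which 40 labels correspond to the 0.08% figure.

```python
CIFAR10_TRAIN_SIZE = 50_000  # assumption: fraction is taken over the training split


def label_fraction_percent(num_labels, dataset_size):
    """Share of the dataset that is labeled, in percent."""
    return 100.0 * num_labels / dataset_size


def accuracy_from_error(percent_error):
    """Convert a percentage error into a percentage accuracy."""
    return 100.0 - percent_error


if __name__ == "__main__":
    for num_labels, error in [(40, 6.5), (100, 6.8)]:
        frac = label_fraction_percent(num_labels, CIFAR10_TRAIN_SIZE)
        acc = accuracy_from_error(error)
        print(f"{num_labels} labels = {frac:.2f}% of data, accuracy = {acc:.1f}%")
```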

Related Papers

A Risk-Aware Adaptive Robust MPC with Learned Uncertainty Quantification (2025-07-15)
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization (2025-07-08)
MP-ALOE: An r2SCAN dataset for universal machine learning interatomic potentials (2025-07-08)
Active Learning for Manifold Gaussian Process Regression (2025-06-26)
Machine-Learning-Assisted Photonic Device Development: A Multiscale Approach from Theory to Characterization (2025-06-24)
Active Learning-Guided Seq2Seq Variational Autoencoder for Multi-target Inhibitor Generation (2025-06-18)
Bayesian Active Learning of (small) Quantile Sets through Expected Estimator Modification (2025-06-16)
Coupled reaction and diffusion governing interface evolution in solid-state batteries (2025-06-12)