Evelyn J. Mannix, Howard D. Bondell
In many machine learning applications, labeling datasets can be an arduous and time-consuming task. Although research has shown that semi-supervised learning techniques can achieve high accuracy with very few labels within the field of computer vision, little attention has been given to how images within a dataset should be selected for labeling. In this paper, we propose a novel approach, based on well-established self-supervised learning, clustering, and manifold learning techniques, that addresses the challenge of selecting an informative image subset to label in the first instance, known as the cold-start or unsupervised selective labeling problem. We test our approach on several publicly available datasets, namely CIFAR-10, Imagenette, DeepWeeds, and EuroSAT, and observe improved performance with both supervised and semi-supervised learning strategies when our label selection strategy is used in place of random sampling. For the datasets considered, we also obtain superior performance with a much simpler approach than other methods in the literature.
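To make the selection idea concrete, the following is a minimal sketch of choosing images to label by clustering embeddings and taking the cluster medoids. It is an illustration under stated assumptions, not the authors' implementation: the embeddings are random stand-ins for the output of a self-supervised encoder, the function name `select_medoids` and its parameters are hypothetical, cosine distance is an assumed choice, the k-medoids update is a simple alternating scheme, and the manifold learning step mentioned in the abstract is omitted.

```python
import numpy as np


def select_medoids(embeddings, n_labels, n_iter=50, seed=0):
    """Return indices of n_labels representative rows (hypothetical helper).

    Uses a simple alternating k-medoids on cosine distance. This is an
    illustrative sketch, not the procedure from the paper.
    """
    rng = np.random.default_rng(seed)
    # L2-normalise rows so that 1 - dot product equals cosine distance.
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    dist = 1.0 - x @ x.T
    # Start from a random set of distinct points as medoids.
    medoids = rng.choice(len(x), size=n_labels, replace=False)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest medoid.
        assign = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for k in range(n_labels):
            members = np.flatnonzero(assign == k)
            if members.size == 0:
                continue  # keep the old medoid if a cluster empties
            # Update step: the new medoid minimises the total distance
            # to the other members of its cluster.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[k] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break  # converged
        medoids = new_medoids
    return np.sort(medoids)


# Stand-in for encoder outputs: 200 images, 16-dimensional embeddings.
embeddings = np.random.default_rng(1).normal(size=(200, 16))
chosen = select_medoids(embeddings, n_labels=10)
print(chosen)  # indices of the images one would send for labeling
```

The returned indices identify the images to annotate first. Alternating k-medoids can settle in a local optimum, so in practice one might rerun it with several seeds and keep the solution with the lowest total distance.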
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Classification | EuroSAT, 20 Labels | Percentage error | 3.8 | SimCLR-kmediods-PAWS |
| Image Classification | Imagenette, 20 Labels | Percentage error | 10.8 | SimCLR-kmediods-PAWS |
| Image Classification | CIFAR-10, 30 Labels | Percentage error | 6.4 | SimCLR-kmediods-PAWS |
| Image Classification | EuroSAT, 100 Labels | Percentage error | 2.6 | SimCLR-kmediods-PAWS |
| Image Classification | Imagenette, 100 Labels | Percentage error | 6.1 | SimCLR-kmediods-PAWS |
| Image Classification | DeepWeeds, 99 Labels | Percentage error | 19.6 | SimCLR-kmediods-finetuned |
| Image Classification | CIFAR-10, 100 Labels | Percentage error | 6.1 | SimCLR-kmediods-PAWS |
| Semi-Supervised Image Classification | EuroSAT, 20 Labels | Percentage error | 3.8 | SimCLR-kmediods-PAWS |
| Semi-Supervised Image Classification | Imagenette, 20 Labels | Percentage error | 10.8 | SimCLR-kmediods-PAWS |
| Semi-Supervised Image Classification | CIFAR-10, 30 Labels | Percentage error | 6.4 | SimCLR-kmediods-PAWS |
| Semi-Supervised Image Classification | EuroSAT, 100 Labels | Percentage error | 2.6 | SimCLR-kmediods-PAWS |
| Semi-Supervised Image Classification | Imagenette, 100 Labels | Percentage error | 6.1 | SimCLR-kmediods-PAWS |
| Semi-Supervised Image Classification | DeepWeeds, 99 Labels | Percentage error | 19.6 | SimCLR-kmediods-finetuned |
| Semi-Supervised Image Classification | CIFAR-10, 100 Labels | Percentage error | 6.1 | SimCLR-kmediods-PAWS |