Rui Yann, Xianglei Xing
We present ViTSGMM, an image recognition network that leverages semi-supervised learning in a highly efficient manner. Existing methods often rely on complex training techniques and architectures, yet their ability to generalize from extremely limited labeled data remains to be improved. To address these limitations, we construct a hierarchical mixture density classification decision mechanism by optimizing the mutual information between feature representations and target classes, compressing redundant information while retaining crucial discriminative components. Experimental results demonstrate that our method achieves state-of-the-art performance on the STL-10 and CIFAR-10/100 datasets using a negligible number of labeled samples. Notably, this paper also reveals a long-overlooked data leakage issue in the STL-10 dataset for semi-supervised learning tasks and removes duplicates to ensure the reliability of experimental results. Code is available at https://github.com/Shu1L0n9/ViTSGMM.
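To illustrate the core idea of a mixture density classification decision over backbone features, below is a minimal, hypothetical sketch: class-conditional diagonal Gaussians are fit on a handful of labeled feature vectors, the unlabeled pool is pseudo-labeled by maximum posterior, and the densities are refit on all data. The synthetic "features," the single-Gaussian-per-class simplification, and all function names are illustrative assumptions, not the paper's actual ViTSGMM/SemiOccam implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for backbone features: two classes, mostly unlabeled.
n_lab, n_unl, d = 5, 200, 8
means = np.array([[-2.0] * d, [2.0] * d])
X_lab = np.vstack([means[c] + rng.normal(size=(n_lab, d)) for c in (0, 1)])
y_lab = np.repeat([0, 1], n_lab)
X_unl = np.vstack([means[c] + rng.normal(size=(n_unl, d)) for c in (0, 1)])
y_unl_true = np.repeat([0, 1], n_unl)

def fit_gaussians(X, y, n_classes=2, eps=1e-3):
    """Per-class diagonal Gaussians (a one-component mixture per class)."""
    mu = np.stack([X[y == c].mean(0) for c in range(n_classes)])
    var = np.stack([X[y == c].var(0) + eps for c in range(n_classes)])
    return mu, var

def log_lik(X, mu, var):
    """log N(x | mu_c, diag(var_c)) for every sample x and class c."""
    diff = X[:, None, :] - mu[None]
    return -0.5 * (np.log(2 * np.pi * var)[None] + diff**2 / var[None]).sum(-1)

# 1) Fit densities on the few labeled samples.
mu, var = fit_gaussians(X_lab, y_lab)
# 2) Pseudo-label the unlabeled pool, then refit on labeled + pseudo-labeled data.
pseudo = log_lik(X_unl, mu, var).argmax(1)
mu, var = fit_gaussians(np.vstack([X_lab, X_unl]),
                        np.concatenate([y_lab, pseudo]))
acc = (log_lik(X_unl, mu, var).argmax(1) == y_unl_true).mean()
print(f"unlabeled accuracy: {acc:.2f}")
```

On this well-separated toy data the refit classifier recovers the unlabeled labels almost perfectly; the real method additionally shapes the feature space via the mutual-information objective described above.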
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semi-Supervised Image Classification | CIFAR-10, 40 Labels | Percentage error | 3.51 | SemiOccam |
| Semi-Supervised Image Classification | CIFAR-10, 250 Labels | Percentage error | 3.47 | SemiOccam |
| Semi-Supervised Image Classification | CIFAR-100, 400 Labels | Percentage error | 26.59 | SemiOccam |
| Semi-Supervised Image Classification | CIFAR-100, 2500 Labels | Percentage error | 22.19 | SemiOccam |
| Semi-Supervised Image Classification | STL-10, 40 Labels | Accuracy | 95.43 | SemiOccam |