Twin Contrastive Learning for Online Clustering

Yunfan Li, Mouxing Yang, Dezhong Peng, Taihao Li, Jiantao Huang, Xi Peng

2022-10-21

Short Text Clustering, Deep Clustering, Online Clustering, Image Clustering, Clustering, Contrastive Learning

Abstract

This paper proposes to perform online clustering by conducting twin contrastive learning (TCL) at the instance and cluster level. Specifically, we find that when the data is projected into a feature space whose dimensionality equals the target cluster number, the rows and columns of its feature matrix correspond to the instance and cluster representations, respectively. Based on this observation, for a given dataset, the proposed TCL first constructs positive and negative pairs through data augmentations. Thereafter, in the row and column space of the feature matrix, instance- and cluster-level contrastive learning are respectively conducted by pulling together positive pairs while pushing apart the negatives. To alleviate the influence of intrinsic false-negative pairs and rectify cluster assignments, we adopt a confidence-based criterion to select pseudo-labels for boosting both the instance- and cluster-level contrastive learning. As a result, the clustering performance is further improved. Besides the elegant idea of twin contrastive learning, another advantage of TCL is that it can independently predict the cluster assignment for each instance, thus effortlessly fitting online scenarios. Extensive experiments on six widely-used image and text benchmarks demonstrate the effectiveness of TCL. The code will be released on GitHub.
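To make the row/column idea concrete, the following is a minimal NumPy sketch, not the authors' released code. It assumes a single n-by-k soft assignment matrix per augmented view and applies an InfoNCE-style loss once to its rows (instances) and once to its columns (clusters). The function names, the temperatures, and the fixed confidence threshold are hypothetical choices made for illustration; the abstract does not specify them.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def info_nce(a, b, temperature):
    # a[i] and b[i] form a positive pair; every other vector is a negative.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    z = np.concatenate([a, b], axis=0)        # (2n, d)
    sim = z @ z.T / temperature               # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)            # a vector is not its own negative
    n = a.shape[0]
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - m - np.log(np.exp(sim - m).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()

def twin_contrastive_loss(logits_a, logits_b, t_instance=0.5, t_cluster=1.0):
    # Assumed temperatures; both views share one (n, k) feature matrix shape.
    p_a = softmax(logits_a, axis=1)
    p_b = softmax(logits_b, axis=1)
    instance_loss = info_nce(p_a, p_b, t_instance)     # contrast rows
    cluster_loss = info_nce(p_a.T, p_b.T, t_cluster)   # contrast columns
    return instance_loss + cluster_loss

def confident_pseudo_labels(probs, threshold=0.9):
    # Assumed criterion: keep predictions whose top probability passes a threshold.
    labels = probs.argmax(axis=1)
    mask = probs.max(axis=1) >= threshold
    return labels, mask

def predict_online(logits):
    # Each instance is assigned independently, so this works on a single sample.
    return softmax(logits, axis=-1).argmax(axis=-1)

rng = np.random.default_rng(0)
logits_a = rng.normal(size=(8, 3))                      # view 1: 8 samples, 3 clusters
logits_b = logits_a + 0.1 * rng.normal(size=(8, 3))     # view 2: perturbed copy
loss = twin_contrastive_loss(logits_a, logits_b)
```

Because `predict_online` needs only the logits of the instance at hand, new samples can be assigned to clusters one at a time without revisiting the rest of the dataset, which is the property the abstract highlights for online scenarios.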

Results

Task	Dataset	Metric	Value	Model
Text Clustering	Stackoverflow	Accuracy	0.882	TCL
Text Clustering	Stackoverflow	NMI	0.786	TCL
Text Clustering	Biomedical	Accuracy	0.498	TCL
Text Clustering	Biomedical	NMI	0.429	TCL
Image Clustering	ImageNet-10	ARI	0.837	TCL
Image Clustering	ImageNet-10	Accuracy	0.895	TCL
Image Clustering	ImageNet-10	NMI	0.875	TCL
Image Clustering	CIFAR-10	ARI	0.78	TCL
Image Clustering	CIFAR-10	Accuracy	0.887	TCL
Image Clustering	CIFAR-10	NMI	0.819	TCL
Image Clustering	CIFAR-100	ARI	0.357	TCL
Image Clustering	CIFAR-100	Accuracy	0.531	TCL
Image Clustering	CIFAR-100	NMI	0.529	TCL
Image Clustering	STL-10	ARI	0.757	TCL
Image Clustering	STL-10	Accuracy	0.868	TCL
Image Clustering	STL-10	NMI	0.799	TCL
Image Clustering	Imagenet-dog-15	ARI	0.516	TCL
Image Clustering	Imagenet-dog-15	Accuracy	0.644	TCL
Image Clustering	Imagenet-dog-15	NMI	0.623	TCL
