


Hard Regularization to Prevent Deep Online Clustering Collapse without Data Augmentation

Louis Mahon, Thomas Lukasiewicz

2023-03-29
Tags: Deep Clustering, Online Clustering, Data Augmentation, Human Activity Recognition, Clustering, Activity Recognition

Abstract

Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed. While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution, in which the encoder maps all inputs to the same point and all are put into a single cluster. Successful existing models have employed various techniques to avoid this problem, most of which either require data augmentation or aim to make the average soft assignment across the dataset the same for each cluster. We propose a method that does not require data augmentation and that, unlike existing methods, regularizes the hard assignments. Using a Bayesian framework, we derive an intuitive optimization objective that can be straightforwardly included in the training of the encoder network. Tested on four image datasets and one human-activity recognition dataset, it consistently avoids collapse more robustly than other methods and leads to more accurate clustering. We also conduct further experiments and analyses justifying our choice to regularize the hard cluster assignments. Code is available at https://github.com/Lou1sM/online_hard_clustering.
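
The distinction between soft and hard assignments can be illustrated with a small toy example. The sketch below is not the authors' method or code. It only shows one possible intuition for why balancing the average soft assignment may not rule out collapse: every point can have a nearly uniform soft assignment while the argmax (hard) assignment still puts all points in one cluster. The function names (`hard_assignments`, `mean_soft_assignment`, `hard_usage`, `normalized_entropy`) and the toy batch are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def hard_assignments(soft):
    """Return the argmax cluster index for each row of soft assignments."""
    return [max(range(len(row)), key=row.__getitem__) for row in soft]


def mean_soft_assignment(soft):
    """Average the soft assignment vectors over the batch."""
    n, k = len(soft), len(soft[0])
    return [sum(row[j] for row in soft) / n for j in range(k)]


def hard_usage(soft):
    """Fraction of points whose hard (argmax) label is each cluster."""
    n, k = len(soft), len(soft[0])
    counts = Counter(hard_assignments(soft))
    return [counts.get(j, 0) / n for j in range(k)]


def normalized_entropy(probs):
    """Entropy divided by log(k): 1.0 is perfectly balanced, 0.0 is collapsed."""
    h = sum(-p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))


# Hypothetical toy batch: six points, three clusters, near-uniform soft scores.
soft = [[0.34, 0.33, 0.33] for _ in range(6)]

soft_balance = normalized_entropy(mean_soft_assignment(soft))
hard_balance = normalized_entropy(hard_usage(soft))

print("balance of mean soft assignment:", round(soft_balance, 4))
print("balance of hard assignments:", hard_balance)
```

In this toy batch the mean soft assignment looks almost perfectly balanced, yet every point receives hard label 0, so a balance measure computed on hard assignments reports full collapse.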

Results

Task              Dataset  Metric      Value  Model
Image Clustering  cifar10  online ACC  21.7   OHC
Image Clustering  cifar10  online ARI  5.4    OHC
Image Clustering  cifar10  online NMI  10.5   OHC
