Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

CounTR: Transformer-based Generalised Visual Counting

Chang Liu, Yujie Zhong, Andrew Zisserman, Weidi Xie

2022-08-29 | Self-Supervised Learning | Object Counting | Exemplar-Free Counting

Abstract

In this paper, we consider the problem of generalised visual object counting, with the goal of developing a computational model that counts objects from arbitrary semantic categories using an arbitrary number of "exemplars", i.e. zero-shot or few-shot counting. To this end, we make the following four contributions: (1) we introduce a novel transformer-based architecture for generalised visual object counting, termed Counting Transformer (CounTR), which explicitly captures the similarity between image patches, or between image patches and the given "exemplars", with the attention mechanism; (2) we adopt a two-stage training regime that first pre-trains the model with self-supervised learning and then applies supervised fine-tuning; (3) we propose a simple, scalable pipeline for synthesizing training images that contain a large number of instances, or instances from different semantic categories, explicitly forcing the model to make use of the given "exemplars"; (4) we conduct thorough ablation studies on the large-scale counting benchmark FSC-147 and demonstrate state-of-the-art performance in both zero-shot and few-shot settings.
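To make contribution (1) more concrete, here is a minimal sketch of scaled dot-product cross-attention, in which image-patch features act as queries and exemplar features act as keys and values. This is an illustration only and not the CounTR implementation: the function names, the toy 2-dimensional features, and the absence of learned projections, multiple heads, and positional encodings are all simplifying assumptions.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query attends over all keys.

    Returns (outputs, weights). weights[i][j] is how strongly query i
    (an image patch) attends to key j (an exemplar).
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this patch to each exemplar, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the exemplar value vectors.
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


# Hypothetical toy features: three image patches and two exemplars.
patches = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
exemplars = [[1.0, 0.0], [0.0, 1.0]]

outputs, weights = cross_attention(patches, exemplars, exemplars)
```

In this toy setup, each patch places more attention weight on the exemplar it resembles, which is the sense in which attention "captures similarity". How the actual model handles the zero-shot case, where no exemplars are given, is not shown here.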

Results

Task             Dataset  Metric       Value   Model
Object Counting  FSC147   MAE (test)   11.95   CounTR
Object Counting  FSC147   MAE (val)    13.13   CounTR
Object Counting  FSC147   RMSE (test)  91.23   CounTR
Object Counting  FSC147   RMSE (val)   49.83   CounTR
Object Counting  CARPK    MAE          5.75    CounTR
Object Counting  CARPK    RMSE         7.45    CounTR
Object Counting  FSC147   MAE (test)   14.71   CounTR
Object Counting  FSC147   MAE (val)    18.07   CounTR
Object Counting  FSC147   RMSE (test)  106.87  CounTR
Object Counting  FSC147   RMSE (val)   71.84   CounTR
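
For readers unfamiliar with the two metrics reported above, the following sketch shows the standard definitions of mean absolute error (MAE) and root mean squared error (RMSE) over per-image counts. The function names and the sample counts are hypothetical and chosen only for illustration; this is not the benchmark's evaluation script.

```python
import math


def mae(predicted, actual):
    """Mean absolute error between predicted and ground-truth counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def rmse(predicted, actual):
    """Root mean squared error between predicted and ground-truth counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


# Hypothetical per-image counts, for illustration only.
predicted_counts = [10, 22, 7]
actual_counts = [12, 20, 7]

print(f"MAE:  {mae(predicted_counts, actual_counts):.3f}")
print(f"RMSE: {rmse(predicted_counts, actual_counts):.3f}")
```

Because RMSE squares each error before averaging, it is never smaller than MAE and grows quickly when a few images have very large errors. That is one plausible reason an RMSE can sit far above the corresponding MAE on a dataset with some very dense images.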

Related Papers

A Semi-Supervised Learning Method for the Identification of Bad Exposures in Large Imaging Surveys (2025-07-17)
Self-supervised Learning on Camera Trap Footage Yields a Strong Universal Face Embedder (2025-07-14)
Car Object Counting and Position Estimation via Extension of the CLIP-EBC Framework (2025-07-11)
Speech Quality Assessment Model Based on Mixture of Experts: System-Level Performance Enhancement and Utterance-Level Challenge Analysis (2025-07-08)
World4Drive: End-to-End Autonomous Driving via Intention-aware Physical Latent World Model (2025-07-01)
ShapeEmbed: a self-supervised learning framework for 2D contour quantification (2025-07-01)
RetFiner: A Vision-Language Refinement Scheme for Retinal Foundation Models (2025-06-27)
Boosting Generative Adversarial Transferability with Self-supervised Vision Transformer Features (2025-06-26)