Counting Everyday Objects in Everyday Scenes

Prithvijit Chattopadhyay, Ramakrishna Vedantam, Ramprasaath R. Selvaraju, Dhruv Batra, Devi Parikh

2016-04-12 | CVPR 2017 7 | Question Answering, Object Counting, Visual Question Answering (VQA), object-detection, Object Detection, Visual Question Answering

Abstract

We are interested in counting the number of instances of object classes in natural, everyday images. Previous counting approaches tackle the problem in restricted domains, such as counting pedestrians in surveillance videos. Counts can also be estimated from the outputs of other vision tasks, such as object detection. In this work, we build dedicated counting models designed to handle the large variance in counts, appearances, and scales of objects found in natural scenes. Our approach is inspired by the phenomenon of subitizing: the ability of humans to make quick assessments of counts given a perceptual signal, for small count values. Given a natural scene, we employ a divide-and-conquer strategy while incorporating context across the scene to adapt the subitizing idea to counting. Our approach offers consistent improvements over numerous baseline approaches for counting on the PASCAL VOC 2007 and COCO datasets. Subsequently, we study how counting can be used to improve object detection. We then show a proof-of-concept application of our counting methods to the task of Visual Question Answering, by studying the "how many?" questions in the VQA and COCO-QA datasets.
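
The divide-and-conquer idea in the abstract can be illustrated with a minimal sketch: split the image into a grid of cells, estimate a small count in each cell, and sum the per-cell estimates. Everything below is an assumption made for illustration and is not the authors' implementation: the function names, the 3x3 default grid, and the toy per-cell counter (which simply sums a binary mask) are hypothetical, and the sketch omits the cross-cell context that the abstract says the actual approach incorporates.

```python
import numpy as np


def split_into_cells(image, grid=3):
    """Split an H x W (x C) array into grid x grid non-overlapping cells."""
    h, w = image.shape[:2]
    # Integer cell boundaries that always cover the full image.
    row_edges = np.linspace(0, h, grid + 1).astype(int)
    col_edges = np.linspace(0, w, grid + 1).astype(int)
    return [
        image[row_edges[i]:row_edges[i + 1], col_edges[j]:col_edges[j + 1]]
        for i in range(grid)
        for j in range(grid)
    ]


def count_by_cells(image, cell_counter, grid=3):
    """Estimate a global count as the sum of per-cell count estimates."""
    cell_counts = [float(cell_counter(cell)) for cell in split_into_cells(image, grid)]
    return sum(cell_counts), cell_counts


def toy_cell_counter(cell):
    """Hypothetical stand-in for a learned per-cell counter.

    Assumes each object is marked by a single pixel of value 1.
    """
    return cell.sum()


# Toy "image": five objects, each marked by one pixel.
image = np.zeros((10, 10))
for r, c in [(0, 0), (4, 5), (9, 9), (2, 7), (6, 1)]:
    image[r, c] = 1.0

total, per_cell = count_by_cells(image, toy_cell_counter, grid=3)
print(total, per_cell)
```

In a real system the toy counter would be replaced by a learned model that predicts a small count per cell, which is where the subitizing analogy applies: each cell only needs to handle small count values.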

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Object Counting | PASCAL VOC | mRMSE | 0.42 | CEOES |
| Object Counting | Pascal VOC 2007 count-test | m-relRMSE-nz | 0.65 | ens |
| Object Counting | Pascal VOC 2007 count-test | m-relRMSE | 0.2 | ens |
| Object Counting | Pascal VOC 2007 count-test | mRMSE | 0.42 | ens |
| Object Counting | Pascal VOC 2007 count-test | mRMSE-nz | 1.68 | ens |
| Object Counting | Pascal VOC 2007 count-test | m-relRMSE-nz | 0.68 | Seq-sub-ft-3x3 |
| Object Counting | Pascal VOC 2007 count-test | m-relRMSE | 0.22 | Seq-sub-ft-3x3 |
| Object Counting | Pascal VOC 2007 count-test | mRMSE | 0.43 | Seq-sub-ft-3x3 |
| Object Counting | Pascal VOC 2007 count-test | mRMSE-nz | 1.65 | Seq-sub-ft-3x3 |
| Object Counting | Pascal VOC 2007 count-test | m-relRMSE-nz | 0.73 | glance-noft-2L |
| Object Counting | Pascal VOC 2007 count-test | m-relRMSE | 0.27 | glance-noft-2L |
| Object Counting | Pascal VOC 2007 count-test | mRMSE | 0.5 | glance-noft-2L |
| Object Counting | Pascal VOC 2007 count-test | mRMSE-nz | 1.83 | glance-noft-2L |
| Object Counting | Pascal VOC 2007 count-test | m-relRMSE-nz | 0.85 | Fast-RCNN |
| Object Counting | Pascal VOC 2007 count-test | m-relRMSE | 0.26 | Fast-RCNN |
| Object Counting | Pascal VOC 2007 count-test | mRMSE | 0.5 | Fast-RCNN |
| Object Counting | Pascal VOC 2007 count-test | mRMSE-nz | 1.92 | Fast-RCNN |
| Object Counting | COCO count-test | m-relRMSE | 0.18 | ens |
| Object Counting | COCO count-test | m-relRMSE-nz | 0.81 | ens |
| Object Counting | COCO count-test | mRMSE | 0.36 | ens |
| Object Counting | COCO count-test | mRMSE-nz | 1.98 | ens |
| Object Counting | COCO count-test | m-relRMSE | 0.18 | Seq-sub-ft-3x3 |
| Object Counting | COCO count-test | m-relRMSE-nz | 0.82 | Seq-sub-ft-3x3 |
| Object Counting | COCO count-test | mRMSE | 0.35 | Seq-sub-ft-3x3 |
| Object Counting | COCO count-test | mRMSE-nz | 1.96 | Seq-sub-ft-3x3 |
| Object Counting | COCO count-test | m-relRMSE | 0.2 | Fast-RCNN |
| Object Counting | COCO count-test | m-relRMSE-nz | 1.13 | Fast-RCNN |
| Object Counting | COCO count-test | mRMSE | 0.49 | Fast-RCNN |
| Object Counting | COCO count-test | mRMSE-nz | 2.78 | Fast-RCNN |
| Object Counting | COCO count-test | m-relRMSE | 0.23 | glance-ft-2L |
| Object Counting | COCO count-test | m-relRMSE-nz | 0.91 | glance-ft-2L |
| Object Counting | COCO count-test | mRMSE | 0.42 | glance-ft-2L |
| Object Counting | COCO count-test | mRMSE-nz | 2.25 | glance-ft-2L |
| Object Counting | COCO count-test | m-relRMSE | 0.24 | Aso-sub-ft-3x3 |
| Object Counting | COCO count-test | m-relRMSE-nz | 0.87 | Aso-sub-ft-3x3 |
| Object Counting | COCO count-test | mRMSE | 0.38 | Aso-sub-ft-3x3 |
| Object Counting | COCO count-test | mRMSE-nz | 2.08 | Aso-sub-ft-3x3 |
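
This page does not define the four metrics reported in the results above. As a reading aid, here is a minimal sketch of how they are commonly understood: per-category RMSE averaged over categories (mRMSE), a relative variant that divides each squared error by the ground-truth count plus one (m-relRMSE), and "-nz" variants restricted to entries whose ground-truth count is non-zero. These definitions are an assumption based on the paper as usually summarized, not something stated here, and the function name `counting_errors` is hypothetical.

```python
import numpy as np


def counting_errors(gt, pred):
    """Compute counting metrics from (num_images, num_categories) count arrays.

    Assumed definitions (not given on this page):
      RMSE_k    = sqrt(mean_i (gt_ik - pred_ik)^2)
      relRMSE_k = sqrt(mean_i (gt_ik - pred_ik)^2 / (gt_ik + 1))
    The "m-" prefix averages over categories k; "-nz" keeps only gt_ik > 0.
    """
    gt = np.asarray(gt, dtype=float)
    pred = np.asarray(pred, dtype=float)
    sq_err = (gt - pred) ** 2
    rel_sq_err = sq_err / (gt + 1.0)
    nonzero = gt > 0

    def mean_over_categories(err, mask=None):
        per_category = []
        for k in range(err.shape[1]):
            column = err[:, k] if mask is None else err[mask[:, k], k]
            if column.size:  # skip categories with no selected entries
                per_category.append(np.sqrt(column.mean()))
        return float(np.mean(per_category)) if per_category else float("nan")

    return {
        "mRMSE": mean_over_categories(sq_err),
        "mRMSE-nz": mean_over_categories(sq_err, nonzero),
        "m-relRMSE": mean_over_categories(rel_sq_err),
        "m-relRMSE-nz": mean_over_categories(rel_sq_err, nonzero),
    }


# Toy example: 2 images, 2 categories.
gt = [[0, 2], [1, 0]]
pred = [[0, 1], [3, 0]]
print(counting_errors(gt, pred))
```

Under these assumed definitions, the "-nz" variants ignore the many image-category pairs where the true count is zero, which is one plausible reason the "-nz" values in the table are consistently larger than their unrestricted counterparts.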

Related Papers

From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)