Michael Hobley, Victor Prisacariu
Current class-agnostic counting methods can generalise to unseen classes but usually require reference images to define the type of object to be counted, as well as instance annotations during training. Reference-less class-agnostic counting is an emerging field that identifies counting as, at its core, a repetition-recognition task. Such methods facilitate counting on sets whose composition changes. We show that a general feature space with global context can enumerate instances in an image without a prior on the object type present. Specifically, we demonstrate that regression from vision transformer features, without point-level supervision or reference images, is superior to other reference-less methods and is competitive with methods that use reference images. We show this on the current standard few-shot counting dataset, FSC-147. We also propose an improved dataset, FSC-133, which removes errors, ambiguities, and repeated images from FSC-147, and demonstrate similar performance on it. To the best of our knowledge, ours is the first weakly-supervised reference-less class-agnostic counting method.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Object Counting | FSC147 | MAE(test) | 17.12 | RCC |
| Object Counting | FSC147 | MAE(val) | 17.49 | RCC |
| Object Counting | FSC147 | RMSE(test) | 104.53 | RCC |
| Object Counting | FSC147 | RMSE(val) | 58.81 | RCC |
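
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how MAE and RMSE are conventionally computed over per-image counts. It assumes one predicted count and one ground-truth count per image; the function names and the toy numbers are illustrative and are not taken from the paper or the dataset.

```python
import math


def mae(predicted, actual):
    """Mean absolute error between predicted and ground-truth counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def rmse(predicted, actual):
    """Root mean squared error between predicted and ground-truth counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


# Toy data (hypothetical): three small errors and one large miss.
predicted = [10.0, 52.0, 7.0, 300.0]
actual = [12, 50, 7, 400]

print("MAE:", mae(predicted, actual))
print("RMSE:", rmse(predicted, actual))
```

Because RMSE squares each error before averaging, a few images with very large miscounts raise it far more than they raise MAE. This is one plausible reading of the gap between the test MAE (17.12) and test RMSE (104.53) reported above, though the table alone does not confirm it.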