
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Poses of People in Art: A Data Set for Human Pose Estimation in Digital Art History

Stefanie Schneider, Ricarda Vollmer

2023-01-12
Tasks: 2D Human Pose Estimation, Pose Estimation, Multi-Person Pose Estimation, 2D Object Detection, Object Detection

Abstract

Throughout the history of art, the pose, as the holistic abstraction of the human body's expression, has proven to be a constant in numerous studies. However, due to the enormous amount of data that has so far had to be processed by hand, its crucial role in the formulaic recapitulation of art-historical motifs since antiquity could only be highlighted selectively. This is true even for the now automated estimation of human poses, as the domain-specific, sufficiently large data sets required for training computational models are either not publicly available or not indexed at a fine enough granularity. With the Poses of People in Art data set, we introduce the first openly licensed data set for estimating human poses in art and validating human pose estimators. It consists of 2,454 images from 22 art-historical depiction styles, including those that have increasingly turned away from lifelike representations of the body since the 19th century. A total of 10,749 human figures are precisely enclosed by rectangular bounding boxes, with a maximum of four per image labeled by up to 17 keypoints; among these are mainly joints such as elbows and knees. For machine learning purposes, the data set is divided into three subsets (training, validation, and testing), each of which follows the established JSON-based Microsoft COCO format. Each image annotation, in addition to mandatory fields, provides metadata from the art-historical online encyclopedia WikiArt. With this paper, we elaborate on the acquisition and constitution of the data set, address various application scenarios, and discuss prospects for a digitally supported art history. We show that the data set enables the investigation of body phenomena in art, whether at the level of individual figures, which can be captured in their subtleties, or entire figure constellations, whose position, distance, or proximity to one another is considered.
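The abstract states that figures are annotated with bounding boxes and up to 17 keypoints in the COCO format. As a rough illustration of what reading such an annotation might look like, the sketch below assumes the standard COCO conventions: `bbox` stored as `[x, y, width, height]` and `keypoints` stored as a flat list of `(x, y, visibility)` triplets in the usual COCO person keypoint order. The sample annotation is invented for illustration and is not taken from the data set; whether the released files use exactly these keypoint names is an assumption.

```python
# Standard COCO person keypoint order (assumed here, not confirmed by the source).
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def labeled_keypoints(annotation):
    """Return {name: (x, y)} for keypoints whose visibility flag is above 0.

    In COCO, visibility 0 means "not labeled", 1 means "labeled but not
    visible", and 2 means "labeled and visible".
    """
    flat = annotation["keypoints"]
    result = {}
    for i, name in enumerate(COCO_KEYPOINTS):
        x, y, v = flat[3 * i: 3 * i + 3]
        if v > 0:
            result[name] = (x, y)
    return result


def bbox_corners(annotation):
    """Convert a COCO [x, y, width, height] box to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = annotation["bbox"]
    return (x, y, x + w, y + h)


# Invented example: one figure with only two labeled keypoints.
keypoints = [0] * (3 * len(COCO_KEYPOINTS))
keypoints[3 * 7: 3 * 7 + 3] = [120, 200, 2]    # left_elbow, visible
keypoints[3 * 13: 3 * 13 + 3] = [130, 340, 1]  # left_knee, labeled but occluded

sample_annotation = {
    "id": 10,
    "image_id": 1,
    "category_id": 1,
    "bbox": [100.0, 50.0, 80.0, 320.0],
    "num_keypoints": 2,
    "keypoints": keypoints,
}

print(labeled_keypoints(sample_annotation))
print(bbox_corners(sample_annotation))
```

In practice the annotation dictionaries would come from loading one of the three subset files with `json.load` and iterating over its `annotations` list.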

Results

Dataset: PeopleArt. The same values are listed under each of the tasks Object Detection, 3D, 2D Classification, 2D Object Detection, and 16k.

| Model | mAP | mAP@0.5 | mAP@0.75 |
|---|---|---|---|
| PVT (Pyramid Vision Transformer; trained on PeopleArt and PoPArt) | 49.7 | 80.5 | 51.8 |
| TOOD (Task-aligned One-stage Object Detection; trained on PeopleArt and PoPArt) | 47.8 | 78 | 49.9 |
| PVT (Pyramid Vision Transformer) | 46.5 | 76 | 48.4 |
| TOOD (Task-aligned One-stage Object Detection) | 46.1 | 75 | 49 |
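The mAP@0.5 and mAP@0.75 metrics in the results refer to intersection-over-union (IoU) thresholds: a predicted box counts as a match only if its IoU with a ground-truth box reaches the threshold. The sketch below illustrates that thresholding step for a single pair of boxes, assuming COCO-style `[x, y, width, height]` boxes. It is not the evaluation code used for the reported numbers, and the helper names and example boxes are made up. The plain mAP figure is presumably the COCO-style average over IoU thresholds from 0.5 to 0.95, but the page does not say so.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh

    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


def is_match(predicted, ground_truth, threshold):
    """True if the predicted box overlaps the ground truth at or above the threshold."""
    return iou(predicted, ground_truth) >= threshold


# Invented boxes: the prediction is shifted 2 pixels to the right.
ground_truth = [0.0, 0.0, 10.0, 10.0]
predicted = [2.0, 0.0, 10.0, 10.0]

print(round(iou(predicted, ground_truth), 3))   # overlap 80, union 120
print(is_match(predicted, ground_truth, 0.5))   # counts at the 0.5 threshold
print(is_match(predicted, ground_truth, 0.75))  # rejected at the 0.75 threshold
```

The same detection can therefore contribute to mAP@0.5 but not to mAP@0.75, which is one reason the stricter metric is consistently lower in the table.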

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)