Poses of People in Art: A Data Set for Human Pose Estimation in Digital Art History

Stefanie Schneider, Ricarda Vollmer

2023-01-12

2D Human Pose Estimation, Pose Estimation, Multi-Person Pose Estimation, 2D Object Detection, Object Detection

Abstract

Throughout the history of art, the pose, as the holistic abstraction of the human body's expression, has proven to be a constant in numerous studies. However, due to the enormous amount of data that has so far had to be processed by hand, its crucial role in the formulaic recapitulation of art-historical motifs since antiquity could only be highlighted selectively. This is true even for the now automated estimation of human poses, as the domain-specific, sufficiently large data sets required for training computational models are either not publicly available or not indexed at a fine enough granularity. With the Poses of People in Art data set, we introduce the first openly licensed data set for estimating human poses in art and validating human pose estimators. It consists of 2,454 images from 22 art-historical depiction styles, including those that have increasingly turned away from lifelike representations of the body since the 19th century. A total of 10,749 human figures are precisely enclosed by rectangular bounding boxes, with a maximum of four per image labeled by up to 17 keypoints; among these are mainly joints such as elbows and knees. For machine learning purposes, the data set is divided into three subsets (training, validation, and testing), each of which follows the established JSON-based Microsoft COCO format. Each image annotation, in addition to mandatory fields, provides metadata from the art-historical online encyclopedia WikiArt. In this paper, we elaborate on the acquisition and constitution of the data set, address various application scenarios, and discuss prospects for a digitally supported art history. We show that the data set enables the investigation of body phenomena in art, whether at the level of individual figures, which can be captured in their subtleties, or entire figure constellations, whose position, distance, or proximity to one another is considered.
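To make the annotation layout described in the abstract more concrete, the following is a minimal sketch of how a COCO-style keypoint annotation could be read. It assumes that the data set uses the 17 keypoint names of the standard COCO person category and the usual flat `[x, y, v, ...]` keypoint encoding (v = 0: not labeled, 1: labeled but not visible, 2: labeled and visible); the abstract states only that the COCO format is followed, so both points are assumptions. The `sample` record, its file name, and all coordinates are hypothetical and not taken from the data set.

```python
import json

# Keypoint names of the standard COCO person category (assumed to apply here).
COCO_KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def labeled_keypoints(annotation):
    """Return {name: (x, y, v)} for every keypoint with visibility flag v > 0."""
    flat = annotation.get("keypoints", [])
    result = {}
    for i, name in enumerate(COCO_KEYPOINT_NAMES):
        triplet = flat[3 * i:3 * i + 3]
        if len(triplet) < 3:
            break  # figure has a bounding box only, or a truncated keypoint list
        x, y, v = triplet
        if v > 0:
            result[name] = (x, y, v)
    return result


def figures_per_image(coco):
    """Count annotated figures (bounding boxes) per image id."""
    counts = {}
    for ann in coco["annotations"]:
        counts[ann["image_id"]] = counts.get(ann["image_id"], 0) + 1
    return counts


# Hypothetical example record: one image with two figures, one of which
# carries keypoints. All values are made up for illustration.
keypoints = [0] * (3 * len(COCO_KEYPOINT_NAMES))
keypoints[0:3] = [120, 80, 2]      # nose, labeled and visible
keypoints[21:24] = [100, 150, 1]   # left_elbow (index 7), labeled but occluded

sample = json.loads(json.dumps({
    "images": [{"id": 1, "file_name": "example.jpg"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1,
         "bbox": [90, 60, 80, 200], "keypoints": keypoints, "num_keypoints": 2},
        {"id": 11, "image_id": 1, "category_id": 1,
         "bbox": [200, 50, 70, 210]},
    ],
}))

print(labeled_keypoints(sample["annotations"][0]))
print(figures_per_image(sample))
```

In practice, the dictionary would come from `json.load` on one of the three subset files rather than from an inline record. Figures without a `keypoints` field yield an empty result, which matches the abstract's statement that only some figures per image carry keypoints.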

Results

Task	Dataset	Metric	Value	Model
Object Detection	PeopleArt	mAP	49.7	PVT (Pyramid Vision Transformer; trained on PeopleArt and PoPArt)
Object Detection	PeopleArt	mAP@0.5	80.5	PVT (Pyramid Vision Transformer; trained on PeopleArt and PoPArt)
Object Detection	PeopleArt	mAP@0.75	51.8	PVT (Pyramid Vision Transformer; trained on PeopleArt and PoPArt)
Object Detection	PeopleArt	mAP	47.8	TOOD (Task-aligned One-stage Object Detection; trained on PeopleArt and PoPArt)
Object Detection	PeopleArt	mAP@0.5	78	TOOD (Task-aligned One-stage Object Detection; trained on PeopleArt and PoPArt)
Object Detection	PeopleArt	mAP@0.75	49.9	TOOD (Task-aligned One-stage Object Detection; trained on PeopleArt and PoPArt)
Object Detection	PeopleArt	mAP	46.5	PVT (Pyramid Vision Transformer)
Object Detection	PeopleArt	mAP@0.5	76	PVT (Pyramid Vision Transformer)
Object Detection	PeopleArt	mAP@0.75	48.4	PVT (Pyramid Vision Transformer)
Object Detection	PeopleArt	mAP	46.1	TOOD (Task-aligned One-stage Object Detection)
Object Detection	PeopleArt	mAP@0.5	75	TOOD (Task-aligned One-stage Object Detection)
Object Detection	PeopleArt	mAP@0.75	49	TOOD (Task-aligned One-stage Object Detection)
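The suffixes in the Metric column refer to intersection-over-union (IoU) thresholds: under mAP@0.5, a predicted box counts as a match if its IoU with a ground-truth box is at least 0.5, and under mAP@0.75 if it is at least 0.75. In the COCO evaluation convention, which the table presumably follows, plain mAP averages over IoU thresholds from 0.5 to 0.95 in steps of 0.05. The sketch below only illustrates the IoU criterion for COCO-style `[x, y, width, height]` boxes; it is not the evaluation code behind the reported values, and the two boxes are hypothetical.

```python
def iou_xywh(box_a, box_b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# Hypothetical ground-truth and predicted boxes for one figure.
ground_truth = [0, 0, 100, 100]
prediction = [0, 0, 100, 60]

score = iou_xywh(ground_truth, prediction)
matches_at_050 = score >= 0.5    # counted as a hit under mAP@0.5
matches_at_075 = score >= 0.75   # counted as a hit under mAP@0.75
print(score, matches_at_050, matches_at_075)
```

A prediction like this one, which covers only part of the figure, is accepted at the looser threshold but rejected at the stricter one. This is why the mAP@0.75 values in the table are markedly lower than the mAP@0.5 values for the same model.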
