Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Assembly101

Videos
Introduced 2022-03-28

Assembly101 is a new procedural activity dataset featuring 4321 videos of people assembling and disassembling 101 "take-apart" toy vehicles. Participants work without fixed instructions, and the sequences feature rich and natural variations in action ordering, mistakes, and corrections. Assembly101 is the first multi-view action dataset, with simultaneous static (8) and egocentric (4) recordings. Sequences are annotated with more than 100K coarse and 1M fine-grained action segments, and 18M 3D hand poses. We benchmark on three action understanding tasks: recognition, anticipation and temporal segmentation. Additionally, we propose a novel task of detecting mistakes. The unique recording format and rich set of annotations allow us to investigate generalization to new toys, cross-view transfer, long-tailed distributions, and pose vs. appearance. We envision that Assembly101 will serve as a new challenge to investigate various activity understanding problems.

Image Source: https://assembly-101.github.io/

Benchmarks

2D Human Pose Estimation: Verbs Recall@5, Objects Recall@5, Actions Recall@5
3D Action Recognition: Actions Top-1, Verbs Top-1, Object Top-1
Action Anticipation: Verbs Recall@5, Objects Recall@5, Actions Recall@5
Action Localization: Actions Top-1, Verbs Top-1, Object Top-1, F1@10%, F1@25%, F1@50%, Edit, MoF
Action Recognition: Verbs Recall@5, Objects Recall@5, Actions Recall@5, Actions Top-1, Verbs Top-1, Object Top-1, HM
Action Recognition In Videos: Verbs Recall@5, Objects Recall@5, Actions Recall@5
Action Segmentation: F1@10%, F1@25%, F1@50%, Edit, MoF
Activity Recognition: Verbs Recall@5, Objects Recall@5, Actions Recall@5, Actions Top-1, Verbs Top-1, Object Top-1, HM
Temporal Action Localization: Actions Top-1, Verbs Top-1, Object Top-1
Video: Actions Top-1, Verbs Top-1, Object Top-1
Zero-Shot Learning: Actions Top-1, Verbs Top-1, Object Top-1
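The segmentation metrics named above (MoF and F1@10%/25%/50%) can be illustrated with a minimal Python sketch. It follows the definitions commonly used in the temporal action segmentation literature: MoF as frame-wise accuracy, and F1@k as a segmental F1 where a predicted segment counts as correct when its IoU with an unmatched ground-truth segment of the same label reaches the threshold. This page does not specify the evaluation code behind the benchmarks, so the function names (`mof`, `segments`, `f1_at_k`) and the toy label sequences are assumptions for illustration only, not the official Assembly101 protocol.

```python
from typing import List, Sequence, Tuple


def mof(pred: Sequence[int], gt: Sequence[int]) -> float:
    """Mean over Frames: fraction of frames whose predicted label matches."""
    assert len(pred) == len(gt) and len(gt) > 0
    return sum(p == g for p, g in zip(pred, gt)) / len(gt)


def segments(labels: Sequence[int]) -> List[Tuple[int, int, int]]:
    """Collapse frame labels into (label, start, end) runs, end exclusive."""
    segs = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segs.append((labels[start], start, i))
            start = i
    return segs


def f1_at_k(pred: Sequence[int], gt: Sequence[int], overlap: float) -> float:
    """Segmental F1 at an IoU threshold (e.g. 0.10, 0.25, 0.50)."""
    p_segs, g_segs = segments(pred), segments(gt)
    used = [False] * len(g_segs)
    tp = fp = 0
    for label, ps, pe in p_segs:
        best_iou, best_j = 0.0, -1
        for j, (g_label, gs, ge) in enumerate(g_segs):
            if g_label != label:
                continue
            inter = max(0, min(pe, ge) - max(ps, gs))
            union = max(pe, ge) - min(ps, gs)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_j = iou, j
        # A prediction is a true positive only if it clears the threshold
        # and its best-matching ground-truth segment is still unmatched.
        if best_j >= 0 and best_iou >= overlap and not used[best_j]:
            tp += 1
            used[best_j] = True
        else:
            fp += 1
    fn = len(g_segs) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up labels, not Assembly101 data).
gt_labels = [0, 0, 0, 1, 1, 1, 2, 2]
pred_labels = [0, 0, 1, 1, 1, 1, 2, 2]
print(mof(pred_labels, gt_labels))            # 0.875
print(f1_at_k(pred_labels, gt_labels, 0.50))  # 1.0
```

In this toy case one frame is mislabeled, so MoF drops while F1@50% stays perfect: segmental metrics tolerate small boundary shifts that frame-wise accuracy penalizes.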

Statistics

Papers: 57
Benchmarks: 48

Links

Homepage

Tasks

2D Human Pose Estimation
3D Action Recognition
Action Anticipation
Action Localization
Action Recognition
Action Recognition In Videos
Action Segmentation
Activity Recognition
Mistake Detection
Open Vocabulary Action Recognition
Temporal Action Localization
Video
Zero-Shot Learning