Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

EPIC-KITCHENS-100

Texts | Videos | CC BY NC 4.0 | Introduced 2020-06-23

This paper introduces the pipeline used to scale EPIC-KITCHENS, the largest dataset in egocentric vision. The effort culminates in EPIC-KITCHENS-100, a collection of 100 hours, 20M frames, and 90K actions in 700 variable-length videos, capturing long-term unscripted activities in 45 environments using head-mounted cameras. Compared to its previous version (EPIC-KITCHENS-55), EPIC-KITCHENS-100 has been annotated using a novel pipeline that allows denser (54% more actions per minute) and more complete (128% more action segments) annotations of fine-grained actions. This collection also enables evaluating the "test of time", i.e. whether models trained on data collected in 2018 can generalise to new footage collected under the same hypotheses, albeit "two years on". The dataset is aligned with 6 challenges: action recognition (full and weak supervision), action detection, action anticipation, cross-modal retrieval (from captions), and unsupervised domain adaptation for action recognition. For each challenge, we define the task and provide baselines and evaluation metrics.

Benchmarks

2D Human Pose Estimation: Recall@5, Top-5 Verb, Top-5 Noun
Action Anticipation: Recall@5, Top-5 Verb, Top-5 Noun
Action Localization: Avg mAP (0.1-0.5), mAP IOU@0.1, mAP IOU@0.2, mAP IOU@0.3, mAP IOU@0.4, mAP IOU@0.5
Action Recognition: Action@1, Verb@1, Noun@1, GFLOPs, Recall@5, Top-5 Verb, Top-5 Noun, HM
Action Recognition In Videos: Recall@5, Top-5 Verb, Top-5 Noun
Activity Recognition: Action@1, Verb@1, Noun@1, GFLOPs, Recall@5, Top-5 Verb, Top-5 Noun, HM
Audio Classification: Top-1 Action, Top-1 Noun, Top-1 Verb, Top-5 Action, Top-5 Noun, Top-5 Verb
Classification: Top-1 Action, Top-1 Noun, Top-1 Verb, Top-5 Action, Top-5 Noun, Top-5 Verb
Domain Adaptation: Average Accuracy
Temporal Action Localization: Avg mAP (0.1-0.5), mAP IOU@0.1, mAP IOU@0.2, mAP IOU@0.3, mAP IOU@0.4, mAP IOU@0.5
Unsupervised Domain Adaptation: Average Accuracy
Video: Avg mAP (0.1-0.5), mAP IOU@0.1, mAP IOU@0.2, mAP IOU@0.3, mAP IOU@0.4, mAP IOU@0.5
Zero-Shot Learning: Avg mAP (0.1-0.5), mAP IOU@0.1, mAP IOU@0.2, mAP IOU@0.3, mAP IOU@0.4, mAP IOU@0.5

Related Benchmarks

EPIC-KITCHENS-100 (test)/2D Human Pose Estimation/recall@5
EPIC-KITCHENS-100 (test)/Action Anticipation/recall@5
EPIC-KITCHENS-100 (test)/Action Recognition/recall@5
EPIC-KITCHENS-100 (test)/Action Recognition In Videos/recall@5
EPIC-KITCHENS-100 (test)/Activity Recognition/recall@5

Statistics

Papers: 162
Benchmarks: 63

Tasks

2D Human Pose Estimation, Action Anticipation, Action Localization, Action Recognition, Action Recognition In Videos, Activity Recognition, Audio Classification, Classification, Domain Adaptation, Multi-Instance Retrieval, Open Vocabulary Action Recognition, Temporal Action Localization, Unsupervised Domain Adaptation, Video, Zero-Shot Learning