Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Ego4D

Videos · Custom

Ego4D is a massive-scale egocentric video dataset and benchmark suite. It offers 3,025 hours of daily life activity video spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 855 unique camera wearers from 74 worldwide locations and 9 different countries. The approach to collection is designed to uphold rigorous privacy and ethics standards with consenting participants and robust de-identification procedures where relevant. Ego4D dramatically expands the volume of diverse egocentric video footage publicly available to the research community. Portions of the video are accompanied by audio, 3D meshes of the environment, eye gaze, stereo, and/or synchronized videos from multiple egocentric cameras at the same event. Furthermore, a host of new benchmark challenges are presented, centered around understanding the first-person visual experience in the past (querying an episodic memory), present (analyzing hand-object manipulation, audio-visual conversation, and social interactions), and future (forecasting activities). By publicly sharing this massive annotated dataset and benchmark suite, the aim is to push the frontier of first-person perception.

Description from: Facebook AI

Paper: Ego4D: Around the World in 3,000 Hours of Egocentric Video

GitHub: https://github.com/EGO4D

Benchmarks

- Future Hand Prediction: Disp (Total), M.Disp (Left), C.Disp (Left), M.Disp (Right), C.Disp (Right)
- Natural Language Queries: R@1 Mean (0.3 and 0.5), R@1 IoU=0.3, R@1 IoU=0.5, R@5 IoU=0.3, R@5 IoU=0.5
- Short-term Object Interaction Anticipation: Overall (Top5 mAP), Noun (Top5 mAP), Noun+Verb (Top5 mAP), Noun+TTC (Top5 mAP)
- State Change Object Detection: AP, AP50, AP75
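The R@k, IoU=m metrics above can be sketched as follows: a query counts as recalled if any of the model's top-k predicted temporal windows overlaps the ground-truth window with temporal IoU of at least m. This is a minimal illustrative sketch, not Ego4D's official evaluation code; function names and the interval representation are assumptions.

```python
def temporal_iou(a, b):
    """IoU of two (start, end) temporal intervals, e.g. in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def recall_at_k(predictions, ground_truths, k, iou_thresh):
    """Fraction of queries whose ground truth is hit by a top-k prediction.

    predictions: one ranked list of (start, end) windows per query
    ground_truths: one (start, end) window per query
    """
    hits = 0
    for preds, gt in zip(predictions, ground_truths):
        if any(temporal_iou(p, gt) >= iou_thresh for p in preds[:k]):
            hits += 1
    return hits / len(ground_truths)
```

Under this sketch, `recall_at_k(preds, gts, k=1, iou_thresh=0.3)` would correspond to the R@1 IoU=0.3 column.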

Related Benchmarks

- Ego4D MQ test: Action Localization, Temporal Action Localization, Video, and Zero-Shot Learning, each reporting Average mAP and Recall@1x (tIoU=0.5)
- Ego4D MQ val: Action Localization, Temporal Action Localization, Video, and Zero-Shot Learning, each reporting Average mAP and Recall@1x (tIoU=0.5)
- Ego4D-Goalstep: Temporal Sentence Grounding, Video, and Video Understanding, each reporting R@1 and R@5 at IoU=0.3 and IoU=0.5
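The "Average mAP" reported by temporal localization benchmarks like the ones above is typically mean average precision computed at several tIoU thresholds and then averaged over them. A minimal sketch of that aggregation step, assuming per-threshold mAP values are already available (the threshold set shown is an assumption, not taken from the Ego4D specification):

```python
def average_map(map_at_tiou):
    """Average the per-threshold mAP values.

    map_at_tiou: dict mapping tIoU threshold -> mAP at that threshold.
    """
    return sum(map_at_tiou.values()) / len(map_at_tiou)

# Hypothetical per-threshold results for one model.
per_threshold = {0.1: 0.30, 0.2: 0.27, 0.3: 0.24, 0.4: 0.20, 0.5: 0.15}
avg = average_map(per_threshold)  # single "Average mAP" number
```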

Statistics

Papers: 32
Benchmarks: 17

Links

Homepage

Tasks

- Action Anticipation
- Future Hand Prediction
- Long Term Action Anticipation
- Moment Queries
- Natural Language Queries
- Object State Change Classification
- Short-term Object Interaction Anticipation
- State Change Object Detection
- Temporal Action Localization