Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


AVA

Atomic Visual Actions

Videos · CC BY 4.0 · Introduced 2018-01-01

AVA is a project that provides audiovisual annotations of video to improve our understanding of human activity. Each video clip has been exhaustively annotated by human annotators, and together the clips represent a rich variety of scenes, recording conditions, and expressions of human activity. Annotations are available for:

  • Kinetics (AVA-Kinetics) - a crossover between AVA and Kinetics. To provide localized action labels on a wider variety of visual scenes, the authors add AVA action labels to videos from Kinetics-700, nearly doubling the total number of annotations and increasing the number of unique videos by over 500x.
  • Actions (AVA Actions) - densely annotates 80 atomic visual actions in 430 15-minute movie clips. Actions are localized in space and time, resulting in 1.62M action labels, with multiple labels per person occurring frequently.
  • Spoken Activity (AVA ActiveSpeaker, AVA Speech) - AVA ActiveSpeaker associates speaking activity with a visible face on the AVA v1.0 videos, resulting in 3.65 million labeled frames across ~39K face tracks. AVA Speech densely annotates audio-based speech activity in AVA v1.0 videos and explicitly labels 3 background noise conditions, resulting in ~46K labeled segments spanning 45 hours of data.
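The spatio-temporally localized labels above are distributed as plain CSV rows: a video id, a frame timestamp, a normalized person bounding box, an action id, and a person track id. A minimal parsing sketch follows; the column order is assumed from the public AVA Actions release files, and the sample row is illustrative, not taken from the real dataset:

```python
import csv
import io
from collections import namedtuple

# One AVA Actions annotation row. Assumed column layout (per the public
# AVA Actions CSV releases): video_id, timestamp (seconds), x1, y1, x2, y2
# (box corners normalized to [0, 1]), action_id, person_id.
AvaLabel = namedtuple(
    "AvaLabel",
    ["video_id", "timestamp", "x1", "y1", "x2", "y2", "action_id", "person_id"],
)

def parse_ava_csv(fileobj):
    """Yield AvaLabel records from an AVA-style annotation CSV."""
    for row in csv.reader(fileobj):
        vid, ts, x1, y1, x2, y2, action, person = row
        yield AvaLabel(vid, float(ts), float(x1), float(y1),
                       float(x2), float(y2), int(action), int(person))

# Illustrative row: one person box at timestamp 902 s with one action label.
sample = "-5KQ66BBWC4,902,0.077,0.151,0.283,0.811,80,1\n"
labels = list(parse_ava_csv(io.StringIO(sample)))
```

Because each row carries its own person id, the frequent "multiple labels per person" case appears as several rows sharing the same video id, timestamp, box, and person id but different action ids.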

Benchmarks

  • Image Quality Assessment / Accuracy
  • Node Classification / mAP

Related Benchmarks

  • AVA v2.1 / Action Recognition / GFlops
  • AVA v2.1 / Action Recognition / Params (M)
  • AVA v2.1 / Action Recognition / mAP (Val)
  • AVA v2.1 / Action Recognition In Videos / mAP (Val)
  • AVA v2.1 / Activity Recognition / GFlops
  • AVA v2.1 / Activity Recognition / Params (M)
  • AVA v2.1 / Activity Recognition / mAP (Val)
  • AVA v2.2 / Action Recognition / mAP
  • AVA v2.2 / Action Recognition / mAP (Val)
  • AVA v2.2 / Action Recognition In Videos / mAP (Val)
  • AVA v2.2 / Activity Recognition / mAP
  • AVA v2.2 / Activity Recognition / mAP (Val)
  • AVA-ActiveSpeaker / Action Detection / validation mean average precision
  • AVA-Kinetics / Action Localization / test mAP
  • AVA-Kinetics / Action Localization / val mAP
  • AVA-Speech / Activity Detection / ROC-AUC
  • Avazu / Click-Through Rate Prediction / AUC
  • Avazu / Click-Through Rate Prediction / LogLoss

Statistics

Papers: 113
Benchmarks: 2

Links

Homepage

Tasks

  • Action Detection
  • Action Recognition
  • Action Recognition In Videos
  • Activity Detection
  • Aesthetics Quality Assessment
  • Audio-Visual Active Speaker Detection
  • Gaze Estimation
  • Image Quality Assessment
  • Node Classification
  • Self-Supervised Learning
  • Spatio-Temporal Action Localization
  • Speaker Diarization
  • Speech Enhancement
  • Video Understanding