Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

DiDeMo

Distinct Describable Moments

Modalities: Texts, Videos
License: BSD 2-Clause
Introduced: 2017-01-01

The Distinct Describable Moments (DiDeMo) dataset is one of the largest and most diverse datasets for temporally localizing events in videos from natural language descriptions. The videos are collected from Flickr, and each is trimmed to a maximum of 30 seconds. To reduce annotation complexity, each video is divided into 5-second segments. The dataset is split into training, validation, and test sets containing 8,395, 1,065, and 1,004 videos, respectively. It contains 26,892 moments in total, and a single moment may be associated with descriptions from multiple annotators. The descriptions in the DiDeMo dataset are detailed and mention camera movement, temporal transition indicators, and activities. Moreover, the descriptions are verified so that each one refers to a single moment.
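Because each video is at most 30 seconds long and is cut into 5-second segments, a video has at most six segments, so a moment can be treated as a contiguous span of segment indices. The sketch below enumerates those spans and computes a temporal IoU between two of them. It is only an illustration under stated assumptions: the function names `candidate_moments` and `segment_iou` are hypothetical, spans are assumed to be inclusive index pairs, and the official evaluation code may define overlap differently.

```python
from itertools import combinations_with_replacement

SEGMENT_SECONDS = 5     # segment length stated in the description
MAX_VIDEO_SECONDS = 30  # maximum video length stated in the description


def candidate_moments(video_seconds=MAX_VIDEO_SECONDS):
    """Return every contiguous (start, end) span of segment indices, end inclusive."""
    clipped = min(video_seconds, MAX_VIDEO_SECONDS)
    n_segments = -(-clipped // SEGMENT_SECONDS)  # ceiling division
    return list(combinations_with_replacement(range(n_segments), 2))


def segment_iou(a, b):
    """Temporal IoU of two inclusive segment-index spans (assumed convention)."""
    intersection = max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)
    union = (a[1] - a[0] + 1) + (b[1] - b[0] + 1) - intersection
    return intersection / union


moments = candidate_moments()          # spans for a full 30-second video
overlap = segment_iou((0, 1), (1, 2))  # two 10-second spans sharing one segment
```

For a full-length video this yields 6 * 7 / 2 = 21 candidate spans, which is why moment retrieval on such a segmentation can be framed as ranking a small, fixed set of candidates.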

Source: Weakly Supervised Video Moment Retrieval From Text Queries

Image Source: https://www.di.ens.fr/~miech/datasetviz/

Benchmarks

Video/text-to-video: R@1, R@5, R@10, R@50, Median Rank, Mean Rank
Video/video-to-text: R@1, R@5, R@10, Median Rank, Mean Rank
Video/text-to-videoR@1
Video: R@1,IoU=0.5; R@1,IoU=0.7; R@1,IoU=1.0; R@5,IoU=0.5; R@5,IoU=0.7; R@5,IoU=1.0
Video Retrieval/text-to-video: R@1, R@5, R@10, R@50, Median Rank, Mean Rank
Video Retrieval/video-to-text: R@1, R@5, R@10, Median Rank, Mean Rank
Video Retrieval/text-to-videoR@1
Zero-Shot Video Retrieval/text-to-video: R@1, R@5, R@10, Median Rank
Zero-Shot Video Retrieval/video-to-text: R@1, R@5, R@10, Median Rank
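The retrieval benchmarks above report Recall@K (R@K), Median Rank, and Mean Rank. As a rough illustration of how such numbers are commonly derived from a query-by-candidate similarity matrix, here is a minimal sketch. It assumes one correct candidate per query located at the same index as the query, and it breaks ties in the query's favor. Both are assumptions for this example, the function names are hypothetical, and individual leaderboard entries may use a different protocol.

```python
from statistics import mean, median


def retrieval_ranks(similarity):
    """Return the 1-based rank of the correct candidate for each query.

    similarity[i][j] is the score of query i against candidate j.
    Assumption: the correct candidate for query i sits at index i.
    """
    ranks = []
    for i, row in enumerate(similarity):
        # Candidates scoring strictly higher than the correct one push it down.
        better = sum(1 for score in row if score > row[i])
        ranks.append(better + 1)
    return ranks


def retrieval_metrics(similarity, ks=(1, 5, 10, 50)):
    """Compute R@K (as a percentage), Median Rank, and Mean Rank."""
    ranks = retrieval_ranks(similarity)
    metrics = {f"R@{k}": 100.0 * sum(r <= k for r in ranks) / len(ranks) for k in ks}
    metrics["Median Rank"] = median(ranks)
    metrics["Mean Rank"] = mean(ranks)
    return metrics


# Toy text-to-video example: rows are text queries, columns are videos.
toy_similarity = [
    [0.9, 0.1, 0.2],
    [0.8, 0.3, 0.1],
    [0.2, 0.1, 0.7],
]
toy_metrics = retrieval_metrics(toy_similarity)
```

Transposing the matrix before calling `retrieval_metrics` gives the video-to-text direction under the same assumptions.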

Statistics

Papers: 216
Benchmarks: 38

Tasks

Natural Language Moment Retrieval, Video, Video Retrieval, Zero-Shot Video Retrieval