Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


DiDeMo

Distinct Describable Moments

Texts, Videos | BSD 2-Clause | Introduced 2017-01-01

The Distinct Describable Moments (DiDeMo) dataset is one of the largest and most diverse datasets for temporally localizing events in videos from natural language descriptions. The videos are collected from Flickr, and each is trimmed to a maximum of 30 seconds. To reduce the complexity of annotation, each video is divided into 5-second segments. The dataset is split into training, validation, and test sets containing 8,395, 1,065, and 1,004 videos, respectively. It contains a total of 26,892 moments, and a single moment may be associated with descriptions from multiple annotators. The descriptions are detailed and mention camera movement, temporal transition indicators, and activities. Each description is also verified to refer to a single moment.
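The segment arithmetic above can be sketched in a few lines of Python. This is only an illustration: the 30-second cap and 5-second segment length come from the description, while the function name `candidate_moments`, the whole-second durations, and the assumption that a moment is a contiguous run of segments are hypothetical choices made here, not part of any official DiDeMo tooling.

```python
from itertools import combinations_with_replacement

SEGMENT_SECONDS = 5     # segment length stated in the description
MAX_VIDEO_SECONDS = 30  # maximum trimmed video length stated in the description


def candidate_moments(video_seconds=MAX_VIDEO_SECONDS,
                      segment_seconds=SEGMENT_SECONDS):
    """Return every contiguous run of segments as (start_sec, end_sec).

    Assumes whole-second durations and that a moment always starts and
    ends on a segment boundary (an illustrative assumption).
    """
    clipped = min(video_seconds, MAX_VIDEO_SECONDS)
    # Ceiling division so a trailing partial segment still counts.
    n_segments = -(-clipped // segment_seconds)
    moments = []
    # Pairs (first, last) with first <= last enumerate all contiguous runs.
    for first, last in combinations_with_replacement(range(n_segments), 2):
        start = first * segment_seconds
        end = min((last + 1) * segment_seconds, clipped)
        moments.append((start, end))
    return moments


if __name__ == "__main__":
    moments = candidate_moments()
    print(len(moments), "candidate moments for a full-length video")
```

Under these assumptions a full 30-second video has six segments, so a localization model only has to rank a small, fixed set of candidate spans instead of regressing arbitrary timestamps.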

Source: Weakly Supervised Video Moment Retrieval From Text Queries

Image Source: https://www.di.ens.fr/~miech/datasetviz/

Benchmarks

Video/text-to-video R@1
Video/text-to-video R@5
Video/text-to-video R@10
Video/text-to-video R@50
Video/text-to-video Median Rank
Video/text-to-video Mean Rank
Video/video-to-text R@1
Video/video-to-text R@5
Video/video-to-text R@10
Video/video-to-text Median Rank
Video/video-to-text Mean Rank
Video/text-to-videoR@1
Video/R@1,IoU=0.5
Video/R@1,IoU=0.7
Video/R@1,IoU=1.0
Video/R@5,IoU=0.5
Video/R@5,IoU=0.7
Video/R@5,IoU=1.0
Video Retrieval/text-to-video R@1
Video Retrieval/text-to-video R@5
Video Retrieval/text-to-video R@10
Video Retrieval/text-to-video R@50
Video Retrieval/text-to-video Median Rank
Video Retrieval/text-to-video Mean Rank
Video Retrieval/video-to-text R@1
Video Retrieval/video-to-text R@5
Video Retrieval/video-to-text R@10
Video Retrieval/video-to-text Median Rank
Video Retrieval/video-to-text Mean Rank
Video Retrieval/text-to-videoR@1
Zero-Shot Video Retrieval/text-to-video R@1
Zero-Shot Video Retrieval/text-to-video R@5
Zero-Shot Video Retrieval/text-to-video R@10
Zero-Shot Video Retrieval/video-to-text R@1
Zero-Shot Video Retrieval/video-to-text R@5
Zero-Shot Video Retrieval/video-to-text R@10
Zero-Shot Video Retrieval/text-to-video Median Rank
Zero-Shot Video Retrieval/video-to-text Median Rank
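The page lists only metric names, so the following Python sketch shows how such numbers are conventionally computed. It assumes the usual definitions: R@K is the fraction of queries whose correct item appears in the top K results, and "R@K,IoU=m" counts a query as a hit when any of the top K predicted moments overlaps the ground-truth moment with temporal IoU of at least m. The function names and input layouts are hypothetical, and individual leaderboard entries may follow different evaluation protocols.

```python
from statistics import mean, median


def temporal_iou(pred, gt):
    """Intersection over union of two (start_sec, end_sec) intervals."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_k_iou(ranked_preds, gts, k, iou_threshold):
    """Fraction of queries with a top-k moment meeting the IoU threshold.

    ranked_preds: one best-first list of (start, end) moments per query.
    gts: one ground-truth (start, end) moment per query.
    """
    hits = [
        any(temporal_iou(p, gt) >= iou_threshold for p in preds[:k])
        for preds, gt in zip(ranked_preds, gts)
    ]
    return sum(hits) / len(hits)


def retrieval_summary(ranks, ks=(1, 5, 10, 50)):
    """R@K, median rank, and mean rank from 1-based ranks of the correct item."""
    summary = {f"R@{k}": sum(r <= k for r in ranks) / len(ranks) for k in ks}
    summary["Median Rank"] = median(ranks)
    summary["Mean Rank"] = mean(ranks)
    return summary


if __name__ == "__main__":
    # Toy inputs for illustration only; these are not DiDeMo results.
    preds = [[(0, 5), (5, 10)], [(10, 15)]]
    gts = [(5, 10), (0, 5)]
    print(recall_at_k_iou(preds, gts, k=5, iou_threshold=0.7))
    print(retrieval_summary([1, 3, 7, 60]))
```

Higher is better for the recall metrics and lower is better for median and mean rank. With IoU=1.0 a prediction only counts when it matches the ground-truth interval exactly.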

Statistics

Papers: 216
Benchmarks: 38

Links

Homepage

Tasks

Natural Language Moment Retrieval, Video, Video Retrieval, Zero-Shot Video Retrieval