DailyMoth-70h

Texts · Videos · CC BY-NC 4.0 · Introduced 2024-02-14

DailyMoth-70h is a fully self-contained ASL-to-English sign language dataset. It contains over 70 hours of video (48K clips) of a single native ASL signer (white, male, and early middle-aged) from the ASL news channel TheDailyMoth, together with aligned English captions. The dataset is primarily intended as a benchmark and analysis dataset for (gloss-free) sign language translation.

The dataset consists of four parts:

  • raw_videos: Contains the unsegmented DailyMoth videos (496 in total) with blurring applied to the burnt-in captions, advertisement breaks, and news headline banners

  • blurred_clips: Contains the segmented video clips (48386 in total) with facial blurring applied. Each clip is provided at its native frame rate (24, 29.97, or 30 fps) as a 224x224px region-of-interest (ROI) crop around the signer

  • unblurred_clips: Contains the same segmented video clips (48386 in total) without facial blurring. Each clip is provided at its native frame rate (24, 29.97, or 30 fps) as a 224x224px region-of-interest (ROI) crop around the signer

  • manifests: Contains the manifest TSV files for training, validation, and testing. Also contains a combined manifest file and a TSV file with the start and end timestamps used to segment the raw videos
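
As a rough illustration of how the manifests and segmentation timestamps might be used, the sketch below reads a tab-separated file and converts a start/end timestamp pair into a frame index range at one of the native frame rates. It is not taken from the dataset's own tooling. The column names (`clip_id`, `start`, `end`), the presence of a header row, and timestamps expressed in seconds are all assumptions made for illustration, and the helpers `read_manifest` and `frame_range` are hypothetical; check the actual files before relying on any of this.

```python
import csv
import io
import math


def read_manifest(fileobj, delimiter="\t"):
    """Read a TSV file with a header row into a list of dicts.

    Assumption: the first row names the columns. All values stay strings.
    """
    return list(csv.DictReader(fileobj, delimiter=delimiter))


def frame_range(start_s, end_s, fps):
    """Map a [start, end) interval in seconds to frame indices [first, last).

    The start is rounded down and the end is rounded up, so the range
    covers every frame that overlaps the interval.
    """
    first = math.floor(start_s * fps)
    last = math.ceil(end_s * fps)
    return first, last


# Hypothetical sample in place of a real manifest file.
sample = io.StringIO("clip_id\tstart\tend\nabc\t1.0\t2.5\n")

for row in read_manifest(sample):
    first, last = frame_range(float(row["start"]), float(row["end"]), fps=24)
    print(row["clip_id"], first, last)  # prints: abc 24 60
```

For a real file, `read_manifest` would be called on a handle opened with `open(path, newline="", encoding="utf-8")`. Because the clips come in three different frame rates, the frame rate should be read per video rather than fixed globally.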

Detailed dataset statistics are listed in https://arxiv.org/abs/2402.09611.

The dataset is available for download at https://github.com/facebookresearch/ssvp_slt?tab=readme-ov-file#dailymoth-70h.