Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

CREMA-D

Audio

CREMA-D is an emotional multimodal actor data set of 7,442 original clips from 91 actors: 48 male and 43 female, between the ages of 20 and 74, from a variety of races and ethnicities (African American, Asian, Caucasian, Hispanic, and Unspecified).

Actors spoke from a selection of 12 sentences. Each sentence was presented using one of six emotions (Anger, Disgust, Fear, Happy, Neutral, and Sad) and one of four emotion levels (Low, Medium, High, and Unspecified).
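The label space described above (six emotions, four levels) can be sketched in code. The snippet below is a minimal illustration, not an official loader. It assumes clip names follow an underscore-separated pattern such as `1001_DFA_ANG_XX.wav` (actor ID, sentence code, emotion code, level code). That naming scheme and the short codes are assumptions not stated on this page, and `parse_clip_name` is a hypothetical helper.

```python
from typing import NamedTuple

# Emotion and level names come from the description above.
# The short codes (ANG, LO, ...) are an ASSUMED filename convention.
EMOTIONS = {
    "ANG": "Anger", "DIS": "Disgust", "FEA": "Fear",
    "HAP": "Happy", "NEU": "Neutral", "SAD": "Sad",
}
LEVELS = {"LO": "Low", "MD": "Medium", "HI": "High", "XX": "Unspecified"}


class ClipLabel(NamedTuple):
    actor_id: int
    sentence: str
    emotion: str
    level: str


def parse_clip_name(name: str) -> ClipLabel:
    """Split a bare clip filename into its four assumed fields."""
    stem = name.rsplit(".", 1)[0]  # drop the extension if there is one
    actor, sentence, emotion_code, level_code = stem.split("_")
    return ClipLabel(int(actor), sentence, EMOTIONS[emotion_code], LEVELS[level_code])


if __name__ == "__main__":
    print(parse_clip_name("1001_DFA_ANG_XX.wav"))
    print(len(EMOTIONS) * len(LEVELS), "emotion/level combinations")
```

An unknown code raises `KeyError`, which makes unexpected filenames fail loudly rather than being silently mislabeled.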

Participants rated the emotion and emotion levels based on the combined audio-visual presentation, the video alone, and the audio alone. Due to the large number of ratings needed, this effort was crowd-sourced, and a total of 2,443 participants each rated 90 unique clips: 30 audio, 30 visual, and 30 audio-visual. 95% of the clips have more than 7 ratings.
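The rating figures above can be checked with a quick back-of-the-envelope calculation. The sketch below assumes every participant completed all 90 ratings and that each of the 7,442 clips was rated separately in each of the three presentation modes. Both are simplifying assumptions, so the result is an upper bound rather than a reported statistic.

```python
# Figures taken from the description above.
PARTICIPANTS = 2443
RATINGS_PER_PARTICIPANT = 90   # 30 audio + 30 visual + 30 audio-visual
CLIPS = 7442
MODES = 3                      # audio, visual, audio-visual


def rating_budget() -> tuple[int, int, float]:
    """Return (total ratings, rated stimuli, mean ratings per stimulus).

    Assumes full participation; actual totals may be lower.
    """
    total_ratings = PARTICIPANTS * RATINGS_PER_PARTICIPANT
    stimuli = CLIPS * MODES
    return total_ratings, stimuli, total_ratings / stimuli


if __name__ == "__main__":
    total, stimuli, mean = rating_budget()
    print(f"{total} ratings over {stimuli} clip/mode pairs, about {mean:.1f} each")
```

Under these assumptions the mean comes out a little under 10 ratings per clip and mode, which is consistent with the statement that 95% of the clips have more than 7 ratings.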

Benchmarks

1 Image, 2*2 Stitchi: EmoAcc, FID, LSE-C
10-shot image generation: EmoAcc, FID, LSE-C
3D: UAR, EmoAcc, FID, LSE-C
3D Face Modelling: UAR, EmoAcc, FID, LSE-C
3D Face Reconstruction: UAR, EmoAcc, FID, LSE-C
Audio Classification: Accuracy
Classification: Accuracy
Emotion Recognition: Accuracy, WAR
Face Generation: EmoAcc, FID, LSE-C
Face Reconstruction: UAR, EmoAcc, FID, LSE-C
Facial Expression Recognition (FER): UAR
Facial Recognition and Modelling: UAR, EmoAcc, FID, LSE-C
Few-Shot Learning: Top-1 Accuracy (5-Way-1-Shot)
Image Generation: EmoAcc, FID, LSE-C
Meta-Learning: Top-1 Accuracy (5-Way-1-Shot)
Self-Supervised Learning: Accuracy
Speech Emotion Recognition: Accuracy
Talking Face Generation: EmoAcc, FID, LSE-C

Statistics

Papers: 28
Benchmarks: 44

Tasks

1 Image, 2*2 Stitchi
10-shot image generation
3D
3D Face Modelling
3D Face Reconstruction
Audio Classification
Classification
Emotion Recognition
Face Generation
Face Reconstruction
Facial Expression Recognition (FER)
Facial Recognition and Modelling
Few-Shot Audio Classification
Few-Shot Learning
Image Generation
Meta-Learning
Self-Supervised Learning
Speech Emotion Recognition
Talking Face Generation
Video Emotion Recognition