Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Dialogue

183 benchmarks, 0 papers

Dialogue is notoriously hard to evaluate. Past approaches have used human evaluation.

Benchmarks

Dialogue on Visual Dialog v1.0 test-std

NDCG (x 100) MRR (x 100) R@1 R@5 Mean R@10

Dialogue on VisDial v0.9 val

R@1 R@10 R@5 Mean Rank MRR

Dialogue on Fluent Speech Commands

Dialogue on Switchboard corpus

Dialogue on KVRET

Dialogue on Wizard-of-Oz

Dialogue on CoSQL

question match accuracy interaction match accuracy

Dialogue on LRE07

3 sec 10 sec 30 sec Average

Dialogue on ICSI Meeting Recorder Dialog Act (MRDA) corpus

Dialogue on Second dialogue state tracking challenge

Joint Area Food Price Request

Dialogue on Snips-SmartLights

Dialogue on MULTIWOZ 2.0

MultiWOZ (Success) MultiWOZ (Inform) BLEU BLEU-4 Score

Dialogue on Persona-Chat

Avg F1 BLEU-1 BLEU-2 Distinct-1 Distinct-2 CIDEr METEOR ROUGE-L

Dialogue on SIMMC2.0

Dialogue on Snips-SmartSpeaker

Accuracy-EN (%) Accuracy-FR (%)

Dialogue on VoxForge European

Dialogue on YouTube News dataset (No Noise)

Accuracy F1 Score

Dialogue on YouTube News dataset (White Noise)

Accuracy F1 Score

Dialogue on irc-disentanglement

Dialogue on rt-inod-jailbreaking

Dialogue on MULTIWOZ 2.1

BLEU MultiWOZ (Inform) MultiWOZ (Success) Joint Acc MultiWOZ (Joint Goal Acc)

Dialogue on OpenViDial 2.0

BLEU Dis-1 Dis-2 Dis-3 Dis-4

Dialogue on Spoken-SQuAD

Dialogue on VoxForge Commonwealth

Dialogue on DeliData

Dialogue on IndicTTS

Classification Accuracy

Dialogue on Linux IRC (Ch2 Elsner)

1-1 Local Shen F-1

Dialogue on Linux IRC (Ch2 Kummerfeld)

1-1 Local Shen F-1

Dialogue on Timers and Such

Dialogue on VoxForge

Dialogue on EmpatheticDialogues

BLEU BLEU-4 F1 ROUGE-L

Dialogue on FusedChat

Slot Accuracy Joint SA Inform Inform_mct Success Success_mct BLEU PPL Sensibleness Specificity SSA

Dialogue on Harry Potter Dialogue Dataset

mauve Recall 10@1

Dialogue on KALAKA-3

Dialogue on MMConv

Categorical Accuracy Non-Categorical Accuracy Overall

Dialogue on MULTIWOZ 2.2

MultiWOZ (Joint Goal Acc)

Dialogue on SGD

Dialogue on VOXLINGUA107

0..5sec 5..20sec Average

Dialogue on VisDial v1.0 test-std

MRR Mean Rank NDCG R@1 R@10 R@5

Dialogue on YouTube News dataset (Background Music)

Accuracy F1 Score

Dialogue on YouTube News dataset (Crackling Noise)

Accuracy F1 Score

Dialogue on ABCD

In-domain EM In-domain CE Cross-domain EM Cross-domain CE

Dialogue on Amazon-5

Dialogue on BlendedSkillTalk

BLEU-4 F1 ROUGE-L

Dialogue on CMU-DoG

F1 Meteor ROUGE-1 Rouge-L

Dialogue on ConvAI2

BLEU-4 F1 ROUGE-L

Dialogue on DSTC9 Track 3 - Task 2

Overall Human Rating Coherent Error Recovery Consistent Diversity Topic Depth Likeable Understanding Flexible Informative Inquisitive

Dialogue on EMOTyDA

Dialogue on GIF Reply Dataset

Dialogue on Image-Chat

BLEU-4 F1 ROUGE-L

Dialogue on Kvret

Entity F1 BLEU Embedding Average Greedy Matching Vector Extrema

Dialogue on PG-19

Dialogue on ProsocialDialog

Dialogue on Reddit (multi-ref)

interest (human) relevance (human)

Dialogue on SSD_NAME

Dialogue Success Rate Joint Acc Slot Acc

Dialogue on Switchboard Dialog Act Corpus

Dialogue on Switchboard dialogue act corpus

Dialogue on Twitter Dialogue (Noun)

F1 Precision Recall

Dialogue on Ubuntu Dialogue (Activity)

F1 Precision Recall

Dialogue on Ubuntu Dialogue (Entity)

F1 Precision Recall

Dialogue on Wizard of Wikipedia

BLEU-4 F1 ROUGE-L

Dialogue on automata

Dialogue Success Rate

Dialogue on Twitter Dialogue (Tense)

Dialogue on Ubuntu Dialogue (Cmd)

Dialogue on Ubuntu Dialogue (Tense)

Dialogue on Untranscribed mixed-speech dataset