Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Dialogue

183 benchmarks, 0 papers

Dialogue is notoriously hard to evaluate. Past approaches have used human evaluation.

Benchmarks

Dialogue on Visual Dialog v1.0 test-std

NDCG (x 100), MRR (x 100), R@1, R@5, Mean, R@10

Dialogue on VisDial v0.9 val

R@1, R@10, R@5, Mean Rank, MRR

Dialogue on Fluent Speech Commands

Accuracy (%)

Dialogue on Switchboard corpus

Accuracy

Dialogue on KVRET

Entity F1, BLEU

Dialogue on Wizard-of-Oz

Joint, Request

Dialogue on CoSQL

question match accuracy, interaction match accuracy

Dialogue on LRE07

3 sec, 10 sec, 30 sec, Average

Dialogue on ICSI Meeting Recorder Dialog Act (MRDA) corpus

Accuracy

Dialogue on Second dialogue state tracking challenge

Joint, Area, Food, Price, Request

Dialogue on Snips-SmartLights

Accuracy (%)

Dialogue on MULTIWOZ 2.0

MultiWOZ (Success), MultiWOZ (Inform), BLEU, BLEU-4, Score

Dialogue on Persona-Chat

Avg F1, BLEU-1, BLEU-2, Distinct-1, Distinct-2, CIDr, METEOR, ROUGE-L

Dialogue on SIMMC2.0

Act F1, Slot F1

Dialogue on Snips-SmartSpeaker

Accuracy-EN (%), Accuracy-FR (%)

Dialogue on VoxForge European

Accuracy (%)

Dialogue on YouTube News dataset (No Noise)

Accuracy F1 Score

Dialogue on YouTube News dataset (White Noise)

Accuracy F1 Score

Dialogue on irc-disentanglement

VI, P, R, F, 1-1

Dialogue on rt-inod-jailbreaking

Best-of

Dialogue on MULTIWOZ 2.1

BLEU, MultiWOZ (Inform), MultiWOZ (Success), Joint Acc, MultiWOZ (Joint Goal Acc)

Dialogue on OpenViDial 2.0

BLEU, Dis-1, Dis-2, Dis-3, Dis-4

Dialogue on Spoken-SQuAD

F1 score

Dialogue on VoxForge Commonwealth

Accuracy (%)

Dialogue on DeliData

AUC

Dialogue on IndicTTS

Classification Accuracy

Dialogue on Linux IRC (Ch2 Elsner)

1-1, Local, Shen F-1

Dialogue on Linux IRC (Ch2 Kummerfeld)

1-1, Local, Shen F-1

Dialogue on Timers and Such

Accuracy (%)

Dialogue on VoxForge

Accuracy

Dialogue on EmpatheticDialogues

BLEU, BLEU-4, F1, ROUGE-L

Dialogue on FusedChat

Slot Accuracy, Joint SA, Inform, Inform_mct, Success, Success_mct, BLEU, PPL, Sensibleness, Specificity, SSA

Dialogue on Harry Potter Dialogue Dataset

mauve, Recall 10@1

Dialogue on KALAKA-3

PC, EC, EO, PO

Dialogue on MMConv

Categorical Accuracy, Non-Categorical Accuracy, Overall

Dialogue on MULTIWOZ 2.2

MultiWOZ (Joint Goal Acc)

Dialogue on SGD

METEOR

Dialogue on VOXLINGUA107

0..5sec, 5..20sec, Average

Dialogue on VisDial v1.0 test-std

MRR, Mean Rank, NDCG, R@1, R@10, R@5

Dialogue on YouTube News dataset (Background Music)

Accuracy F1 Score

Dialogue on YouTube News dataset (Crackling Noise)

Accuracy F1 Score

Dialogue on ABCD

In-domain EM, In-domain CE, Cross-domain EM, Cross-domain CE

Dialogue on Amazon-5

1 in 10 R@2

Dialogue on BlendedSkillTalk

BLEU-4, F1, ROUGE-L

Dialogue on CMU-DoG

F1, Meteor, ROUGE-1, Rouge-L

Dialogue on ConvAI2

BLEU-4, F1, ROUGE-L

Dialogue on DSTC9 Track 3 - Task 2

Overall Human Rating, Coherent, Error Recovery, Consistent, Diversity, Topic Depth, Likeable, Understanding, Flexible, Informative, Inquisitive

Dialogue on EMOTyDA

Accuracy

Dialogue on GIF Reply Dataset

nDCG@10

Dialogue on Image-Chat

BLEU-4, F1, ROUGE-L

Dialogue on Kvret

Entity F1, BLEU, Embedding Average, Greedy Matching, Vector Extrema

Dialogue on PG-19

Perplexity

Dialogue on ProsocialDialog

Accuracy

Dialogue on Reddit (multi-ref)

interest (human), relevance (human)

Dialogue on SSD_NAME

Dialogue Success Rate, Joint Acc, Slot Acc

Dialogue on Switchboard Dialog Act Corpus

Accuracy

Dialogue on Switchboard dialogue act corpus

Accuracy

Dialogue on Twitter Dialogue (Noun)

F1, Precision, Recall

Dialogue on Ubuntu Dialogue (Activity)

F1, Precision, Recall

Dialogue on Ubuntu Dialogue (Entity)

F1, Precision, Recall

Dialogue on Wizard of Wikipedia

BLEU-4, F1, ROUGE-L

Dialogue on automata

Dialogue Success Rate

Dialogue on Twitter Dialogue (Tense)

Accuracy

Dialogue on Ubuntu Dialogue (Cmd)

Accuracy

Dialogue on Ubuntu Dialogue (Tense)

Accuracy

Dialogue on Untranscribed mixed-speech dataset

ACC, PRC, RCL