Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


MuLD

Multitask Long Document Benchmark

Texts | Custom | Introduced 2022-02-15

MuLD (Multitask Long Document Benchmark) is a set of 6 NLP tasks in which every input consists of at least 10,000 words. The benchmark covers a wide variety of task types, including translation, summarization, question answering, and classification. Additionally, the required outputs range in length from a single-word classification label all the way up to an output longer than the input text.
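The 10,000-word minimum can be checked with simple whitespace tokenisation; the function below is an illustrative sketch, not part of the benchmark's own tooling, and the benchmark's preprocessing may count words differently.

```python
def qualifies_for_muld(document: str, min_words: int = 10_000) -> bool:
    """Check MuLD's minimum input-length criterion (>= 10,000 words).

    Whitespace splitting is an approximation of word counting.
    """
    return len(document.split()) >= min_words


# Illustrative usage on synthetic "documents":
long_doc = "word " * 12_000   # 12,000 tokens -> qualifies
short_doc = "word " * 500     # 500 tokens -> too short
```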

Related Benchmarks

MuLD (Character Type) / Classification: F1
MuLD (Character Type) / Text Classification: F1
MuLD (HotpotQA) / Question Answering: BLEU-1, BLEU-4, METEOR, Rouge-L
MuLD (NarrativeQA) / Question Answering: BLEU-1, BLEU-4, METEOR, Rouge-L
MuLD (OpenSubtitles) / Translation: BLEU-1, BLEU-4, METEOR, Rouge-L
MuLD (VLSP) / Summarization: BLEU-1, BLEU-4, METEOR, Rouge-L
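For the generation-style tasks, the benchmarks above score outputs with n-gram overlap metrics. The sketch below shows simplified sentence-level, single-reference versions of BLEU-1 (clipped unigram precision with a brevity penalty) and Rouge-L (LCS-based F1); it is not the benchmark's official scoring code, which would typically use library implementations with smoothing and multi-reference support.

```python
from collections import Counter
import math


def bleu1(candidate: list[str], reference: list[str]) -> float:
    """Sentence-level BLEU-1: clipped unigram precision * brevity penalty."""
    if not candidate:
        return 0.0
    cand_counts = Counter(candidate)
    ref_counts = Counter(reference)
    overlap = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    precision = overlap / len(candidate)
    if len(candidate) >= len(reference):
        bp = 1.0  # no brevity penalty for candidates at least as long
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * precision


def rouge_l_f1(candidate: list[str], reference: list[str]) -> float:
    """Rouge-L: F1 over the longest common subsequence of the two token lists."""
    m, n = len(candidate), len(reference)
    # Dynamic-programming table for LCS length.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if candidate[i] == reference[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    lcs = dp[m][n]
    if lcs == 0:
        return 0.0
    p, r = lcs / m, lcs / n
    return 2 * p * r / (p + r)
```

An exact match scores 1.0 on both metrics, while disjoint token sequences score 0.0.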

Statistics

Papers: 3
Benchmarks: 0

Links

Homepage

Tasks

Long-range modeling, Natural Language Understanding, Question Answering, Style change detection, Summarization, Text Classification, Translation