Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


MuLD

Multitask Long Document Benchmark

Texts | Custom | Introduced 2022-02-15

MuLD (Multitask Long Document Benchmark) is a set of 6 NLP tasks in which every input consists of at least 10,000 words. The benchmark covers a wide variety of task types, including translation, summarization, question answering, and classification. Additionally, the required outputs range in length from a single-word classification label all the way up to an output longer than the input text.
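The 10,000-word minimum can be checked with simple whitespace tokenisation; the function below is an illustrative sketch, not part of the benchmark's own tooling, and the benchmark's preprocessing may count words differently.

```python
def qualifies_for_muld(document: str, min_words: int = 10_000) -> bool:
    """Check MuLD's minimum input-length criterion (>= 10,000 words).

    Whitespace splitting is an approximation of word counting.
    """
    return len(document.split()) >= min_words


# Illustrative usage on synthetic "documents":
long_doc = "word " * 12_000   # 12,000 tokens -> qualifies
short_doc = "word " * 500     # 500 tokens -> too short
```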

Related Benchmarks

MuLD (Character Type) / Classification: F1
MuLD (Character Type) / Text Classification: F1
MuLD (HotpotQA) / Question Answering: BLEU-1, BLEU-4, METEOR, Rouge-L
MuLD (NarrativeQA) / Question Answering: BLEU-1, BLEU-4, METEOR, Rouge-L
MuLD (OpenSubtitles) / Translation: BLEU-1, BLEU-4, METEOR, Rouge-L
MuLD (VLSP) / Summarization: BLEU-1, BLEU-4, METEOR, Rouge-L
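For the generation-style tasks, the benchmarks above score outputs with n-gram overlap metrics. The sketch below shows simplified sentence-level, single-reference versions of BLEU-1 (clipped unigram precision with a brevity penalty) and Rouge-L (LCS-based F1); it is not the benchmark's official scoring code, which would typically use library implementations with smoothing and multi-reference support.

```python
from collections import Counter
import math


def bleu1(candidate: list[str], reference: list[str]) -> float:
    """Sentence-level BLEU-1: clipped unigram precision * brevity penalty."""
    if not candidate:
        return 0.0
    cand_counts = Counter(candidate)
    ref_counts = Counter(reference)
    overlap = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    precision = overlap / len(candidate)
    if len(candidate) >= len(reference):
        bp = 1.0  # no brevity penalty for candidates at least as long
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * precision


def rouge_l_f1(candidate: list[str], reference: list[str]) -> float:
    """Rouge-L: F1 over the longest common subsequence of the two token lists."""
    m, n = len(candidate), len(reference)
    # Dynamic-programming table for LCS length.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if candidate[i] == reference[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    lcs = dp[m][n]
    if lcs == 0:
        return 0.0
    p, r = lcs / m, lcs / n
    return 2 * p * r / (p + r)
```

An exact match scores 1.0 on both metrics, while disjoint token sequences score 0.0.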

Statistics

Papers: 3
Benchmarks: 0

Links

Homepage

Tasks

Long-range modeling, Natural Language Understanding, Question Answering, Style change detection, Summarization, Text Classification, Translation