ES³ Large + extLM

Reported on 2 benchmarks across 2 tasks

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Computer Vision: 1 result

  • Word Error Rate (WER) · uses extra data
    24.6
    best: 14.6 (Auto-AVSR)

Natural Language Processing: 1 result