Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

GPT-4-Turbo-0125

Reported on 17 benchmarks across 1 task · 1 paper · 8 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Natural Language Processing: 17 results

  • Long-Context Understanding on Ada-LEval (BestAnswer)
    12k · 2023-03-15
    52
    SOTA
    GPT-4 Technical Report, arXiv:2303.08774
  • Long-Context Understanding on Ada-LEval (BestAnswer)
    16k · 2023-03-15
    44.5
    SOTA
    GPT-4 Technical Report, arXiv:2303.08774
  • Long-Context Understanding on Ada-LEval (BestAnswer)
    32k · 2023-03-15
    30
    SOTA
    GPT-4 Technical Report, arXiv:2303.08774
  • Long-Context Understanding on Ada-LEval (BestAnswer)
    6k · 2023-03-15
    63
    SOTA
    GPT-4 Technical Report, arXiv:2303.08774
  • Long-Context Understanding on Ada-LEval (BestAnswer)
    8k · 2023-03-15
    56.5
    SOTA
    GPT-4 Technical Report, arXiv:2303.08774
  • Long-Context Understanding on Ada-LEval (TSort)
    16k · 2023-03-15
    5.5
    SOTA
    GPT-4 Technical Report, arXiv:2303.08774
  • Long-Context Understanding on Ada-LEval (TSort)
    4k · 2023-03-15
    16.5
    SOTA
    GPT-4 Technical Report, arXiv:2303.08774
  • Long-Context Understanding on Ada-LEval (TSort)
    8k · 2023-03-15
    8.5
    SOTA
    GPT-4 Technical Report, arXiv:2303.08774
  • Long-Context Understanding on Ada-LEval (BestAnswer)
    1k · 2023-03-15
    73.5
    best: 74 (GPT-4-Turbo-1106)
    GPT-4 Technical Report, arXiv:2303.08774
  • Long-Context Understanding on Ada-LEval (BestAnswer)
    2k · 2023-03-15
    73.5
    GPT-4 Technical Report, arXiv:2303.08774
  • Long-Context Understanding on Ada-LEval (BestAnswer)
    4k · 2023-03-15
    65.5
    best: 67.5 (GPT-4-Turbo-1106)
    GPT-4 Technical Report, arXiv:2303.08774
  • Long-Context Understanding on Ada-LEval (TSort)
    128k · 2023-03-15
    2
    best: 6 (GPT-4-Turbo-1106)
    GPT-4 Technical Report, arXiv:2303.08774
  • Long-Context Understanding on Ada-LEval (TSort)
    2k · 2023-03-15
    15.5
    best: 18.5 (GPT-4-Turbo-1106)
    GPT-4 Technical Report, arXiv:2303.08774
  • Long-Context Understanding on Ada-LEval (TSort)
    32k · 2023-03-15
    2
    best: 6 (GPT-4-Turbo-1106)
    GPT-4 Technical Report, arXiv:2303.08774
  • Long-Context Understanding on Ada-LEval (TSort)
    64k · 2023-03-15
    4
    best: 6 (GPT-4-Turbo-1106)
    GPT-4 Technical Report, arXiv:2303.08774
  • Long-Context Understanding on Ada-LEval (BestAnswer)
    128k
    0
  • Long-Context Understanding on Ada-LEval (BestAnswer)
    64k
    0
    best: 0.5 (InternLM2-7b)