Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

VGT

Reported on 12 benchmarks across 2 tasks · 2 papers · 8 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
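
As a rough illustration of the note above, the sketch below groups result records by exact model name. The record layout and the `group_by_exact_name` helper are hypothetical and are not the site's actual matching code; the two paper titles and arXiv IDs are the ones listed on this page. It shows how two unrelated models that share the name "VGT" end up under one key.

```python
from collections import defaultdict

# Hypothetical records (model name, paper title, arXiv id),
# using the two papers listed on this page.
results = [
    ("VGT", "Vision Grid Transformer for Document Layout Analysis", "2308.14978"),
    ("VGT", "Video Graph Transformer for Video Question Answering", "2207.05342"),
]


def group_by_exact_name(records):
    """Group distinct papers by exact, case-sensitive model name."""
    grouped = defaultdict(set)
    for name, title, arxiv_id in records:
        grouped[name].add((title, arxiv_id))
    return dict(grouped)


papers_by_model = group_by_exact_name(results)
# Both papers land under the single key "VGT".
print(len(papers_by_model["VGT"]))
```

Because the key is only the name string, the grouping cannot tell the document layout model from the video question answering model; a per-paper identifier such as the arXiv ID would be needed to separate them.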

Computer Vision · 7 results

  • Document Layout Analysis on D4LA
    mAP · 2023-08-29
    68.8
    best: 70.72 (DoPTA)
    SOTA
    Vision Grid Transformer for Document Layout Analysis · arXiv:2308.14978
  • Document Layout Analysis on PubLayNet val
    List · 2023-08-29
    0.968
    best: 0.975 (TRDLU)
    SOTA
    Vision Grid Transformer for Document Layout Analysis · arXiv:2308.14978
  • Document Layout Analysis on PubLayNet val
    Overall · 2023-08-29
    0.962
    SOTA
    Vision Grid Transformer for Document Layout Analysis · arXiv:2308.14978
  • Document Layout Analysis on PubLayNet val
    Title · 2023-08-29
    0.939
    SOTA
    Vision Grid Transformer for Document Layout Analysis · arXiv:2308.14978
  • Document Layout Analysis on PubLayNet val
    Figure · 2023-08-29
    0.971
    best: 0.975 (DETR)
    Vision Grid Transformer for Document Layout Analysis · arXiv:2308.14978
  • Document Layout Analysis on PubLayNet val
    Table · 2023-08-29
    0.981
    Vision Grid Transformer for Document Layout Analysis · arXiv:2308.14978
  • Document Layout Analysis on PubLayNet val
    Text · 2023-08-29
    0.95
    best: 0.967 (VSR)
    Vision Grid Transformer for Document Layout Analysis · arXiv:2308.14978

Reasoning · 5 results

  • Video Question Answering on IntentQA
    Accuracy · 2022-07-12
    51.3
    best: 83.4 (VideoChat2_HD_mistral)
    SOTA
    Video Graph Transformer for Video Question Answering · arXiv:2207.05342
  • Video Question Answering on IntentQA
    CH · 2022-07-12
    56
    best: 90 (VideoChat2_HD_mistral)
    SOTA
    Video Graph Transformer for Video Question Answering · arXiv:2207.05342
  • Video Question Answering on IntentQA
    CW · 2022-07-12
    51.4
    best: 84 (VideoChat2_HD_mistral)
    SOTA
    Video Graph Transformer for Video Question Answering · arXiv:2207.05342
  • Video Question Answering on IntentQA
    TP&TN · 2022-07-12
    47.6
    best: 79.1 (Human)
    SOTA
    Video Graph Transformer for Video Question Answering · arXiv:2207.05342
  • Video Question Answering on NExT-QA
    Accuracy · 2022-07-12
    55
    best: 85.5 (LinVT-Qwen2-VL (7B))
    Video Graph Transformer for Video Question Answering · arXiv:2207.05342