Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Video on MiT

Metric: Top 5 Accuracy (higher is better)
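
The page does not spell out how the metric is computed. As a minimal sketch, assuming the usual definition (a prediction counts as correct when the true class is among the five highest-scoring classes), top-5 accuracy could be computed as below. The function name `top_k_accuracy` and the toy scores are hypothetical and are not taken from any of the listed papers or their evaluation code.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Return the percentage of samples whose true label is among the
    k highest-scoring classes (assumed standard definition)."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return 100.0 * hits / len(labels)


# Toy example with 6 classes and 2 clips (hypothetical scores).
toy_scores = [
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.0],  # true class 5 is ranked last: miss
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.0],  # true class 0 is ranked fifth: hit
]
toy_labels = [5, 0]
print(top_k_accuracy(toy_scores, toy_labels))
```

The result is expressed as a percentage so that it is on the same scale as the leaderboard values; individual papers may differ in details such as how many clips or crops per video are averaged before ranking.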


Results

| # | Model | Top 5 Accuracy | Extra Data | Paper | Date | Code |
|---|-------|----------------|------------|-------|------|------|
| 1 | UMT-L (ViT-L/16) | 78.2 | Yes | Unmasked Teacher: Towards Training-Efficient Vid... | 2023-03-28 | Code |
| 2 | UniFormerV2-L | 76.9 | Yes | - | - | Code |
| 3 | MTV-H (WTS 60M) | 75.7 | Yes | Multiview Transformers for Video Recognition | 2022-01-12 | Code |
| 4 | CoVeR(JFT-3B) | 75.4 | Yes | Co-training Transformer with Videos and Images I... | 2021-12-14 | - |
| 5 | CoVeR(JFT-300M) | 73.9 | Yes | Co-training Transformer with Videos and Images I... | 2021-12-14 | - |
| 6 | VATT-Large | 67.7 | Yes | VATT: Transformers for Multimodal Self-Supervise... | 2021-04-22 | Code |
| 7 | VTN | 65.4 | Yes | Video Transformer Network | 2021-02-01 | Code |
| 8 | ViViT-L/16x2 | 64.9 | Yes | ViViT: A Video Vision Transformer | 2021-03-29 | Code |
| 9 | MBT (AV) | 61.2 | No | Attention Bottlenecks for Multimodal Fusion | 2021-06-30 | Code |
| 10 | SRTG r3d-101 | 58.49 | No | Learn to cycle: Time-consistent feature discover... | 2020-06-15 | Code |
| 11 | SRTG r(2+1)d-50 | 56.8 | No | Learn to cycle: Time-consistent feature discover... | 2020-06-15 | Code |
| 12 | SRTG r3d-50 | 55.65 | No | Learn to cycle: Time-consistent feature discover... | 2020-06-15 | Code |
| 13 | SRTG r(2+1)d-34 | 54.18 | No | Learn to cycle: Time-consistent feature discover... | 2020-06-15 | Code |
| 14 | TRN-Multiscale | 53.87 | No | Moments in Time Dataset: one million videos for ... | 2018-01-09 | Code |
| 15 | SRTG r3d-34 | 52.35 | No | Learn to cycle: Time-consistent feature discover... | 2020-06-15 | Code |