Metric: Video hit@5 (higher is better)
| # | Model | Video hit@5 | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | ip-CSN-152 (RGB) | 92.8 | No | Video Classification with Channel-Separated Conv... | 2019-04-04 | Code |
| 2 | ip-CSN-101 (RGB) | 92.6 | No | Video Classification with Channel-Separated Conv... | 2019-04-04 | Code |
| 3 | G-Blend | 92.4 | No | What Makes Training Multi-Modal Classification N... | 2019-05-29 | Code |
| 4 | R(2+1)D-Two-Stream-32frame | 91.9 | No | A Closer Look at Spatiotemporal Convolutions for... | 2017-11-30 | Code |
| 5 | R(2+1)D-RGB-32frame | 91.5 | No | A Closer Look at Spatiotemporal Convolutions for... | 2017-11-30 | Code |
| 6 | Conv pooling | 90.4 | No | Beyond Short Snippets: Deep Networks for Video C... | 2015-03-31 | Code |
| 7 | R(2+1)D-Flow-32frame | 88.7 | No | A Closer Look at Spatiotemporal Convolutions for... | 2017-11-30 | Code |
| 8 | P3D | 87.4 | No | Learning Spatio-Temporal Representation with Pse... | 2017-11-28 | Code |
| 9 | LSTM + Pretrained on YT-8M | 86.2 | No | YouTube-8M: A Large-Scale Video Classification B... | 2016-09-27 | Code |
| 10 | C3D | 85.5 | No | Learning Spatiotemporal Features with 3D Convolu... | 2014-12-02 | Code |
| 11 | DeepVideo’s Slow Fusion | 80.2 | No | - | - | Code |
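
The table does not define its metric. As a rough sketch, assuming the common definition of hit@k (the percentage of videos for which at least one ground-truth label appears among the k highest-scoring predicted classes), the score could be computed as below. The function name `hit_at_k`, its parameters, and the toy data are hypothetical and are not taken from any of the listed papers; how each paper aggregates clip-level scores into one video-level score is not covered here.

```python
from typing import Sequence, Set


def hit_at_k(
    scores: Sequence[Sequence[float]],
    labels: Sequence[Set[int]],
    k: int = 5,
) -> float:
    """Return the percentage of videos with a ground-truth label in the top-k classes.

    scores[i][c] is the video-level score of class c for video i (assumed input layout).
    labels[i] is the set of ground-truth class indices for video i.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one video is required")

    hits = 0
    for video_scores, video_labels in zip(scores, labels):
        # Class indices ordered from highest to lowest score, truncated to k.
        ranked = sorted(
            range(len(video_scores)),
            key=lambda c: video_scores[c],
            reverse=True,
        )
        top_k = ranked[:k]
        # A video counts as a hit if any true label is among the top-k classes.
        if video_labels.intersection(top_k):
            hits += 1

    return 100.0 * hits / len(scores)


# Toy example with 2 videos and 6 classes (made-up numbers).
toy_scores = [
    [0.10, 0.20, 0.30, 0.05, 0.15, 0.20],  # true class 3 has the lowest score
    [0.90, 0.02, 0.02, 0.02, 0.02, 0.02],  # true class 0 has the highest score
]
toy_labels = [{3}, {0}]
toy_result = hit_at_k(toy_scores, toy_labels, k=5)
```

In the toy data the first video misses (its only true class is ranked sixth of six) and the second hits, so one of two videos counts as a hit.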