Metric: Video-mAP at IoU threshold 0.2 (higher is better)
| # | Model | Video-mAP 0.2 | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | HIT | 88.8 | No | Holistic Interaction Transformer Network for Act... | 2022-10-23 | Code |
| 2 | STAR/L | 88 | Yes | End-to-End Spatio-Temporal Action Localisation w... | 2023-04-24 | - |
| 3 | HISAN (ResNet-101 + FPN) | 82.3 | No | - | - | - |
| 4 | MOC | 81.8 | No | Actions as Moving Points | 2020-01-14 | Code |
| 5 | HISAN (VGG-16) | 80.42 | No | - | - | - |
| 6 | YOWO + LFB | 78.6 | No | You Only Watch Once: A Unified CNN Architecture ... | 2019-11-15 | Code |
| 7 | Two-in-one Two Stream | 78.48 | No | Dance with Flow: Two-in-One Stream Action Detect... | 2019-04-01 | Code |
| 8 | TACNet | 77.5 | No | TACNet: Transition-Aware Context Network for Spa... | 2019-05-31 | - |
| 9 | STEP | 76.6 | No | STEP: Spatio-Temporal Progressive Learning for V... | 2019-04-19 | Code |
| 10 | YOWO | 75.8 | No | You Only Watch Once: A Unified CNN Architecture ... | 2019-11-15 | Code |
| 11 | Two-in-one | 75.48 | No | Dance with Flow: Two-in-One Stream Action Detect... | 2019-04-01 | Code |
| 12 | T-CNN | 47.1 | No | Tube Convolutional Neural Network (T-CNN) for Ac... | 2017-03-30 | Code |
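
To make the "0.2" in the metric name concrete, here is a minimal sketch of how a predicted action tube might be matched against a ground-truth tube. It assumes the common convention that tube overlap is the temporal IoU of the frame ranges multiplied by the mean per-frame box IoU over the shared frames. The leaderboard itself does not state which evaluation code was used, so the names `Tube`, `box_iou`, `tube_iou`, and `is_true_positive`, as well as the choice of `>=` at the threshold, are illustrative assumptions rather than the benchmark's official protocol.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Tube = Dict[int, Box]                    # frame index -> box in that frame


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def tube_iou(pred: Tube, gt: Tube) -> float:
    """Assumed convention: temporal IoU times mean spatial IoU on shared frames."""
    shared = pred.keys() & gt.keys()
    if not shared:
        return 0.0
    all_frames = pred.keys() | gt.keys()
    temporal_iou = len(shared) / len(all_frames)
    spatial_iou = sum(box_iou(pred[f], gt[f]) for f in shared) / len(shared)
    return temporal_iou * spatial_iou


def is_true_positive(pred: Tube, gt: Tube, threshold: float = 0.2) -> bool:
    """A detection counts as correct if its tube IoU reaches the threshold."""
    return tube_iou(pred, gt) >= threshold


# Hypothetical tubes: ground truth spans frames 0-3, prediction spans frames 2-5.
gt_tube = {f: (0.0, 0.0, 10.0, 10.0) for f in range(0, 4)}
pred_tube = {f: (0.0, 0.0, 10.0, 5.0) for f in range(2, 6)}
overlap = tube_iou(pred_tube, gt_tube)
```

Under this convention a prediction must overlap the ground truth in both space and time to count: a tube with well-placed boxes but poor temporal alignment can still fall below the threshold. The mAP itself would then be the mean over classes of the average precision computed from these true/false positive decisions, which is omitted here.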