CoordViT
Reported on 2 benchmarks across 2 tasks
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Audio: 1 result
- Accuracy: 82.96 (best: 94.07, Vertically long patch ViT)
Speech: 1 result
- Accuracy: 82.96 (best: 94.07, Vertically long patch ViT)