Transformer local-attention (NesT-B)

Reported on 3 benchmarks across 1 task · 1 paper

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Computer Vision · 3 results

Image Classification on CIFAR-10
Percentage correct · 2021-05-26
97.2
best: 99.5 (ViT-H/14)
Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding arXiv:2105.12723
Image Classification on CIFAR-100
Percentage correct · uses extra data · 2021-05-26
82.56
best: 96.08 (EffNet-L2 (SAM))
Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding arXiv:2105.12723
Image Classification on ImageNet
GFLOPs · 2021-05-26
17.9
best: 1478 (InternImage-H)
Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding arXiv:2105.12723