Metric: Accuracy (higher is better)
| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | LTG-BERT-base 98M | 82.7 | No | Not all layers are equally as important: Every L... | 2023-11-03 | - |
| 2 | ELC-BERT-base 98M | 82.6 | No | Not all layers are equally as important: Every L... | 2023-11-03 | - |
| 3 | LTG-BERT-small 24M | 77.6 | No | Not all layers are equally as important: Every L... | 2023-11-03 | - |
| 4 | ELC-BERT-small 24M | 76.1 | No | Not all layers are equally as important: Every L... | 2023-11-03 | - |
| 5 | PSQ (Chen et al., 2020) | 67.5 | No | A Statistical Framework for Low-bitwidth Trainin... | 2020-10-27 | Code |
| 6 | Q-BERT (Shen et al., 2020) | 65.1 | No | Q-BERT: Hessian Based Ultra Low Precision Quanti... | 2019-09-12 | - |
| 7 | Q8BERT (Zafrir et al., 2019) | 65 | No | Q8BERT: Quantized 8Bit BERT | 2019-10-14 | Code |
| 8 | 24hBERT | 57.1 | No | How to Train BERT with an Academic Budget | 2021-04-15 | Code |