BinaryBERT

Natural Language Processing · Introduced 2000 · 1 paper

Description

BinaryBERT is a BERT variant that quantizes the model by binarizing its weights. Its key technique is ternary weight splitting, which initializes BinaryBERT from a half-sized ternary network by splitting each ternary weight into binary weights so that the resulting network is equivalent to the one it was split from. Training proceeds in three steps. First, a half-sized ternary BERT model is trained. Second, a ternary weight splitting operator is applied to it, producing both the latent full-precision weights and the quantized weights that initialize the full-sized BinaryBERT. Third, BinaryBERT is fine-tuned for further refinement.
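The splitting idea can be illustrated with a minimal NumPy sketch. It is a simplification under stated assumptions, not the operator from the paper: it splits only the quantized ternary weights (values in {-alpha, 0, +alpha}) into two binary matrices with values in {-alpha/2, +alpha/2}, and it does not construct the latent full-precision weights. The function names `ternarize` and `split_ternary`, and the 0.7 threshold ratio used for ternarization, are hypothetical choices made for illustration.

```python
import numpy as np

def ternarize(w, threshold_ratio=0.7):
    """Map w to {-alpha, 0, +alpha}. The thresholding rule is an assumption."""
    delta = threshold_ratio * np.mean(np.abs(w))
    mask = np.abs(w) > delta                      # weights that stay non-zero
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return alpha * np.sign(w) * mask, alpha

def split_ternary(w_hat, alpha):
    """Split a ternary matrix into two binary matrices whose sum equals it."""
    half = alpha / 2.0
    # +alpha -> (+half, +half), -alpha -> (-half, -half), 0 -> (+half, -half)
    b1 = np.where(w_hat < 0, -half, half)
    b2 = np.where(w_hat > 0, half, -half)
    return b1, b2

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 3))                       # toy half-sized weight matrix
w_hat, alpha = ternarize(w)
b1, b2 = split_ternary(w_hat, alpha)

# The doubled-width binary layer reproduces the ternary layer's output
# when the input is duplicated: [b1 b2] @ [x; x] == (b1 + b2) @ x.
x = rng.normal(size=3)
y_ternary = w_hat @ x
y_binary = np.concatenate([b1, b2], axis=1) @ np.concatenate([x, x])
```

In this sketch every entry of `b1` and `b2` has magnitude `alpha / 2`, so each matrix is binary, and the two outputs `y_ternary` and `y_binary` agree up to floating-point error. That is the sense in which the larger binary network starts from the same function as the smaller ternary one before fine-tuning.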

Papers Using This Method