Augmented SBERT

Natural Language Processing | Introduced 2000 | 1 paper

Description

Augmented SBERT is a data augmentation strategy for pairwise sentence scoring that uses a BERT cross-encoder to improve the performance of SBERT bi-encoders. Given a pre-trained, well-performing cross-encoder, we sample sentence pairs according to a chosen sampling strategy and label them with the cross-encoder. We call these weakly labeled examples the silver dataset, and we merge them with the gold training dataset. We then train the bi-encoder on this extended training dataset.
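The data flow described above (sample pairs, label them with the cross-encoder, merge with gold) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `toy_cross_encoder` is a token-overlap stand-in rather than a real BERT cross-encoder, random sampling is only one possible sampling strategy, and `build_extended_dataset` is a hypothetical helper. Training the bi-encoder on the result is left out.

```python
import itertools
import random
from typing import Callable, List, Tuple

Pair = Tuple[str, str]
LabeledPair = Tuple[str, str, float]


def toy_cross_encoder(a: str, b: str) -> float:
    """Stand-in scorer (Jaccard token overlap). NOT a real BERT cross-encoder."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def sample_pairs(sentences: List[str], k: int, seed: int = 0) -> List[Pair]:
    """Randomly sample up to k unordered sentence pairs (one possible strategy)."""
    candidates = list(itertools.combinations(sentences, 2))
    rng = random.Random(seed)
    return rng.sample(candidates, min(k, len(candidates)))


def build_extended_dataset(
    gold: List[LabeledPair],
    sentences: List[str],
    score: Callable[[str, str], float],
    k: int,
) -> List[LabeledPair]:
    """Label sampled pairs with the scorer (silver) and append them to gold."""
    # Skip pairs that already carry a gold label.
    gold_keys = {frozenset((a, b)) for a, b, _ in gold}
    new_pairs = [p for p in sample_pairs(sentences, k) if frozenset(p) not in gold_keys]
    silver = [(a, b, score(a, b)) for a, b in new_pairs]
    return gold + silver


# Toy data, invented for this example.
gold = [("A man is eating food.", "A man eats something.", 0.8)]
sentences = [
    "A man is eating food.",
    "A man eats something.",
    "A woman plays guitar.",
    "Someone plays a guitar.",
]
extended = build_extended_dataset(gold, sentences, toy_cross_encoder, k=6)
```

In a real setup, the scorer would be a fine-tuned cross-encoder, and `extended` would serve as the training set for the bi-encoder.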

Papers Using This Method