Description
Charformer is a Transformer variant that learns a subword tokenization end-to-end as part of the model. Specifically, it uses a Gradient-Based Subword Tokenization (GBST) module that learns latent subword representations directly from characters in a data-driven fashion. The resulting soft subword sequence is then passed through standard Transformer layers.
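The core GBST idea can be sketched as follows: enumerate candidate subword blocks of several sizes by pooling consecutive character embeddings, score each candidate with a learned scorer, softmax the scores across block sizes at every position, and take the weighted mixture as the soft subword representation before downsampling. This is a minimal NumPy sketch of that idea, not the paper's implementation: the random projections stand in for learned parameters, and the non-overlapping mean-pooling, block sizes, and downsampling rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, d_model = 12, 8
block_sizes = (1, 2, 3, 4)   # candidate subword lengths (assumption)
downsample_rate = 2

char_embeds = rng.standard_normal((seq_len, d_model))  # stand-in character embeddings
score_proj = rng.standard_normal(d_model)              # stand-in for the learned block scorer

def pool_blocks(x, b):
    """Mean-pool non-overlapping blocks of size b, then upsample so
    every character position carries its block's pooled embedding."""
    n, d = x.shape
    pad = (-n) % b
    xp = np.pad(x, ((0, pad), (0, 0)))
    blocks = xp.reshape(-1, b, d).mean(axis=1)
    return np.repeat(blocks, b, axis=0)[:n]

# One candidate subword representation per block size: (B, L, d)
candidates = np.stack([pool_blocks(char_embeds, b) for b in block_sizes])

# Score each candidate at each position, softmax over block sizes.
scores = candidates @ score_proj                 # (B, L)
weights = np.exp(scores - scores.max(axis=0))
weights /= weights.sum(axis=0)                   # softmax over B

# Soft subword sequence: position-wise mixture of the block candidates.
latent = (weights[..., None] * candidates).sum(axis=0)  # (L, d)

# Downsample before the Transformer stack to shorten the sequence.
subwords = pool_blocks(latent, downsample_rate)[::downsample_rate]
print(subwords.shape)  # (6, 8): seq_len / downsample_rate positions
```

Because the block scores enter through a softmax, gradients flow to the scorer, so the tokenization itself is trained jointly with the downstream Transformer.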
Papers Using This Method
- CharFormer: A Glyph Fusion based Attentive Framework for High-precision Character Image Denoising (2022-07-16)
- Patching Leaks in the Charformer for Efficient Character-Level Generation (2022-05-27)
- A New Generation of Perspective API: Efficient Multilingual Character-level Transformers (2022-02-22)
- Patching Leaks in the Charformer for Generative Tasks (2022-01-16)
- Charformer: Fast Character Transformers via Gradient-based Subword Tokenization (2021-06-23)