Single-Headed Attention

General | Introduced 2000 | 6 papers