Description
DeepViT is a type of vision transformer that replaces the self-attention layer within the transformer block with a Re-attention module to address the issue of attention collapse and enables training deeper ViTs.
DeepViT is a type of vision transformer that replaces the self-attention layer within the transformer block with a Re-attention module to address the issue of attention collapse and enables training deeper ViTs.