Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

ArcFace

Additive Angular Margin Loss

General · Introduced 2000 · 94 papers
Source Paper

Description

ArcFace, or Additive Angular Margin Loss, is a loss function used in face recognition tasks. Softmax loss is traditionally used in these tasks. However, softmax loss does not explicitly optimise the feature embedding to enforce higher similarity for intra-class samples and diversity for inter-class samples, which results in a performance gap for deep face recognition under large intra-class appearance variations.

The ArcFace loss rewrites the logits as $W_j^{T} x_i = \|W_j\| \, \|x_i\| \cos\theta_j$, where $\theta_j$ is the angle between the weight $W_j$ and the feature $x_i$. The norm of each individual weight is fixed to $\|W_j\| = 1$ by $l_2$ normalisation. The norm of the embedding feature $\|x_i\|$ is likewise fixed by $l_2$ normalisation and then re-scaled to $s$. Normalising both features and weights makes the predictions depend only on the angle between the feature and the weight. The learned embedding features are thus distributed on a hypersphere with a radius of $s$. Finally, an additive angular margin penalty $m$ is added between $x_i$ and $W_{y_i}$ to simultaneously enhance intra-class compactness and inter-class discrepancy. Since the proposed additive angular margin penalty is equal to the geodesic distance margin penalty on the normalised hypersphere, the method is named ArcFace:
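As a worked illustration of the margin step, consider the target-class logit before and after the penalty. The values $s = 64$, $m = 0.5$ rad and $\theta_{y_i} = \pi/3$ are assumptions chosen only for this example; they are not stated in the description above.

$$
\begin{aligned}
\text{without margin:} \quad & s\cos\theta_{y_i} = 64\cos\left(\tfrac{\pi}{3}\right) = 32 \\
\text{with margin:} \quad & s\cos\left(\theta_{y_i} + m\right) = 64\cos\left(\tfrac{\pi}{3} + 0.5\right) \approx 64 \times 0.0236 \approx 1.51
\end{aligned}
$$

Under these assumed values the target logit drops sharply, so the network can only recover the same score by shrinking $\theta_{y_i}$ further, which is the mechanism that pulls same-class features together.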

$$
L_3 = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos\left(\theta_{y_i} + m\right)}}{e^{s\cos\left(\theta_{y_i} + m\right)} + \sum_{j=1,\, j \neq y_i}^{n} e^{s\cos\theta_j}}
$$
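The loss above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function name `arcface_loss`, the argument layout, and the default values `s=64.0` and `m=0.5` are assumptions made for the example. It also omits the special handling some implementations add for the case $\theta_{y_i} + m > \pi$.

```python
import numpy as np

def arcface_loss(features, weights, labels, s=64.0, m=0.5):
    """Mean ArcFace loss over a batch (illustrative sketch).

    features: (N, d) array of embeddings x_i
    weights:  (n, d) array of class weights W_j, one row per class
    labels:   (N,) integer array of ground-truth classes y_i
    s, m:     scale and angular margin (default values are assumptions)
    """
    # l2-normalise features and weights so the dot product equals cos(theta_j)
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = np.clip(x @ w.T, -1.0, 1.0)  # shape (N, n); clip guards arccos

    # Add the margin m to the target-class angle only
    rows = np.arange(len(labels))
    theta_y = np.arccos(cos[rows, labels])
    logits = s * cos
    logits[rows, labels] = s * np.cos(theta_y + m)

    # Numerically stable log-softmax, then the negative log-likelihood
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[rows, labels].mean()

# Usage on random data: 8 classes, 16-dimensional embeddings (hypothetical sizes)
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))
X = rng.normal(size=(32, 16))
y = rng.integers(0, 8, size=32)
loss = arcface_loss(X, W, y)
```

Setting `m=0.0` reduces the sketch to ordinary softmax cross-entropy on the scaled cosine logits, which makes it easy to compare the two losses on the same data.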

The authors select face images from 8 different identities with enough samples (around 1,500 images per class) to train 2-D feature embedding networks with the softmax loss and the ArcFace loss, respectively. As the figure shows, the softmax loss yields a roughly separable feature embedding but leaves noticeable ambiguity at the decision boundaries, while the ArcFace loss enforces a more evident gap between the nearest classes.

Other alternatives to enforce intra-class compactness and inter-class distance include Supervised Contrastive Learning.

Papers Using This Method

Enhancing Few-shot Keyword Spotting Performance through Pre-Trained Self-supervised Speech Models (2025-06-21)
Towards Large-Scale Pose-Invariant Face Recognition Using Face Defrontalization (2025-06-04)
Accuracy and Fairness of Facial Recognition Technology in Low-Quality Police Images: An Experiment With Synthetic Faces (2025-05-20)
LaPIG: Cross-Modal Generation of Paired Thermal and Visible Facial Images (2025-03-20)
Universal Embedding Function for Traffic Classification via QUIC Domain Recognition Pretraining: A Transfer Learning Success (2025-02-18)
Omni-ID: Holistic Identity Representation Designed for Generative Tasks (2024-12-12)
Multispecies Animal Re-ID Using a Large Community-Curated Dataset (2024-12-07)
Pairwise Discernment of AffectNet Expressions with ArcFace (2024-12-01)
Hypersphere Secure Sketch Revisited: Probabilistic Linear Regression Attack on IronMask in Multiple Usage (2024-09-19)
HyperSpaceX: Radial and Angular Exploration of HyperSpherical Dimensions (2024-08-05)
Analyzing the Feature Extractor Networks for Face Image Synthesis (2024-06-04)
Deep Privacy Funnel Model: From a Discriminative to a Generative Approach with an Application to Face Recognition (2024-04-03)
Arc2Face: A Foundation Model for ID-Consistent Human Faces (2024-03-18)
VIGFace: Virtual Identity Generation for Privacy-Free Face Recognition (2024-03-13)
Mitigating the Impact of Attribute Editing on Face Recognition (2024-03-12)
X2-Softmax: Margin Adaptive Loss Function for Face Recognition (2023-12-08)
Improved Face Representation via Joint Label Classification and Supervised Contrastive Clustering (2023-12-07)
A Universal Anti-Spoofing Approach for Contactless Fingerprint Biometric Systems (2023-10-23)
An Empirical Study of Self-supervised Learning with Wasserstein Distance (2023-10-16)
Trading-off Mutual Information on Feature Aggregation for Face Recognition (2023-09-22)