Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

EGraFFBench: Evaluation of Equivariant Graph Neural Network Force Fields for Atomistic Simulations

Vaibhav Bihani, Utkarsh Pratiush, Sajid Mannan, Tao Du, Zhimin Chen, Santiago Miret, Matthieu Micoulaut, Morten M Smedskjaer, Sayan Ranu, N M Anoop Krishnan

2023-10-03 · Benchmarking · Atomic Forces · Formation Energy

Abstract

Equivariant graph neural network force fields (EGraFFs) have shown great promise in modelling complex interactions in atomic systems by exploiting the graphs' inherent symmetries. Recent works have led to a surge in the development of novel architectures that incorporate equivariance-based inductive biases alongside architectural innovations like graph transformers and message passing to model atomic interactions. However, a thorough evaluation of deploying these EGraFFs for the downstream task of real-world atomistic simulations is lacking. To this end, here we perform a systematic benchmarking of 6 EGraFF algorithms (NequIP, Allegro, BOTNet, MACE, Equiformer, TorchMDNet), with the aim of understanding their capabilities and limitations for realistic atomistic simulations. In addition to our thorough evaluation and analysis on eight existing datasets based on the benchmarking literature, we release two new benchmark datasets and propose four new metrics and three challenging tasks. The new datasets and tasks evaluate the performance of EGraFFs on out-of-distribution data, in terms of different crystal structures, temperatures, and new molecules. Interestingly, evaluation of the EGraFF models based on dynamic simulations reveals that having a lower error on energy or force does not guarantee stable or reliable simulation or faithful replication of the atomic structures. Moreover, we find that no model clearly outperforms other models on all datasets and tasks. Importantly, we show that the performance of all the models on out-of-distribution datasets is unreliable, pointing to the need for the development of a foundation model for force fields that can be used in real-world simulations. In summary, this work establishes a rigorous framework for evaluating machine learning force fields in the context of atomic simulations and points to open research challenges within this domain.

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Formation Energy | Aspirin | MAE | 14.36 | Allegro |
| Formation Energy | Aspirin | MAE | 13.79 | MACE |
| Formation Energy | Aspirin | MAE | 12.63 | BOTNet |
| Formation Energy | Aspirin | MAE | 9.27 | NequIP |
| Formation Energy | LiPS20 | MAE | 33.17 | Allegro |
| Formation Energy | LiPS20 | MAE | 26.8 | NequIP |
| Formation Energy | LiPS20 | MAE | 24.59 | BOTNet |
| Formation Energy | LiPS20 | MAE | 14.05 | MACE |
| Formation Energy | Naphthalene | MAE | 182.55 | BOTNet |
| Formation Energy | Naphthalene | MAE | 161.74 | MACE |
| Formation Energy | Naphthalene | MAE | 5.82 | Allegro |
| Formation Energy | Naphthalene | MAE | 2.66 | NequIP |
| Formation Energy | Ethanol | MAE | 209.96 | MACE |
| Formation Energy | Ethanol | MAE | 203.83 | BOTNet |
| Formation Energy | Ethanol | MAE | 6.94 | Allegro |
| Formation Energy | Ethanol | MAE | 4.99 | NequIP |
| Formation Energy | 3BPA | MAE | 5 | BOTNet |
| Formation Energy | 3BPA | MAE | 4.13 | Allegro |
| Formation Energy | 3BPA | MAE | 4 | MACE |
| Formation Energy | 3BPA | MAE | 3.15 | NequIP |
| Formation Energy | Acetylacetone | MAE | 2 | BOTNet |
| Formation Energy | Acetylacetone | MAE | 2 | MACE |
| Formation Energy | Acetylacetone | MAE | 1.38 | NequIP |
| Formation Energy | Acetylacetone | MAE | 0.92 | Allegro |
| Formation Energy | GeTe | MAE | 3034 | BOTNet |
| Formation Energy | GeTe | MAE | 2670 | MACE |
| Formation Energy | GeTe | MAE | 1780.951 | NequIP |
| Formation Energy | GeTe | MAE | 1009.4 | Allegro |
| Formation Energy | Salicylic Acid | MAE | 165.29 | MACE |
| Formation Energy | Salicylic Acid | MAE | 153.06 | BOTNet |
| Formation Energy | Salicylic Acid | MAE | 8.59 | Allegro |
| Formation Energy | Salicylic Acid | MAE | 6.29 | NequIP |
| Formation Energy | LiPS | MAE | 165.43 | NequIP |
| Formation Energy | LiPS | MAE | 31.75 | Allegro |
| Formation Energy | LiPS | MAE | 30 | MACE |
| Formation Energy | LiPS | MAE | 28 | BOTNet |
| Atomistic Description | Aspirin | MAE | 14.36 | Allegro |
| Atomistic Description | Aspirin | MAE | 13.79 | MACE |
| Atomistic Description | Aspirin | MAE | 12.63 | BOTNet |
| Atomistic Description | Aspirin | MAE | 9.27 | NequIP |
| Atomistic Description | LiPS20 | MAE | 33.17 | Allegro |
| Atomistic Description | LiPS20 | MAE | 26.8 | NequIP |
| Atomistic Description | LiPS20 | MAE | 24.59 | BOTNet |
| Atomistic Description | LiPS20 | MAE | 14.05 | MACE |
| Atomistic Description | Naphthalene | MAE | 182.55 | BOTNet |
| Atomistic Description | Naphthalene | MAE | 161.74 | MACE |
| Atomistic Description | Naphthalene | MAE | 5.82 | Allegro |
| Atomistic Description | Naphthalene | MAE | 2.66 | NequIP |
| Atomistic Description | Ethanol | MAE | 209.96 | MACE |
| Atomistic Description | Ethanol | MAE | 203.83 | BOTNet |
| Atomistic Description | Ethanol | MAE | 6.94 | Allegro |
| Atomistic Description | Ethanol | MAE | 4.99 | NequIP |
| Atomistic Description | 3BPA | MAE | 5 | BOTNet |
| Atomistic Description | 3BPA | MAE | 4.13 | Allegro |
| Atomistic Description | 3BPA | MAE | 4 | MACE |
| Atomistic Description | 3BPA | MAE | 3.15 | NequIP |
| Atomistic Description | Acetylacetone | MAE | 2 | BOTNet |
| Atomistic Description | Acetylacetone | MAE | 2 | MACE |
| Atomistic Description | Acetylacetone | MAE | 1.38 | NequIP |
| Atomistic Description | Acetylacetone | MAE | 0.92 | Allegro |
| Atomistic Description | GeTe | MAE | 3034 | BOTNet |
| Atomistic Description | GeTe | MAE | 2670 | MACE |
| Atomistic Description | GeTe | MAE | 1780.951 | NequIP |
| Atomistic Description | GeTe | MAE | 1009.4 | Allegro |
| Atomistic Description | Salicylic Acid | MAE | 165.29 | MACE |
| Atomistic Description | Salicylic Acid | MAE | 153.06 | BOTNet |
| Atomistic Description | Salicylic Acid | MAE | 8.59 | Allegro |
| Atomistic Description | Salicylic Acid | MAE | 6.29 | NequIP |
| Atomistic Description | LiPS | MAE | 165.43 | NequIP |
| Atomistic Description | LiPS | MAE | 31.75 | Allegro |
| Atomistic Description | LiPS | MAE | 30 | MACE |
| Atomistic Description | LiPS | MAE | 28 | BOTNet |
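
One way to read the rows above against the abstract's claim that no model clearly outperforms the others is to tally which model has the lowest MAE on each dataset. The following is a minimal sketch, not part of the paper: the names `RESULTS`, `best_model_per_dataset`, and `win_counts` are hypothetical, and the values are copied from the listed rows, which cover only four of the six benchmarked models. Since both listed tasks carry identical values, one copy is used.

```python
from collections import Counter

# MAE values copied from the results rows above (units as reported on the page).
# Hypothetical container name; only the four models listed in the table appear.
RESULTS = {
    "Aspirin":        {"Allegro": 14.36, "MACE": 13.79, "BOTNet": 12.63, "NequIP": 9.27},
    "LiPS20":         {"Allegro": 33.17, "NequIP": 26.8, "BOTNet": 24.59, "MACE": 14.05},
    "Naphthalene":    {"BOTNet": 182.55, "MACE": 161.74, "Allegro": 5.82, "NequIP": 2.66},
    "Ethanol":        {"MACE": 209.96, "BOTNet": 203.83, "Allegro": 6.94, "NequIP": 4.99},
    "3BPA":           {"BOTNet": 5.0, "Allegro": 4.13, "MACE": 4.0, "NequIP": 3.15},
    "Acetylacetone":  {"BOTNet": 2.0, "MACE": 2.0, "NequIP": 1.38, "Allegro": 0.92},
    "GeTe":           {"BOTNet": 3034.0, "MACE": 2670.0, "NequIP": 1780.951, "Allegro": 1009.4},
    "Salicylic Acid": {"MACE": 165.29, "BOTNet": 153.06, "Allegro": 8.59, "NequIP": 6.29},
    "LiPS":           {"NequIP": 165.43, "Allegro": 31.75, "MACE": 30.0, "BOTNet": 28.0},
}


def best_model_per_dataset(results):
    """Return {dataset: (model, mae)} for the lowest-MAE model on each dataset."""
    return {
        dataset: min(scores.items(), key=lambda item: item[1])
        for dataset, scores in results.items()
    }


def win_counts(results):
    """Count how many datasets each model wins on (lowest MAE)."""
    return Counter(model for model, _ in best_model_per_dataset(results).values())


if __name__ == "__main__":
    for dataset, (model, mae) in best_model_per_dataset(RESULTS).items():
        print(f"{dataset:15s} {model:8s} {mae}")
    print(dict(win_counts(RESULTS)))
```

On these rows every one of the four models is best on at least one dataset, which is consistent with the abstract's statement. This is a single-metric view only; the abstract itself cautions that lower energy or force error does not guarantee stable or reliable simulations.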
