Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

MassSpecGym

MassSpecGym: A benchmark for the discovery and identification of molecules

Biology · MIT · Introduced 2024-10-30

MassSpecGym provides three challenges for benchmarking the discovery and identification of new molecules from MS/MS spectra:

  • 💥 De novo molecule generation (MS/MS spectrum → molecular structure)
    • ✨ Bonus chemical formulae challenge (MS/MS spectrum + chemical formula → molecular structure)
  • 💥 Molecule retrieval (MS/MS spectrum → ranked list of candidate molecular structures)
    • ✨ Bonus chemical formulae challenge (MS/MS spectrum → ranked list of candidate molecular structures with ground-truth chemical formulae)
  • 💥 Spectrum simulation (molecular structure → MS/MS spectrum)
    • ✨ Bonus chemical formulae challenge (molecular structure → MS/MS spectrum; evaluated on the retrieval of molecular structures with ground-truth chemical formulae)
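
The three input/output mappings above can be written down as typed interfaces. This is only an illustrative sketch: the names `Spectrum`, `Molecule`, `DeNovoGenerator`, `Retriever`, `Simulator`, and `InputOrderRetriever` are hypothetical and are not taken from the MassSpecGym code base, and representing a molecule as a SMILES string is an assumption.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple

# Hypothetical types, for illustration only.
Molecule = str  # assumed here to be a SMILES string


@dataclass(frozen=True)
class Spectrum:
    """An MS/MS spectrum as a tuple of (m/z, intensity) peaks."""
    peaks: Tuple[Tuple[float, float], ...]


class DeNovoGenerator(Protocol):
    """De novo generation: MS/MS spectrum -> molecular structures."""
    def generate(self, spectrum: Spectrum, k: int) -> List[Molecule]: ...


class Retriever(Protocol):
    """Retrieval: MS/MS spectrum -> ranked list of candidate structures."""
    def rank(self, spectrum: Spectrum, candidates: List[Molecule]) -> List[Molecule]: ...


class Simulator(Protocol):
    """Simulation: molecular structure -> MS/MS spectrum."""
    def simulate(self, molecule: Molecule) -> Spectrum: ...


class InputOrderRetriever:
    """Trivial placeholder that returns the candidates in their given order."""
    def rank(self, spectrum: Spectrum, candidates: List[Molecule]) -> List[Molecule]:
        return list(candidates)
```

A real model would replace `InputOrderRetriever` with something that actually scores each candidate against the spectrum; the placeholder only shows the expected shape of the inputs and outputs.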

The provided challenges abstract the process of scientific discovery from biological and environmental samples into well-defined machine learning problems with pre-defined datasets, data splits, and evaluation metrics.
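
As an example of what a pre-defined evaluation metric looks like in practice, the retrieval challenges report hit rate @ k. A minimal sketch follows, assuming hit rate @ k means the fraction of query spectra whose true molecule appears among the top k ranked candidates. The function name and the toy identifiers are hypothetical, and the benchmark's own implementation may differ in details such as how molecules are matched.

```python
from typing import Sequence


def hit_rate_at_k(
    ranked_candidates: Sequence[Sequence[str]],
    true_ids: Sequence[str],
    k: int,
) -> float:
    """Fraction of queries whose true molecule is within the top-k candidates."""
    if len(ranked_candidates) != len(true_ids):
        raise ValueError("Need exactly one ranked list per ground-truth molecule.")
    if not true_ids:
        return 0.0
    hits = sum(
        1
        for ranked, truth in zip(ranked_candidates, true_ids)
        if truth in ranked[:k]
    )
    return hits / len(true_ids)


# Toy data with made-up molecule identifiers (not from the dataset).
rankings = [["A", "B", "C"], ["D", "E", "F"], ["G", "H", "I"]]
truths = ["A", "E", "Z"]

hit_at_1 = hit_rate_at_k(rankings, truths, k=1)  # only the first query hits
hit_at_2 = hit_rate_at_k(rankings, truths, k=2)  # first and second queries hit
```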

Benchmarks

  • De novo molecule generation from MS/MS spectrum: Top-1 Accuracy, Top-1 MCES, Top-1 Tanimoto, Top-10 Accuracy, Top-10 MCES, Top-10 Tanimoto
  • De novo molecule generation from MS/MS spectrum (bonus chemical formulae): Top-1 Accuracy, Top-1 MCES, Top-1 Tanimoto, Top-10 Accuracy, Top-10 MCES, Top-10 Tanimoto
  • MS/MS spectrum simulation: Cosine Similarity, Jensen-Shannon Similarity, Hit Rate @ 1, Hit Rate @ 5, Hit Rate @ 20
  • MS/MS spectrum simulation (bonus chemical formulae): Hit Rate @ 1, Hit Rate @ 5, Hit Rate @ 20
  • Molecule retrieval from MS/MS spectrum: Hit rate @ 1, Hit rate @ 5, Hit rate @ 20, MCES @ 1
  • Molecule retrieval from MS/MS spectrum (bonus chemical formulae): Hit rate @ 1, Hit rate @ 5, Hit rate @ 20, MCES @ 1
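
Two of the similarity measures named in the benchmarks, cosine similarity between spectra and Tanimoto similarity between molecules, can be sketched in a few lines. The code below uses their textbook definitions under stated assumptions: spectra are binned by m/z with a hypothetical `bin_width`, and molecules are represented as sets of "on" fingerprint bits supplied by the caller. The binning scheme, any intensity transform, and the fingerprint type used by the benchmark itself are not specified here and may differ.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def bin_spectrum(peaks: Iterable[Peak], bin_width: float = 1.0) -> Dict[int, float]:
    """Sum peak intensities into fixed-width m/z bins (bin_width is an assumption)."""
    binned: Dict[int, float] = defaultdict(float)
    for mz, intensity in peaks:
        binned[math.floor(mz / bin_width)] += intensity
    return dict(binned)


def cosine_similarity(a: Dict[int, float], b: Dict[int, float]) -> float:
    """Cosine of the angle between two sparse intensity vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def tanimoto(bits_a: Set[int], bits_b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity: |A ∩ B| / |A ∪ B| over fingerprint bits."""
    union = len(bits_a | bits_b)
    # Returning 0.0 for two empty fingerprints is a convention chosen for this sketch.
    return len(bits_a & bits_b) / union if union else 0.0


# Toy spectra with made-up peaks; both fall into bins 100 and 150.
predicted = bin_spectrum([(100.2, 1.0), (150.7, 0.5)])
measured = bin_spectrum([(100.4, 1.0), (150.9, 0.5)])
spectrum_score = cosine_similarity(predicted, measured)

# Toy fingerprints with made-up bit indices: 2 shared bits out of 4 in total.
molecule_score = tanimoto({1, 2, 3}, {2, 3, 4})
```

In practice the fingerprint bit sets would come from a cheminformatics toolkit rather than being written by hand.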

Statistics

Papers: 4
Benchmarks: 28

Links

Homepage

Tasks

  • De novo molecule generation from MS/MS spectrum
  • De novo molecule generation from MS/MS spectrum (bonus chemical formulae)
  • MS/MS spectrum simulation
  • MS/MS spectrum simulation (bonus chemical formulae)
  • Molecule retrieval from MS/MS spectrum
  • Molecule retrieval from MS/MS spectrum (bonus chemical formulae)