
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

CLASS: Enhancing Cross-Modal Text-Molecule Retrieval Performance and Training Efficiency

Hongyan Wu, Peijian Zeng, Weixiong Zheng, Lianxi Wang, Nankai Lin, Shengyi Jiang, Aimin Yang

2025-02-17 | Cross-Modal Retrieval | Retrieval

Abstract

The cross-modal text-molecule retrieval task bridges molecular structures and natural language descriptions. Existing methods focus predominantly on aligning the text and molecule modalities, but they overlook two issues: adapting the learning state to different training stages, and improving training efficiency. To address these, this paper proposes a Curriculum Learning-bAsed croSS-modal text-molecule training framework (CLASS), which can be integrated with any backbone to yield performance improvements. Specifically, we quantify sample difficulty using both the text and molecule modalities, and design a sample scheduler that introduces training samples from easy to difficult as training advances. This substantially reduces the number of training samples in the early stages and improves training efficiency. Moreover, we introduce adaptive intensity learning, which increases the training intensity as training progresses and thereby controls the learning intensity adaptively across all curriculum stages. Experimental results on the ChEBI-20 dataset show that the proposed method achieves superior performance while also delivering notable time savings.
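The easy-to-difficult sample scheduler described in the abstract can be illustrated with a minimal sketch. The abstract does not give the difficulty measure or the pacing rule, so everything here is an assumption: `curriculum_subset`, the `start_fraction` parameter, and the linear pacing schedule are hypothetical and are not taken from CLASS. The difficulty scores are treated as precomputed numbers where lower means easier.

```python
import math


def curriculum_subset(difficulties, epoch, total_epochs, start_fraction=0.5):
    """Return indices of the samples exposed at `epoch`, easiest first.

    Assumption: a linear pacing rule that grows the exposed share of the
    data from `start_fraction` at epoch 0 to 1.0 at the final epoch.
    This is an illustration only, not the schedule used by CLASS.
    """
    if total_epochs < 1:
        raise ValueError("total_epochs must be at least 1")
    if not 0.0 < start_fraction <= 1.0:
        raise ValueError("start_fraction must be in (0, 1]")

    # Rank samples from easiest (lowest score) to hardest (highest score).
    order = sorted(range(len(difficulties)), key=lambda i: difficulties[i])

    # Training progress in [0, 1]; a single-epoch run exposes the start share.
    progress = min(1.0, epoch / max(1, total_epochs - 1))
    fraction = start_fraction + (1.0 - start_fraction) * progress

    # Always keep at least one sample when any are available.
    keep = max(1, math.ceil(fraction * len(order)))
    return order[:keep]


if __name__ == "__main__":
    scores = [0.9, 0.1, 0.5, 0.3]  # hypothetical per-sample difficulty
    for epoch in range(4):
        print(epoch, curriculum_subset(scores, epoch, total_epochs=4))
```

Early epochs iterate over a smaller, easier subset, which is where the reduction in early-stage training cost mentioned in the abstract would come from under this kind of schedule.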

Results

Task                                    Dataset   Metric     Value  Model
Image Retrieval with Multi-Modal Query  ChEBI-20  Hits@1     67.4   CLASS (ORMA)
Image Retrieval with Multi-Modal Query  ChEBI-20  Hits@10    93.4   CLASS (ORMA)
Image Retrieval with Multi-Modal Query  ChEBI-20  Mean Rank  17.82  CLASS (ORMA)
Image Retrieval with Multi-Modal Query  ChEBI-20  Test MRR   77.4   CLASS (ORMA)
Image Retrieval with Multi-Modal Query  ChEBI-20  Hits@1     51.1   CLASS (AMAN)
Image Retrieval with Multi-Modal Query  ChEBI-20  Hits@10    92.6   CLASS (AMAN)
Image Retrieval with Multi-Modal Query  ChEBI-20  Mean Rank  16.8   CLASS (AMAN)
Image Retrieval with Multi-Modal Query  ChEBI-20  Test MRR   66     CLASS (AMAN)
Cross-Modal Information Retrieval       ChEBI-20  Hits@1     67.4   CLASS (ORMA)
Cross-Modal Information Retrieval       ChEBI-20  Hits@10    93.4   CLASS (ORMA)
Cross-Modal Information Retrieval       ChEBI-20  Mean Rank  17.82  CLASS (ORMA)
Cross-Modal Information Retrieval       ChEBI-20  Test MRR   77.4   CLASS (ORMA)
Cross-Modal Information Retrieval       ChEBI-20  Hits@1     51.1   CLASS (AMAN)
Cross-Modal Information Retrieval       ChEBI-20  Hits@10    92.6   CLASS (AMAN)
Cross-Modal Information Retrieval       ChEBI-20  Mean Rank  16.8   CLASS (AMAN)
Cross-Modal Information Retrieval       ChEBI-20  Test MRR   66     CLASS (AMAN)
Cross-Modal Retrieval                   ChEBI-20  Hits@1     67.4   CLASS (ORMA)
Cross-Modal Retrieval                   ChEBI-20  Hits@10    93.4   CLASS (ORMA)
Cross-Modal Retrieval                   ChEBI-20  Mean Rank  17.82  CLASS (ORMA)
Cross-Modal Retrieval                   ChEBI-20  Test MRR   77.4   CLASS (ORMA)
Cross-Modal Retrieval                   ChEBI-20  Hits@1     51.1   CLASS (AMAN)
Cross-Modal Retrieval                   ChEBI-20  Hits@10    92.6   CLASS (AMAN)
Cross-Modal Retrieval                   ChEBI-20  Mean Rank  16.8   CLASS (AMAN)
Cross-Modal Retrieval                   ChEBI-20  Test MRR   66     CLASS (AMAN)
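
For readers unfamiliar with the metrics in the table, the sketch below computes Hits@k, Mean Rank, and MRR from the rank of the correct item for each query. It assumes the standard definitions of these retrieval metrics; the page does not state how the reported values were computed, and `retrieval_metrics` is a hypothetical helper rather than part of any evaluation code for CLASS. The function returns fractions in [0, 1], whereas the Hits@k and MRR values in the table appear to be percentages.

```python
def retrieval_metrics(ranks, ks=(1, 10)):
    """Compute retrieval metrics from 1-based ranks of the correct items.

    ranks: for each query, the position of the ground-truth item in the
           ranked candidate list (1 means it was ranked first).
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")

    n = len(ranks)
    metrics = {
        # Average position of the correct item; lower is better.
        "mean_rank": sum(ranks) / n,
        # Mean reciprocal rank; higher is better.
        "mrr": sum(1.0 / r for r in ranks) / n,
    }
    for k in ks:
        # Share of queries whose correct item appears in the top k.
        metrics[f"hits@{k}"] = sum(1 for r in ranks if r <= k) / n
    return metrics


if __name__ == "__main__":
    example_ranks = [1, 2, 4, 20]  # hypothetical ranks for four queries
    for name, value in retrieval_metrics(example_ranks).items():
        print(f"{name}: {value:.4f}")
```

Note that Mean Rank is sensitive to a few badly ranked queries, which is why a model can have a better Mean Rank but a lower Hits@1 than another, as the ORMA and AMAN rows above show.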
