Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization

Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Tuo Zhao

Published: 2019-11-08 · ACL 2020

Tasks: Paraphrase Identification · Sentiment Analysis · Natural Language Inference · Natural Language Understanding · Transfer Learning · Semantic Textual Similarity · Linguistic Acceptability
Links: Paper · PDF · Code (official) · 5 community code implementations

Abstract

Transfer learning has fundamentally changed the landscape of natural language processing (NLP) research. Many existing state-of-the-art models are first pre-trained on a large text corpus and then fine-tuned on downstream tasks. However, due to limited data resources from downstream tasks and the extremely large capacity of pre-trained models, aggressive fine-tuning often causes the adapted model to overfit the data of downstream tasks and forget the knowledge of the pre-trained model. To address the above issue in a more principled manner, we propose a new computational framework for robust and efficient fine-tuning for pre-trained language models. Specifically, our proposed framework contains two important ingredients: 1. Smoothness-inducing regularization, which effectively manages the capacity of the model; 2. Bregman proximal point optimization, which is a class of trust-region methods and can prevent knowledge forgetting. Our experiments demonstrate that our proposed method achieves the state-of-the-art performance on multiple NLP benchmarks.
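The two ingredients named in the abstract can be sketched numerically. The following is a minimal NumPy illustration, not the paper's implementation: the toy linear classifier, the single random perturbation direction, and the symmetrized-KL divergence are stand-ins (SMART finds a worst-case perturbation of the Transformer's input embeddings via projected gradient ascent, and applies the Bregman proximal term at each optimization step).

```python
import numpy as np

def softmax(z):
    """Row-wise softmax."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sym_kl(p, q, eps=1e-12):
    """Symmetrized KL divergence between rows of two probability matrices."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q) + q * np.log(q / p)))

def predict(x, W):
    """Toy linear classifier standing in for a fine-tuned language model."""
    return softmax(x @ W)

def smoothness_penalty(x, W, epsilon=1e-3, seed=0):
    """Smoothness-inducing regularizer R_s: divergence between predictions
    on x and on a nearby perturbed input. (The paper maximizes over the
    perturbation; a single random direction of norm epsilon is used here.)"""
    rng = np.random.default_rng(seed)
    delta = rng.normal(size=x.shape)
    delta *= epsilon / (np.linalg.norm(delta) + 1e-12)
    return sym_kl(predict(x, W), predict(x + delta, W))

def bregman_proximal_penalty(x, W, W_prev):
    """Trust-region term: keep the new iterate's predictions close to the
    previous iterate's, discouraging forgetting of pre-trained knowledge."""
    return sym_kl(predict(x, W), predict(x, W_prev))
```

In training, both penalties would be added to the task loss with tunable weights; the proximal penalty is exactly zero when the parameters have not moved, and grows as the new model's predictions drift from the previous iterate's.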

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Natural Language Inference | AX | Accuracy | 53.1 | T5 |
| Natural Language Inference | SciTail | Dev Accuracy | 96.1 | MT-DNN-SMART_100%ofTrainingData |
| Natural Language Inference | SciTail | Dev Accuracy | 91.3 | MT-DNN-SMART_10%ofTrainingData |
| Natural Language Inference | SciTail | Dev Accuracy | 88.6 | MT-DNN-SMART_1%ofTrainingData |
| Natural Language Inference | SciTail | Dev Accuracy | 82.3 | MT-DNN-SMART_0.1%ofTrainingData |
| Natural Language Inference | SciTail | % Dev Accuracy | 96.6 | MT-DNN-SMARTLARGEv0 |
| Natural Language Inference | SciTail | % Test Accuracy | 95.2 | MT-DNN-SMARTLARGEv0 |
| Natural Language Inference | MNLI + SNLI + ANLI + FEVER | % Dev Accuracy | 57.1 | SMARTRoBERTa-LARGE |
| Natural Language Inference | MNLI + SNLI + ANLI + FEVER | % Test Accuracy | 57.1 | SMARTRoBERTa-LARGE |
| Natural Language Inference | SNLI | % Dev Accuracy | 92.6 | MT-DNN-SMARTLARGEv0 |
| Natural Language Inference | SNLI | % Test Accuracy | 91.7 | MT-DNN-SMARTLARGEv0 |
| Natural Language Inference | SNLI | Dev Accuracy | 91.6 | MT-DNN-SMART_100%ofTrainingData |
| Natural Language Inference | SNLI | Dev Accuracy | 88.7 | MT-DNN-SMART_10%ofTrainingData |
| Natural Language Inference | SNLI | Dev Accuracy | 86 | MT-DNN-SMART_1%ofTrainingData |
| Natural Language Inference | SNLI | Dev Accuracy | 82.7 | MT-DNN-SMART_0.1%ofTrainingData |
| Natural Language Inference | MultiNLI | Matched | 92 | T5 |
| Natural Language Inference | MultiNLI | Mismatched | 91.7 | T5 |
| Natural Language Inference | MultiNLI | Accuracy | 85.7 | MT-DNN-SMARTv0 |
| Natural Language Inference | MultiNLI | Accuracy | 85.7 | MT-DNN-SMART |
| Natural Language Inference | MultiNLI | Accuracy | 85.6 | SMART+BERT-BASE |
| Natural Language Inference | MultiNLI | Dev Matched | 91.1 | SMARTRoBERTa |
| Natural Language Inference | MultiNLI | Dev Mismatched | 91.3 | SMARTRoBERTa |
| Natural Language Inference | MultiNLI | Dev Matched | 85.6 | SMART-BERT |
| Natural Language Inference | MultiNLI | Dev Mismatched | 86 | SMART-BERT |
| Semantic Textual Similarity | MRPC | F1 | 91.7 | MT-DNN-SMART |
| Semantic Textual Similarity | STS Benchmark | Pearson Correlation | 0.929 | MT-DNN-SMART |
| Semantic Textual Similarity | STS Benchmark | Spearman Correlation | 0.925 | MT-DNN-SMART |
| Semantic Textual Similarity | STS Benchmark | Dev Pearson Correlation | 92.8 | SMARTRoBERTa |
| Semantic Textual Similarity | STS Benchmark | Dev Spearman Correlation | 92.6 | SMARTRoBERTa |
| Semantic Textual Similarity | STS Benchmark | Dev Pearson Correlation | 90 | SMART-BERT |
| Semantic Textual Similarity | STS Benchmark | Dev Spearman Correlation | 89.4 | SMART-BERT |
| Semantic Textual Similarity | Quora Question Pairs | F1 | 90.7 | ALICE |
| Semantic Textual Similarity | Quora Question Pairs | Accuracy | 74.8 | FreeLB |
| Semantic Textual Similarity | Quora Question Pairs | Dev Accuracy | 92.6 | FreeLB |
| Semantic Textual Similarity | Quora Question Pairs | Dev Accuracy | 91.5 | SMART-BERT |
| Semantic Textual Similarity | Quora Question Pairs | Dev F1 | 88.5 | SMART-BERT |
| Sentiment Analysis | SST-2 Binary classification | Accuracy | 97.5 | MT-DNN-SMART |
| Sentiment Analysis | SST-2 Binary classification | Accuracy | 93.6 | MT-DNN |
| Sentiment Analysis | SST-2 Binary classification | Accuracy | 93 | SMART+BERT-BASE |
| Sentiment Analysis | SST-2 Binary classification | Dev Accuracy | 96.9 | SMARTRoBERTa |
| Sentiment Analysis | SST-2 Binary classification | Dev Accuracy | 96.1 | SMART-MT-DNN |
| Sentiment Analysis | SST-2 Binary classification | Dev Accuracy | 93 | SMART-BERT |
| Paraphrase Identification | Quora Question Pairs | F1 | 90.7 | ALICE |
| Paraphrase Identification | Quora Question Pairs | Accuracy | 74.8 | FreeLB |
| Paraphrase Identification | Quora Question Pairs | Dev Accuracy | 92.6 | FreeLB |
| Paraphrase Identification | Quora Question Pairs | Dev Accuracy | 91.5 | SMART-BERT |
| Paraphrase Identification | Quora Question Pairs | Dev F1 | 88.5 | SMART-BERT |
| Natural Language Understanding | GLUE | Average | 89.9 | MT-DNN-SMART |

Related Papers

RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction (2025-07-18)
AdaptiSent: Context-Aware Adaptive Attention for Multimodal Aspect-Based Sentiment Analysis (2025-07-17)
Disentangling coincident cell events using deep transfer learning and compressive sensing (2025-07-17)
SemCSE: Semantic Contrastive Sentence Embeddings Using LLM-Generated Summaries For Scientific Abstracts (2025-07-17)
Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows (2025-07-16)
AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles (2025-07-15)
DCR: Quantifying Data Contamination in LLMs Evaluation (2025-07-15)
LRCTI: A Large Language Model-Based Framework for Multi-Step Evidence Retrieval and Reasoning in Cyber Threat Intelligence Credibility Verification (2025-07-15)