Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Entailment as Few-Shot Learner

Sinong Wang, Han Fang, Madian Khabsa, Hanzi Mao, Hao Ma

2021-04-29

Tasks: Subjectivity Analysis, Question Answering, Few-Shot Learning, Topic Classification, Paraphrase Identification, Sentiment Analysis, Natural Language Inference, Data Augmentation, Semantic Textual Similarity, Linguistic Acceptability, Contrastive Learning

Abstract

Large pre-trained language models (LMs) have demonstrated remarkable ability as few-shot learners. However, their success hinges largely on scaling model parameters to a degree that makes them challenging to train and serve. In this paper, we propose a new approach, named EFL, that can turn small LMs into better few-shot learners. The key idea of this approach is to reformulate a given NLP task as an entailment task, and then fine-tune the model with as few as 8 examples. We further demonstrate that our proposed method can be: (i) naturally combined with an unsupervised contrastive-learning-based data augmentation method; (ii) easily extended to multilingual few-shot learning. A systematic evaluation on 18 standard NLP tasks demonstrates that this approach improves on various existing state-of-the-art few-shot learning methods by 12%, and yields few-shot performance competitive with models 500 times larger, such as GPT-3.
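
To make the reformulation concrete, here is a minimal sketch of the core idea: a classification label is scored by asking whether the input entails a natural-language description of that label. It assumes the Hugging Face transformers library and the off-the-shelf roberta-large-mnli checkpoint; the label templates and model choice are illustrative, not the paper's exact setup.

```python
# Sketch of the EFL idea: cast classification as entailment scoring.
# Assumes Hugging Face `transformers` and the `roberta-large-mnli`
# checkpoint; templates below are hypothetical, not the authors' code.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")
model.eval()

def entailment_score(premise: str, hypothesis: str) -> float:
    """Return the probability that `premise` entails `hypothesis`."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    # roberta-large-mnli label order: 0=contradiction, 1=neutral, 2=entailment
    return logits.softmax(dim=-1)[0, 2].item()

# Sentiment analysis reformulated as entailment: one hypothesis per label.
review = "The acting was superb and the plot kept me hooked."
label_descriptions = {  # illustrative templates
    "positive": "This is a great movie.",
    "negative": "This is a terrible movie.",
}
scores = {lbl: entailment_score(review, h) for lbl, h in label_descriptions.items()}
print(max(scores, key=scores.get))  # expected: "positive"
```

The same mechanism extends to few-shot fine-tuning: instead of using the NLI checkpoint zero-shot, the entailment head is fine-tuned on a handful of labeled examples converted into entailment pairs.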

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Question Answering | BoolQ | Accuracy | 86 | RoBERTa-large 355M + Entailment as Few-shot Learner |
| Natural Language Inference | SNLI | % Test Accuracy | 93.1 | EFL (Entailment as Few-shot Learner) + RoBERTa-large |
| Natural Language Inference | SNLI | Parameters | 355M | EFL (Entailment as Few-shot Learner) + RoBERTa-large |
| Semantic Textual Similarity | MRPC | F1 | 91 | RoBERTa-large 355M + Entailment as Few-shot Learner |
| Semantic Textual Similarity | STS Benchmark | Pearson Correlation | 0.918 | RoBERTa-large 355M + Entailment as Few-shot Learner |
| Semantic Textual Similarity | Quora Question Pairs | F1 | 89.2 | RoBERTa-large 355M + Entailment as Few-shot Learner |
| Sentiment Analysis | CR | Accuracy | 92.5 | RoBERTa-large 355M + Entailment as Few-shot Learner |
| Sentiment Analysis | MR | Accuracy | 92.5 | RoBERTa-large 355M + Entailment as Few-shot Learner |
| Sentiment Analysis | SST-2 Binary classification | Accuracy | 96.9 | RoBERTa-large 355M + Entailment as Few-shot Learner |
| Sentiment Analysis | IMDb | Accuracy | 96.1 | RoBERTa-large 355M + Entailment as Few-shot Learner |
| Sentiment Analysis | MPQA | Accuracy | 90.8 | RoBERTa-large 355M + Entailment as Few-shot Learner |
| Subjectivity Analysis | SUBJ | Accuracy | 97.1 | RoBERTa-large 355M + Entailment as Few-shot Learner |
| Paraphrase Identification | Quora Question Pairs | F1 | 89.2 | RoBERTa-large 355M + Entailment as Few-shot Learner |
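
The results above come from fine-tuning RoBERTa-large on tasks reformulated as entailment. Below is a hedged sketch of how a handful of labeled examples can be expanded into entailment fine-tuning pairs under that reformulation; the label descriptions are hypothetical, not the paper's exact templates.

```python
# Sketch: turn K labeled examples into entailment fine-tuning triples.
# The gold label's description becomes an "entailment" pair (label 1),
# every other label a "not entailment" pair (label 0).
from itertools import chain

LABEL_DESCRIPTIONS = {  # illustrative; the paper hand-designs one per task
    "positive": "This is a great movie.",
    "negative": "This is a terrible movie.",
}

def to_entailment_pairs(text: str, gold_label: str):
    """Yield (premise, hypothesis, label) triples; 1 = entail, 0 = not."""
    for label, description in LABEL_DESCRIPTIONS.items():
        yield (text, description, int(label == gold_label))

few_shot = [
    ("A waste of two hours.", "negative"),
    ("An instant classic.", "positive"),
]
train_pairs = list(chain.from_iterable(to_entailment_pairs(t, y) for t, y in few_shot))
print(train_pairs)
# Fine-tune an entailment model (e.g. RoBERTa-large) on these triples as usual.
```

Each labeled example yields one pair per class, so 8 examples per class over two classes produce 32 entailment pairs to fine-tune on.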

Related Papers

- From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
- Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
- Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
- City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
- GLAD: Generalizable Tuning for Vision-Language Models (2025-07-17)
- AdaptiSent: Context-Aware Adaptive Attention for Multimodal Aspect-Based Sentiment Analysis (2025-07-17)
- Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management (2025-07-17)
- Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images (2025-07-17)