
ZeroGen: Efficient Zero-shot Learning via Dataset Generation

Jiacheng Ye, Jiahui Gao, Qintong Li, Hang Xu, Jiangtao Feng, Zhiyong Wu, Tao Yu, Lingpeng Kong

2022-02-16 · Text Classification · Question Answering · Text Generation · Natural Language Inference · Data-free Knowledge Distillation · Knowledge Distillation · Zero-Shot Learning

Paper · PDF · Code (official)

Abstract

There is growing interest in dataset generation due to the superior generative capacity of large pre-trained language models (PLMs). In this paper, we study a flexible and efficient zero-shot learning method, ZeroGen. Given a zero-shot task, we first generate a dataset from scratch using PLMs in an unsupervised manner. Then, we train a tiny task model (e.g., an LSTM) under the supervision of the synthesized dataset. This approach allows highly efficient inference, as the final task model has orders of magnitude fewer parameters compared to PLMs (e.g., GPT2-XL). Apart from being annotation-free and efficient, we argue that ZeroGen can also provide useful insights from the perspectives of data-free, model-agnostic knowledge distillation and unreferenced text generation evaluation. Experiments and analysis on different NLP tasks, namely text classification, question answering, and natural language inference, show the effectiveness of ZeroGen.
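
The pipeline described in the abstract can be summarized in two steps: (1) condition a PLM on a label-specific prompt to synthesize pseudo-labeled examples, and (2) train a small task model on that synthetic dataset as if it were human-annotated. The following is a minimal sketch of that idea, assuming the Hugging Face transformers library; the prompt template, label set, and model choice are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of the ZeroGen-style recipe: synthesize a labeled dataset with a PLM,
# then hand it to a small task model for supervised training.
# Assumes the Hugging Face `transformers` library. The sentiment task, prompt
# wording, and GPT-2 model size are hypothetical placeholders (the paper uses
# larger PLMs such as GPT2-XL and task-specific prompts).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

LABELS = ["positive", "negative"]  # hypothetical binary sentiment labels

def synthesize(examples_per_label=100):
    """Generate pseudo-labeled texts by conditioning the PLM on each label."""
    dataset = []
    for label in LABELS:
        prompt = f"The movie review in {label} sentiment is:"
        outputs = generator(
            prompt,
            max_new_tokens=40,
            do_sample=True,
            num_return_sequences=examples_per_label,
        )
        for out in outputs:
            # Keep only the continuation as the synthetic input text.
            text = out["generated_text"][len(prompt):].strip()
            dataset.append((text, label))
    return dataset

synthetic_data = synthesize()
# A tiny task model (e.g., an LSTM classifier) would then be trained on
# `synthetic_data` exactly as it would be on a human-annotated corpus,
# enabling cheap inference without the PLM at test time.
```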

Results

Task | Dataset | Metric | Value | Model
Knowledge Distillation | SQuAD | Exact Match | 69.4 | ZeroGen (T5-base)
Knowledge Distillation | QNLI | Accuracy | 88.5 | ZeroGen (T5-base)
Data-free Knowledge Distillation | SQuAD | Exact Match | 69.4 | ZeroGen (T5-base)
Data-free Knowledge Distillation | QNLI | Accuracy | 88.5 | ZeroGen (T5-base)

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
Uncertainty-Aware Cross-Modal Knowledge Distillation with Prototype Learning for Multimodal Brain-Computer Interfaces (2025-07-17)
GLAD: Generalizable Tuning for Vision-Language Models (2025-07-17)