Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information

Yen-Ting Lin, Alexandros Papangelis, Seokhwan Kim, Sungjin Lee, Devamanyu Hazarika, Mahdi Namazifar, Di Jin, Yang Liu, Dilek Hakkani-Tur

2023-02-10 | Text Classification | Intent Detection | Data Augmentation

Abstract

This work focuses on in-context data augmentation for intent detection. Having found that augmentation via in-context prompting of large pre-trained language models (PLMs) alone does not improve performance, we introduce a novel approach based on PLMs and pointwise V-information (PVI), a metric that can measure the usefulness of a datapoint for training a model. Our method first fine-tunes a PLM on a small seed of training data and then synthesizes new datapoints - utterances that correspond to given intents. It then employs intent-aware filtering, based on PVI, to remove datapoints that are not helpful to the downstream intent classifier. Our method is thus able to leverage the expressive power of large language models to produce diverse training data. Empirical results demonstrate that our method can produce synthetic training data that achieve state-of-the-art performance on three challenging intent detection datasets under few-shot settings (1.28% absolute improvement in 5-shot and 1.18% absolute in 10-shot, on average) and perform on par with the state-of-the-art in full-shot settings (within 0.01% absolute, on average).
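The filtering step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes the standard definition of pointwise V-information, PVI(x -> y) = -log2 g'[null](y) + log2 g[x](y), where g is a classifier fine-tuned with the utterance as input and g' is one fine-tuned with an empty input. The per-intent threshold (the mean PVI of held-out examples for that intent), the function names, and all probabilities below are hypothetical choices made for the example.

```python
import math
from collections import defaultdict


def pvi(p_with_input: float, p_null_input: float) -> float:
    """Pointwise V-information in bits.

    p_with_input: probability of the gold intent from a classifier that sees the utterance.
    p_null_input: probability of the gold intent from a classifier trained on empty input.
    """
    return math.log2(p_with_input) - math.log2(p_null_input)


def intent_thresholds(validation):
    """Mean PVI per intent over held-out rows of (intent, p_with_input, p_null_input).

    Using the per-intent mean as the cutoff is an assumption made for this sketch.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for intent, p_x, p_null in validation:
        sums[intent] += pvi(p_x, p_null)
        counts[intent] += 1
    return {intent: sums[intent] / counts[intent] for intent in sums}


def filter_synthetic(candidates, thresholds):
    """Keep synthetic (utterance, intent) pairs whose PVI reaches the intent's threshold."""
    kept = []
    for utterance, intent, p_x, p_null in candidates:
        if pvi(p_x, p_null) >= thresholds[intent]:
            kept.append((utterance, intent))
    return kept


# Illustrative numbers only; real probabilities would come from the two fine-tuned classifiers.
validation = [
    ("transfer", 0.5, 0.25),
    ("transfer", 0.25, 0.25),
    ("balance", 0.5, 0.125),
]
candidates = [
    ("send money to my sister", "transfer", 0.5, 0.125),
    ("what is the weather", "transfer", 0.125, 0.25),
    ("show my balance", "balance", 0.5, 0.25),
]

thresholds = intent_thresholds(validation)
kept = filter_synthetic(candidates, thresholds)
```

A higher PVI means the utterance makes its labeled intent easier for the classifier to predict, so low-PVI generations (off-topic or mislabeled utterances) are the ones dropped. Thresholding per intent rather than globally keeps intents that are harder to predict from being filtered out entirely.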

Results

Task                | Dataset           | Metric       | Value | Model
Intent Detection    | CLINC150 10-shot  | Accuracy (%) | 94.84 | RoBERTa-Large + ICDA
Intent Detection    | HWU64 10-shot     | Accuracy (%) | 87.41 | RoBERTa-Large + ICDA
Intent Detection    | HWU64             | Accuracy (%) | 92.57 | RoBERTa-Large + ICDA
Intent Detection    | BANKING77         | Accuracy (%) | 94.42 | RoBERTa-Large + ICDA
Intent Detection    | CLINC150          | Accuracy (%) | 97.12 | RoBERTa-Large + ICDA
Intent Detection    | HWU64 5-shot      | Accuracy (%) | 82.45 | RoBERTa-Large + ICDA
Intent Detection    | BANKING77 5-shot  | Accuracy (%) | 84.01 | RoBERTa-Large + ICDA
Intent Detection    | CLINC150 5-shot   | Accuracy (%) | 92.62 | RoBERTa-Large + ICDA
Intent Detection    | BANKING77 10-shot | Accuracy (%) | 89.79 | RoBERTa-Large + ICDA
Text Classification | BANKING77         | Accuracy     | 94.42 | RoBERTa-Large + ICDA
Classification      | BANKING77         | Accuracy     | 94.42 | RoBERTa-Large + ICDA
