Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Poly-encoders: Transformer Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring

Samuel Humeau, Kurt Shuster, Marie-Anne Lachaux, Jason Weston

Published: 2019-04-22 · Task: Conversational Response Selection

Abstract

The use of deep pre-trained bidirectional transformers has led to remarkable progress in a number of applications (Devlin et al., 2018). For tasks that make pairwise comparisons between sequences, matching a given input with a corresponding label, two approaches are common: Cross-encoders performing full self-attention over the pair and Bi-encoders encoding the pair separately. The former often performs better, but is too slow for practical use. In this work, we develop a new transformer architecture, the Poly-encoder, that learns global rather than token level self-attention features. We perform a detailed comparison of all three approaches, including what pre-training and fine-tuning strategies work best. We show our models achieve state-of-the-art results on three existing tasks; that Poly-encoders are faster than Cross-encoders and more accurate than Bi-encoders; and that the best results are obtained by pre-training on large datasets similar to the downstream tasks.
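The two-stage attention described in the abstract — m learned codes that attend over context tokens to form global features, followed by candidate-side attention over those features — can be sketched as follows. This is a simplified NumPy illustration of the scoring step only (no transformer, no training); the function name, shapes, and random toy embeddings are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def poly_encoder_score(ctx_tokens, cand_vec, codes):
    """Score one (context, candidate) pair, Poly-encoder style.

    ctx_tokens: (T, d) token-level context embeddings (e.g. transformer outputs)
    cand_vec:   (d,)   aggregated candidate embedding
    codes:      (m, d) m learned query codes producing "global" features
    """
    # Stage 1: each of the m codes attends over the context tokens,
    # yielding m global context vectors instead of T token-level ones.
    attn = softmax(codes @ ctx_tokens.T, axis=-1)   # (m, T)
    global_feats = attn @ ctx_tokens                # (m, d)
    # Stage 2: the candidate attends over the m global vectors
    # to produce a single candidate-aware context vector.
    w = softmax(global_feats @ cand_vec)            # (m,)
    ctx_vec = w @ global_feats                      # (d,)
    # The final score is a dot product, so candidate embeddings can be
    # precomputed and cached, as with a Bi-encoder.
    return float(ctx_vec @ cand_vec)

# Toy usage with random embeddings (T tokens, dimension d, m codes).
rng = np.random.default_rng(0)
T, d, m = 10, 16, 4
score = poly_encoder_score(rng.normal(size=(T, d)),
                           rng.normal(size=(d,)),
                           rng.normal(size=(m, d)))
```

Because only the cheap second attention stage depends on the candidate, the candidate encoder runs once per candidate offline, which is what makes the Poly-encoder faster than a Cross-encoder while remaining more expressive than a plain Bi-encoder dot product.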

Results

Task                               Dataset                        Metric   Value  Model
Conversational Response Selection  Douban                         MAP      0.608  Poly-encoder
Conversational Response Selection  Douban                         MRR      0.65   Poly-encoder
Conversational Response Selection  Douban                         P@1      0.475  Poly-encoder
Conversational Response Selection  Douban                         R10@1    0.299  Poly-encoder
Conversational Response Selection  Douban                         R10@2    0.494  Poly-encoder
Conversational Response Selection  Douban                         R10@5    0.822  Poly-encoder
Conversational Response Selection  RRS Ranking Test               NDCG@3   0.679  Poly-encoder
Conversational Response Selection  RRS Ranking Test               NDCG@5   0.765  Poly-encoder
Conversational Response Selection  Ubuntu Dialogue (v1, Ranking)  R10@1    0.882  Poly-encoder
Conversational Response Selection  Ubuntu Dialogue (v1, Ranking)  R10@2    0.949  Poly-encoder
Conversational Response Selection  Ubuntu Dialogue (v1, Ranking)  R10@5    0.99   Poly-encoder

Related Papers

- Efficient Dynamic Hard Negative Sampling for Dialogue Selection (2024-08-16)
- P5: Plug-and-Play Persona Prompting for Personalized Response Selection (2023-10-10)
- Knowledge-aware response selection with semantics underlying multi-turn open-domain conversations (2023-07-27)
- Dial-MAE: ConTextual Masked Auto-Encoder for Retrieval-based Dialogue Systems (2023-06-07)
- Learning Dialogue Representations from Consecutive Utterances (2022-05-26)
- One Agent To Rule Them All: Towards Multi-agent Conversational AI (2022-03-15)
- Two-Level Supervised Contrastive Learning for Response Selection in Multi-Turn Dialogue (2022-03-01)
- Small Changes Make Big Differences: Improving Multi-turn Response Selection in Dialogue Systems via Fine-Grained Contrastive Learning (2021-11-19)