Universal Sentence Encoder

Daniel Cer, Yinfei Yang, Sheng-yi Kong, Nan Hua, Nicole Limtiaco, Rhomni St. John, Noah Constant, Mario Guajardo-Cespedes, Steve Yuan, Chris Tar, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil

2018-03-29

Text Classification, Subjectivity Analysis, Sentiment Analysis, Transfer Learning, Sentence Embeddings, Semantic Textual Similarity, Word Embeddings, Conversational Response Selection

Abstract

We present models for encoding sentences into embedding vectors that specifically target transfer learning to other NLP tasks. The models are efficient and result in accurate performance on diverse transfer tasks. Two variants of the encoding models allow for trade-offs between accuracy and compute resources. For both variants, we investigate and report the relationship between model complexity, resource consumption, the availability of transfer task training data, and task performance. Comparisons are made with baselines that use word-level transfer learning via pretrained word embeddings as well as baselines that do not use any transfer learning. We find that transfer learning using sentence embeddings tends to outperform word-level transfer. With transfer learning via sentence embeddings, we observe surprisingly good performance with minimal amounts of supervised training data for a transfer task. We obtain encouraging results on Word Embedding Association Tests (WEAT) targeted at detecting model bias. Our pre-trained sentence encoding models are made freely available for download and on TF Hub.

Results

Task	Dataset	Metric	Value	Model
Semantic Textual Similarity	STS Benchmark	Pearson Correlation	0.782	USE_T
Sentiment Analysis	CR	Accuracy	87.45	USE_T+CNN (w2v w.e.)
Sentiment Analysis	MR	Accuracy	81.59	USE_T+CNN
Sentiment Analysis	SST-2 Binary classification	Accuracy	87.21	USE_T+CNN (lrn w.e.)
Sentiment Analysis	MPQA	Accuracy	88.14	USE_T+DAN (w2v w.e.)
Subjectivity Analysis	SUBJ	Accuracy	93.9	USE
Text Classification	TREC-6	Error	1.93	USE_T+CNN
Classification	TREC-6	Error	1.93	USE_T+CNN
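
The Pearson Correlation reported for the STS Benchmark measures how well predicted sentence-pair similarities track human similarity ratings. The sketch below illustrates one way such a score could be computed. It assumes cosine similarity as the pairwise score, which this page does not specify, and it uses tiny hypothetical vectors and gold ratings in place of real encoder output and benchmark data. The function names are illustrative only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical 3-dimensional "sentence embeddings" for three sentence pairs.
# Real encoder outputs would be much higher-dimensional.
pairs = [
    ([1.0, 0.0, 0.0], [1.0, 0.1, 0.0]),  # near-identical meaning
    ([0.0, 1.0, 0.0], [0.5, 0.5, 0.0]),  # partially related
    ([0.0, 0.0, 1.0], [1.0, 0.0, 0.0]),  # unrelated
]

# Hypothetical human similarity ratings on a 0-5 scale (not benchmark data).
gold = [5.0, 3.0, 0.0]

predicted = [cosine_similarity(u, v) for u, v in pairs]
score = pearson(predicted, gold)
print(f"Pearson correlation on toy data: {score:.3f}")
```

Because Pearson correlation is invariant to positive linear rescaling, the predicted similarities do not need to share the 0-5 range of the human ratings.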
