Qian Chen, Zhen-Hua Ling, Xiaodan Zhu
Pooling is an essential component of a wide variety of sentence representation and embedding models. This paper explores generalized pooling methods to enhance sentence embedding. We propose a vector-based multi-head attention that includes the widely used max pooling, mean pooling, and scalar self-attention as special cases. The model benefits from properly designed penalization terms to reduce redundancy in multi-head attention. We evaluate the proposed model on three different tasks: natural language inference (NLI), author profiling, and sentiment classification. The experiments show that the proposed model achieves significant improvement over strong sentence-encoding-based methods, resulting in state-of-the-art performance on four datasets. The proposed approach can be easily extended to more problems than those discussed in this paper.
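The idea of vector-based multi-head attention pooling can be sketched as follows. This is a minimal NumPy illustration, not the paper's exact formulation: the two-layer ReLU scorer, parameter shapes, and function names are illustrative assumptions. Each head produces a separate softmax weight for every hidden dimension (hence "vector-based"), and the per-head pooled vectors are concatenated.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def vector_multihead_pool(H, heads):
    """Generalized pooling over encoder states with per-dimension attention.

    H: (T, d) hidden states from an encoder (e.g. a BiLSTM).
    heads: list of (W1, b1, W2, b2) parameter tuples, one per head.
    Returns the concatenation of all pooled head vectors, shape (num_heads * d,).
    """
    pooled = []
    for W1, b1, W2, b2 in heads:
        # Score every timestep and every dimension: (T, d).
        S = np.maximum(H @ W1 + b1, 0.0) @ W2 + b2   # two-layer ReLU MLP scorer
        A = softmax(S, axis=0)                        # normalize over time, per dimension
        pooled.append((A * H).sum(axis=0))            # weighted sum along time
    return np.concatenate(pooled)
```

With all-zero scorer parameters the attention is uniform over time, so the pooled vector reduces to mean pooling; a near-one-hot attention per dimension approximates max pooling, which is how the mechanism subsumes those operators as special cases.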
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Natural Language Inference | SNLI | % Test Accuracy | 86.6 | 600D BiLSTM with generalized pooling |
| Natural Language Inference | SNLI | % Train Accuracy | 94.9 | 600D BiLSTM with generalized pooling |
| Sentiment Analysis | Yelp Fine-grained classification | % Test Error | 33.45 | BiLSTM with generalized pooling |