Kai Sheng Tai, Richard Socher, Christopher D. Manning
Because of their superior ability to preserve sequence information over time, Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with a more complex computational unit, have obtained strong results on a variety of sequence modeling tasks. The only underlying LSTM structure that has been explored so far is a linear chain. However, natural language exhibits syntactic properties that naturally combine words into phrases. We introduce the Tree-LSTM, a generalization of LSTMs to tree-structured network topologies. Tree-LSTMs outperform all existing systems and strong LSTM baselines on two tasks: predicting the semantic relatedness of two sentences (SemEval 2014, Task 1) and sentiment classification (Stanford Sentiment Treebank).
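
To make the idea of an LSTM over a tree concrete, the following is a minimal NumPy sketch of one node update, assuming the Child-Sum formulation: the hidden states of a node's children are summed before the input, output, and update gates are computed, and each child gets its own forget gate. The function names (`init_params`, `child_sum_cell`, `encode`), the `(vector, children)` tuple representation of a tree, and the random weights are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(input_dim, hidden_dim, seed=0):
    """Random, untrained weights for the four transforms (i, f, o, u). Hypothetical helper."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("i", "f", "o", "u"):
        params["W_" + gate] = rng.normal(0.0, 0.1, (hidden_dim, input_dim))
        params["U_" + gate] = rng.normal(0.0, 0.1, (hidden_dim, hidden_dim))
        params["b_" + gate] = np.zeros(hidden_dim)
    return params


def child_sum_cell(x, child_h, child_c, p):
    """One node update. x: (input_dim,); child_h, child_c: lists of (hidden_dim,) arrays."""
    hidden_dim = p["b_i"].shape[0]
    # Sum of the children's hidden states; a leaf has no children, so the sum is zero.
    h_sum = np.sum(child_h, axis=0) if child_h else np.zeros(hidden_dim)

    i = sigmoid(p["W_i"] @ x + p["U_i"] @ h_sum + p["b_i"])  # input gate
    o = sigmoid(p["W_o"] @ x + p["U_o"] @ h_sum + p["b_o"])  # output gate
    u = np.tanh(p["W_u"] @ x + p["U_u"] @ h_sum + p["b_u"])  # candidate update

    c = i * u
    # One forget gate per child, computed from that child's own hidden state.
    for h_k, c_k in zip(child_h, child_c):
        f_k = sigmoid(p["W_f"] @ x + p["U_f"] @ h_k + p["b_f"])
        c = c + f_k * c_k

    h = o * np.tanh(c)
    return h, c


def encode(node, p):
    """Recursively encode a tree bottom-up. node = (input_vector, list_of_child_nodes)."""
    x, children = node
    states = [encode(child, p) for child in children]
    child_h = [s[0] for s in states]
    child_c = [s[1] for s in states]
    return child_sum_cell(x, child_h, child_c, p)


# Toy tree: a root with two leaf children (made-up input vectors).
params = init_params(input_dim=3, hidden_dim=4)
leaf_a = (np.ones(3), [])
leaf_b = (np.zeros(3), [])
root = (np.array([0.5, -0.5, 1.0]), [leaf_a, leaf_b])
h_root, c_root = encode(root, params)
print(h_root.shape)  # (4,)
```

When every node has exactly one child, this update has the same form as a standard LSTM step, which is the sense in which the tree-structured unit generalizes the linear chain.
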
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Language Modelling | SICK | MSE | 0.2532 | Dependency Tree-LSTM (Tai et al., 2015) |
| Language Modelling | SICK | Pearson Correlation | 0.8676 | Dependency Tree-LSTM (Tai et al., 2015) |
| Language Modelling | SICK | Spearman Correlation | 0.8083 | Dependency Tree-LSTM (Tai et al., 2015) |
| Language Modelling | SICK | MSE | 0.2736 | Bidirectional LSTM (Tai et al., 2015) |
| Language Modelling | SICK | Pearson Correlation | 0.8567 | Bidirectional LSTM (Tai et al., 2015) |
| Language Modelling | SICK | Spearman Correlation | 0.7966 | Bidirectional LSTM (Tai et al., 2015) |
| Language Modelling | SICK | MSE | 0.2831 | LSTM (Tai et al., 2015) |
| Language Modelling | SICK | Pearson Correlation | 0.8528 | LSTM (Tai et al., 2015) |
| Language Modelling | SICK | Spearman Correlation | 0.7911 | LSTM (Tai et al., 2015) |
| Sentiment Analysis | SST-5 Fine-grained classification | Accuracy | 51 | Constituency Tree-LSTM |
| Sentiment Analysis | SST-2 Binary classification | Accuracy | 88 | Constituency Tree-LSTM with tuned GloVe vectors (Tai et al., 2015) |
| Sentiment Analysis | SST-2 Binary classification | Accuracy | 86.3 | 2-layer LSTM (Tai et al., 2015) |
| Sentence Pair Modeling | SICK | MSE | 0.2532 | Dependency Tree-LSTM (Tai et al., 2015) |
| Sentence Pair Modeling | SICK | Pearson Correlation | 0.8676 | Dependency Tree-LSTM (Tai et al., 2015) |
| Sentence Pair Modeling | SICK | Spearman Correlation | 0.8083 | Dependency Tree-LSTM (Tai et al., 2015) |
| Sentence Pair Modeling | SICK | MSE | 0.2736 | Bidirectional LSTM (Tai et al., 2015) |
| Sentence Pair Modeling | SICK | Pearson Correlation | 0.8567 | Bidirectional LSTM (Tai et al., 2015) |
| Sentence Pair Modeling | SICK | Spearman Correlation | 0.7966 | Bidirectional LSTM (Tai et al., 2015) |
| Sentence Pair Modeling | SICK | MSE | 0.2831 | LSTM (Tai et al., 2015) |
| Sentence Pair Modeling | SICK | Pearson Correlation | 0.8528 | LSTM (Tai et al., 2015) |
| Sentence Pair Modeling | SICK | Spearman Correlation | 0.7911 | LSTM (Tai et al., 2015) |
| Semantic Similarity | SICK | MSE | 0.2532 | Dependency Tree-LSTM (Tai et al., 2015) |
| Semantic Similarity | SICK | Pearson Correlation | 0.8676 | Dependency Tree-LSTM (Tai et al., 2015) |
| Semantic Similarity | SICK | Spearman Correlation | 0.8083 | Dependency Tree-LSTM (Tai et al., 2015) |
| Semantic Similarity | SICK | MSE | 0.2736 | Bidirectional LSTM (Tai et al., 2015) |
| Semantic Similarity | SICK | Pearson Correlation | 0.8567 | Bidirectional LSTM (Tai et al., 2015) |
| Semantic Similarity | SICK | Spearman Correlation | 0.7966 | Bidirectional LSTM (Tai et al., 2015) |
| Semantic Similarity | SICK | MSE | 0.2831 | LSTM (Tai et al., 2015) |
| Semantic Similarity | SICK | Pearson Correlation | 0.8528 | LSTM (Tai et al., 2015) |
| Semantic Similarity | SICK | Spearman Correlation | 0.7911 | LSTM (Tai et al., 2015) |
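
The three SICK metrics in the table compare predicted relatedness scores against gold scores. The sketch below shows one way to compute them with NumPy alone, assuming Spearman correlation is taken as the Pearson correlation of average ranks. The helper names and the toy score arrays are made up for illustration and are not data from the paper.

```python
import numpy as np


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_c = a - a.mean()
    b_c = b - b.mean()
    return float((a_c @ b_c) / np.sqrt((a_c @ a_c) * (b_c @ b_c)))


def spearman(a, b):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


def mse(pred, gold):
    """Mean squared error between predictions and gold scores."""
    pred = np.asarray(pred, dtype=float)
    gold = np.asarray(gold, dtype=float)
    return float(np.mean((pred - gold) ** 2))


# Toy scores on a 1-to-5 relatedness scale (illustrative values only).
gold = [1.0, 2.5, 3.0, 4.5, 5.0]
pred = [1.2, 2.0, 3.5, 4.0, 4.8]

print("Pearson: ", pearson(pred, gold))
print("Spearman:", spearman(pred, gold))
print("MSE:     ", mse(pred, gold))
```

Higher is better for the two correlation columns and lower is better for MSE, which is why the Dependency Tree-LSTM rows lead on all three.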