TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/DiSAN: Directional Self-Attention Network for RNN/CNN-Free...

DiSAN: Directional Self-Attention Network for RNN/CNN-Free Language Understanding

Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Shirui Pan, Chengqi Zhang

2017-09-14Natural Language InferenceSentence EmbeddingSentence-Embedding
PaperPDFCode(official)CodeCode

Abstract

Recurrent neural nets (RNN) and convolutional neural nets (CNN) are widely used on NLP tasks to capture the long-term and local dependencies, respectively. Attention mechanisms have recently attracted enormous interest due to their highly parallelizable computation, significantly less training time, and flexibility in modeling dependencies. We propose a novel attention mechanism in which the attention between elements from input sequence(s) is directional and multi-dimensional (i.e., feature-wise). A light-weight neural net, "Directional Self-Attention Network (DiSAN)", is then proposed to learn sentence embedding, based solely on the proposed attention without any RNN/CNN structure. DiSAN is only composed of a directional self-attention with temporal order encoded, followed by a multi-dimensional attention that compresses the sequence into a vector representation. Despite its simple form, DiSAN outperforms complicated RNN models on both prediction quality and time efficiency. It achieves the best test accuracy among all sentence encoding methods and improves the most recent best result by 1.02% on the Stanford Natural Language Inference (SNLI) dataset, and shows state-of-the-art test accuracy on the Stanford Sentiment Treebank (SST), Multi-Genre natural language inference (MultiNLI), Sentences Involving Compositional Knowledge (SICK), Customer Review, MPQA, TREC question-type classification and Subjectivity (SUBJ) datasets.

Results

TaskDatasetMetricValueModel
Natural Language InferenceSNLI% Test Accuracy85.6300D Directional self-attention network encoders
Natural Language InferenceSNLI% Train Accuracy91.1300D Directional self-attention network encoders

Related Papers

LRCTI: A Large Language Model-Based Framework for Multi-Step Evidence Retrieval and Reasoning in Cyber Threat Intelligence Credibility Verification2025-07-15DS@GT at CheckThat! 2025: Evaluating Context and Tokenization Strategies for Numerical Fact Verification2025-07-08ARAG: Agentic Retrieval Augmented Generation for Personalized Recommendation2025-06-27Intrinsic vs. Extrinsic Evaluation of Czech Sentence Embeddings: Semantic Relevance Doesn't Help with MT Evaluation2025-06-25Thunder-NUBench: A Benchmark for LLMs' Sentence-Level Negation Understanding2025-06-17When Does Meaning Backfire? Investigating the Role of AMRs in NLI2025-06-17Explainable Compliance Detection with Multi-Hop Natural Language Inference on Assurance Case Structure2025-06-10Factors affecting the in-context learning abilities of LLMs for dialogue state tracking2025-06-10