Acceptability Judgements via Examining the Topology of Attention Maps

Daniil Cherniavskii, Eduard Tulchinskii, Vladislav Mikhailov, Irina Proskurina, Laida Kushnareva, Ekaterina Artemova, Serguei Barannikov, Irina Piontkovskaya, Dmitri Piontkovski, Evgeny Burnaev

2022-05-19 | Topological Data Analysis | Linguistic Acceptability

Abstract

The role of the attention mechanism in encoding linguistic knowledge has received special interest in NLP. However, the ability of attention heads to judge the grammatical acceptability of a sentence has been underexplored. This paper approaches the paradigm of acceptability judgments with topological data analysis (TDA), showing that the geometric properties of the attention graph can be efficiently exploited for two standard practices in linguistics: binary judgments and linguistic minimal pairs. Topological features enhance the BERT-based acceptability classifier scores by 8%-24% on CoLA in three languages (English, Italian, and Swedish). By revealing the topological discrepancy between attention maps of minimal pairs, we achieve human-level performance on the BLiMP benchmark, outperforming nine statistical and Transformer LM baselines. At the same time, TDA provides the foundation for analyzing the linguistic functions of attention heads and interpreting the correspondence between the graph features and grammatical phenomena.
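As a rough illustration of what "geometric properties of the attention graph" can mean, the sketch below treats one attention map as a weighted graph over tokens, keeps only edges above a threshold, and counts connected components (Betti number b0) and independent cycles (b1). The toy matrix, the max-symmetrization step, the threshold values, and the function name `graph_betti_numbers` are all assumptions made for this example; the paper's actual feature set and pipeline may differ.

```python
import numpy as np


def graph_betti_numbers(attention, threshold):
    """Return (b0, b1) of the undirected graph that keeps edges with weight >= threshold.

    b0 is the number of connected components; b1 is the number of
    independent cycles, computed for a graph as edges - vertices + b0.
    """
    n = attention.shape[0]
    # Assumption: make the graph undirected by taking the larger of the two directions.
    sym = np.maximum(attention, attention.T)

    parent = list(range(n))  # union-find forest over tokens

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edges = 0
    components = n
    for i in range(n):
        for j in range(i + 1, n):  # self-loops (the diagonal) are ignored
            if sym[i, j] >= threshold:
                edges += 1
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
                    components -= 1

    return components, edges - n + components


# Hypothetical 4-token attention map (rows sum to 1); not taken from any model.
attention = np.array([
    [0.70, 0.20, 0.05, 0.05],
    [0.60, 0.30, 0.05, 0.05],
    [0.10, 0.50, 0.30, 0.10],
    [0.05, 0.05, 0.10, 0.80],
])

print(graph_betti_numbers(attention, 0.5))  # (2, 0): tokens 0-1-2 form a chain, token 3 is isolated
print(graph_betti_numbers(attention, 0.1))  # (1, 1): one component containing one cycle
```

Sweeping the threshold and recording how these counts change gives a small feature vector per attention head, which is the general flavor of feature a downstream acceptability classifier could consume.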

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Linguistic Acceptability | CoLA Dev | Accuracy | 88.6 | En-BERT + TDA |
| Linguistic Acceptability | CoLA Dev | MCC | 0.725 | En-BERT + TDA |
| Linguistic Acceptability | CoLA Dev | Accuracy | 73 | XLM-R (pre-trained) + TDA |
| Linguistic Acceptability | CoLA Dev | MCC | 0.42 | En-BERT (pre-trained) + TDA |
| Linguistic Acceptability | DaLAJ | Accuracy | 76.9 | Sw-BERT + H0M |
| Linguistic Acceptability | DaLAJ | MCC | 0.542 | Sw-BERT + H0M |
| Linguistic Acceptability | CoLA | MCC | 0.565 | En-BERT + TDA |
| Linguistic Acceptability | ItaCoLA | Accuracy | 92.8 | XLM-R + TDA |
| Linguistic Acceptability | ItaCoLA | MCC | 0.683 | XLM-R + TDA |
| Linguistic Acceptability | ItaCoLA | Accuracy | 89.2 | It-BERT (pre-trained) + TDA |
| Linguistic Acceptability | ItaCoLA | MCC | 0.478 | It-BERT (pre-trained) + TDA |
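
The results report both accuracy and MCC (Matthews correlation coefficient). For readers unfamiliar with MCC, the sketch below computes both metrics from a binary confusion matrix using the standard definition. The counts are invented for illustration and are unrelated to the reported numbers; returning 0.0 when the denominator is zero is a common convention, assumed here.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for binary classification.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    Assumption: return 0.0 when any marginal count is zero.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts on an imbalanced set: 7 acceptable, 3 unacceptable sentences.
tp, tn, fp, fn = 6, 2, 1, 1
print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")  # accuracy = 0.800
print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")       # MCC      = 0.524
```

Because MCC uses all four cells of the confusion matrix, it can be much lower than accuracy when one class dominates, which is why acceptability benchmarks often report it alongside accuracy.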

Related Papers

Lipschitz Bounds for Persistent Laplacian Eigenvalues under One-Simplex Insertions (2025-06-26)
The Shape of Consumer Behavior: A Symbolic and Topological Analysis of Time Series (2025-06-24)
TDACloud: Point Cloud Recognition Using Topological Data Analysis (2025-06-23)
Dissecting Bias in LLMs: A Mechanistic Interpretability Perspective (2025-06-05)
Torsion in Persistent Homology and Neural Networks (2025-06-03)
An Incremental Framework for Topological Dialogue Semantics: Efficient Reasoning in Discrete Spaces (2025-05-31)
Comparing the Effects of Persistence Barcodes Aggregation and Feature Concatenation on Medical Imaging (2025-05-29)
Topological Machine Learning for Protein-Nucleic Acid Binding Affinity Changes Upon Mutation (2025-05-28)