Pruning and Sparsemax Methods for Hierarchical Attention Networks

João G. Ribeiro, Frederico S. Felisberto, Isabel C. Neto

2020-04-08Sentiment Analysis Document Classification General Classification

Abstract

This paper introduces and evaluates two novel Hierarchical Attention Network models [Yang et al., 2016] - i) Hierarchical Pruned Attention Networks, which remove the irrelevant words and sentences from the classification process in order to reduce potential noise in the document classification accuracy and ii) Hierarchical Sparsemax Attention Networks, which replace the Softmax function used in the attention mechanism with the Sparsemax [Martins and Astudillo, 2016], capable of better handling importance distributions where a lot of words or sentences have very low probabilities. Our empirical evaluation on the IMDB Review for sentiment analysis datasets shows both approaches to be able to match the results obtained by the current state-of-the-art (without, however, any significant benefits). All our source code is made available athttps://github.com/jmribeiro/dsl-project.

Related Papers

AdaptiSent: Context-Aware Adaptive Attention for Multimodal Aspect-Based Sentiment Analysis2025-07-17 AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles2025-07-15 DCR: Quantifying Data Contamination in LLMs Evaluation2025-07-15 SentiDrop: A Multi Modal Machine Learning model for Predicting Dropout in Distance Learning2025-07-14 GNN-CNN: An Efficient Hybrid Model of Convolutional and Graph Neural Networks for Text Representation2025-07-10 FINN-GL: Generalized Mixed-Precision Extensions for FPGA-Accelerated LSTMs2025-06-25 Unpacking Generative AI in Education: Computational Modeling of Teacher and Student Perspectives in Social Media Discourse2025-06-19 Characterizing Linguistic Shifts in Croatian News via Diachronic Word Embeddings2025-06-16