Improving Document Classification with Multi-Sense Embeddings

Vivek Gupta, Ankit Saw, Pegah Nokhiz, Harshit Gupta, Partha Talukdar

2019-11-18
Text Categorization, Word Embeddings, Clustering, Document Classification, General Classification, Classification

Abstract

Efficient representation of text documents is an important building block in many NLP tasks. Research on long text categorization has shown that simple weighted averaging of word vectors for sentence representation often outperforms more sophisticated neural models. The recently proposed Sparse Composite Document Vector (SCDV) (Mekala et al., 2017) extends this approach from sentences to documents using soft clustering over word vectors. However, SCDV disregards the multi-sense nature of words, and it also suffers from the curse of higher dimensionality. In this work, we address these shortcomings and propose SCDV-MS. SCDV-MS utilizes multi-sense word embeddings and learns a lower-dimensional manifold. Through extensive experiments on multiple real-world datasets, we show that SCDV-MS embeddings outperform previous state-of-the-art embeddings on multi-class and multi-label text categorization tasks. Furthermore, SCDV-MS embeddings are more efficient than SCDV in terms of time and space complexity on textual classification tasks.
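
The idea of building a document vector from soft-clustered word vectors can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy embeddings, and the fixed centroids are all hypothetical, and a softmax over negative squared distances stands in for the posterior probabilities of a fitted mixture model. The multi-sense step and the lower-dimensional manifold learning of SCDV-MS are not covered; under this sketch, sense-tagged tokens would simply be separate keys in the embedding dictionary.

```python
import math
from collections import Counter


def soft_assign(vec, centroids, temperature=1.0):
    """Soft cluster membership of one word vector.

    Assumption: softmax over negative squared distances to fixed centroids,
    used here as a stand-in for mixture-model posteriors.
    """
    scores = [
        -sum((v - c) ** 2 for v, c in zip(vec, centroid)) / temperature
        for centroid in centroids
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def idf_weights(docs):
    """Smoothed inverse document frequency for every word in the corpus."""
    n = len(docs)
    df = Counter(word for doc in docs for word in set(doc))
    return {word: math.log((1 + n) / (1 + df[word])) + 1.0 for word in df}


def word_cluster_vector(vec, probs, idf):
    """Concatenate one copy of the word vector per cluster, each copy
    scaled by the cluster probability and the word's idf weight."""
    out = []
    for p in probs:
        out.extend(idf * p * v for v in vec)
    return out


def document_vector(doc, embeddings, centroids, idf, threshold=0.0):
    """Average the word-cluster vectors of a document, then zero out
    entries whose magnitude is below `threshold` to make it sparse."""
    dim = len(centroids) * len(centroids[0])
    total = [0.0] * dim
    count = 0
    for word in doc:
        if word not in embeddings:
            continue  # skip out-of-vocabulary words
        probs = soft_assign(embeddings[word], centroids)
        wcv = word_cluster_vector(embeddings[word], probs, idf.get(word, 1.0))
        total = [t + x for t, x in zip(total, wcv)]
        count += 1
    if count == 0:
        return total
    return [x / count if abs(x / count) >= threshold else 0.0 for x in total]


# Toy data (hypothetical): 2-dimensional embeddings and 2 clusters.
embeddings = {"bank": [1.0, 0.0], "river": [0.9, 0.1], "loan": [0.0, 1.0]}
centroids = [[1.0, 0.0], [0.0, 1.0]]
docs = [["bank", "river"], ["bank", "loan"]]

idf = idf_weights(docs)
doc_vec = document_vector(docs[0], embeddings, centroids, idf)
print(len(doc_vec))  # number of clusters times embedding dimension
```

The output dimension grows as the number of clusters times the embedding dimension, which is one way to see why the dimensionality of such representations becomes a concern as the number of clusters increases.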

Results

Task                     Dataset        Metric     Value  Model
Text Classification      20NEWS         Accuracy   86.19  SCDV-MS
Text Classification      20NEWS         F-measure  86.16  SCDV-MS
Text Classification      20NEWS         Precision  86.2   SCDV-MS
Text Classification      20NEWS         Recall     86.18  SCDV-MS
Text Classification      Reuters-21578  F1         82.71  SCDV-MS
Document Classification  Reuters-21578  F1         82.71  SCDV-MS
Classification           20NEWS         Accuracy   86.19  SCDV-MS
Classification           20NEWS         F-measure  86.16  SCDV-MS
Classification           20NEWS         Precision  86.2   SCDV-MS
Classification           20NEWS         Recall     86.18  SCDV-MS
Classification           Reuters-21578  F1         82.71  SCDV-MS
