Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

An Amharic News Text classification Dataset

Israel Abebe Azime, Nebil Mohammed

2021-03-10 | Text Classification, General Classification, Classification

Abstract

Text classification is one of the primary problems in NLP, and its usefulness in language analysis is well established. A lack of labeled training data makes these tasks harder in low-resource languages such as Amharic. Collecting, labeling, and annotating such data, and making it available, will encourage junior researchers, schools, and machine learning practitioners to apply existing classification models to their own language. In this short paper, we introduce an Amharic text classification dataset consisting of more than 50k news articles categorized into 6 classes. The dataset is released with simple baseline results to encourage further study and experiments aimed at better performance.

Results

Task                 Dataset                                       Metric    Value  Model
Text Classification  An Amharic News Text classification Dataset  Accuracy  62.3   Naive Bayes using Tf-idf features
Text Classification  An Amharic News Text classification Dataset  Accuracy  62.2   Naive Bayes using count vectorizer features
Classification       An Amharic News Text classification Dataset  Accuracy  62.3   Naive Bayes using Tf-idf features
Classification       An Amharic News Text classification Dataset  Accuracy  62.2   Naive Bayes using count vectorizer features
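
The count-feature baseline listed above can be illustrated with a minimal sketch of multinomial Naive Bayes over raw token counts. This is not the authors' code: it is written from scratch with the Python standard library, and the function names (`tokenize`, `train_naive_bayes`, `predict`, `accuracy`), the add-one smoothing, and the tiny English toy corpus are assumptions made only for illustration. Its accuracy on toy data says nothing about the figures reported for the Amharic dataset.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Plain whitespace tokenization (an assumption of this sketch; real
    # Amharic text would likely need normalization first).
    return text.split()


def train_naive_bayes(docs, labels):
    """Collect per-class document counts and per-class token counts."""
    class_doc_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return class_doc_counts, token_counts, vocab


def predict(model, text, alpha=1.0):
    """Return the class with the highest log posterior for `text`."""
    class_doc_counts, token_counts, vocab = model
    total_docs = sum(class_doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_doc_counts.items():
        # Log prior from class frequencies in the training set.
        score = math.log(n_docs / total_docs)
        # Smoothed denominator: class token total plus alpha per vocab entry.
        denom = sum(token_counts[label].values()) + alpha * len(vocab)
        for token in tokenize(text):
            if token not in vocab:
                continue  # tokens never seen in training are skipped
            score += math.log((token_counts[label][token] + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, docs, labels):
    correct = sum(predict(model, d) == y for d, y in zip(docs, labels))
    return correct / len(labels)


# Hypothetical toy corpus, standing in for news articles and their classes.
train_docs = [
    "team wins match goal",
    "match score team player",
    "market bank price trade",
    "bank trade economy price",
]
train_labels = ["sport", "sport", "business", "business"]
test_docs = ["player goal team", "economy market price"]
test_labels = ["sport", "business"]

model = train_naive_bayes(train_docs, train_labels)
print(accuracy(model, test_docs, test_labels))
```

A Tf-idf variant would reweight each token count by how rare the token is across documents before the same classifier is applied; that step is omitted here to keep the sketch short.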
