
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Multi-Class Abnormality Classification in Video Capsule Endoscopy Using Deep Learning

Arnav Samal, Ranya Batsyas

2024-10-24 · Multi-class Classification

Abstract

This report outlines Team Seq2Cure's deep learning approach for the Capsule Vision 2024 Challenge, leveraging an ensemble of convolutional neural networks (CNNs) and transformer-based architectures for multi-class abnormality classification in video capsule endoscopy frames. The dataset comprised over 50,000 frames from three public sources and one private dataset, labeled across 10 abnormality classes. To overcome the limitations of traditional CNNs in capturing global context, we integrated CNN and transformer models within a multi-model ensemble. Our approach achieved a balanced accuracy of 86.34% and a mean AUC-ROC score of 0.9908 on the validation set, earning our submission 5th place in the challenge. Code is available at http://github.com/arnavs04/capsule-vision-2024.
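The abstract does not say how the ensemble combines its member models, so the following is only a minimal sketch under one common assumption: soft voting, where the per-class probabilities from each model are averaged and the highest-scoring class is predicted. It also shows how balanced accuracy (the mean of per-class recall) can be computed. The function names and the toy probabilities are hypothetical and are not taken from the authors' repository.

```python
from typing import List, Sequence


def average_probabilities(
    model_probs: Sequence[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """Soft voting: average class probabilities across models.

    model_probs[m][i][c] is the probability that model m assigns
    to class c for frame i. (Assumed combination rule, not the
    authors' confirmed method.)
    """
    n_models = len(model_probs)
    n_frames = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    return [
        [
            sum(model_probs[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        for i in range(n_frames)
    ]


def predict(probs: Sequence[Sequence[float]]) -> List[int]:
    """Pick the index of the highest probability for each frame."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Toy data: 2 hypothetical models, 4 frames, 3 classes (the challenge has 10).
cnn_probs = [[0.7, 0.2, 0.1], [0.4, 0.5, 0.1], [0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]
vit_probs = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.3, 0.4, 0.3], [0.2, 0.7, 0.1]]
y_true = [0, 1, 2, 0]

ensemble_probs = average_probabilities([cnn_probs, vit_probs])
y_pred = predict(ensemble_probs)
score = balanced_accuracy(y_true, y_pred)
```

Because balanced accuracy weights every class equally regardless of how many frames it has, it is a natural headline metric for an imbalanced 10-class dataset such as this one.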

Results

Task | Dataset | Metric | Value | Model
Classification | Training and validation dataset of capsule vision 2024 challenge. | Mean AUC | 0.9908 | Multi-Model Ensemble
Multi-class Classification | Training and validation dataset of capsule vision 2024 challenge. | Mean AUC | 0.9908 | Multi-Model Ensemble
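For readers unfamiliar with the metric in the table, here is a rough sketch of how a mean AUC could be computed for a multi-class classifier. It assumes a one-vs-rest, unweighted (macro) average over classes; the page does not state which averaging scheme the challenge used, so treat that as an assumption. The helper names and the toy scores are hypothetical.

```python
from typing import Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the fraction of (positive, negative) pairs ranked
    correctly, counting ties as half (Mann-Whitney U formulation)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc_ovr(y_true: Sequence[int], probs: Sequence[Sequence[float]]) -> float:
    """Macro-averaged one-vs-rest AUC: score each class against all
    others, then take the unweighted mean (assumed averaging scheme)."""
    n_classes = len(probs[0])
    aucs = []
    for c in range(n_classes):
        labels = [1 if t == c else 0 for t in y_true]
        scores = [row[c] for row in probs]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)


# Toy data: 4 frames, 3 classes (the challenge has 10).
toy_labels = [0, 1, 2, 0]
toy_probs = [
    [0.65, 0.25, 0.10],
    [0.30, 0.60, 0.10],
    [0.25, 0.35, 0.40],
    [0.28, 0.62, 0.10],
]
toy_mean_auc = mean_auc_ovr(toy_labels, toy_probs)
```

The pairwise loop is quadratic in the number of frames, which is fine for illustration; on a validation set of this size a rank-based implementation from an established library would be the more practical choice.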

Related Papers

Detecting immune cells with label-free two-photon autofluorescence and deep learning (2025-06-17)
SHORE: A Long-term User Lifetime Value Prediction Model in Digital Games (2025-06-12)
FedFACT: A Provable Framework for Controllable Group-Fairness Calibration in Federated Learning (2025-06-04)
GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models (2025-05-30)
Multi-output Classification using a Cross-talk Architecture for Compound Fault Diagnosis of Motors in Partially Labeled Condition (2025-05-29)
FinTagging: An LLM-ready Benchmark for Extracting and Structuring Financial Information (2025-05-27)
Leveraging Cascaded Binary Classification and Multimodal Fusion for Dementia Detection through Spontaneous Speech (2025-05-26)
Detection of Suicidal Risk on Social Media: A Hybrid Model (2025-05-26)