TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/Revisiting End-to-End Learning with Slide-level Supervisio...

Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology

Wenhao Tang, Rong Qin, Heng Fang, Fengtao Zhou, Hao Chen, Xiang Li, Ming-Ming Cheng

2025-06-03Multiple Instance Learning
PaperPDFCodeCode(official)

Abstract

Pre-trained encoders for offline feature extraction followed by multiple instance learning (MIL) aggregators have become the dominant paradigm in computational pathology (CPath), benefiting cancer diagnosis and prognosis. However, performance limitations arise from the absence of encoder fine-tuning for downstream tasks and disjoint optimization with MIL. While slide-level supervised end-to-end (E2E) learning is an intuitive solution to this issue, it faces challenges such as high computational demands and suboptimal results. These limitations motivate us to revisit E2E learning. We argue that prior work neglects inherent E2E optimization challenges, leading to performance disparities compared to traditional two-stage methods. In this paper, we pioneer the elucidation of optimization challenge caused by sparse-attention MIL and propose a novel MIL called ABMILX. It mitigates this problem through global correlation-based attention refinement and multi-head mechanisms. With the efficient multi-scale random patch sampling strategy, an E2E trained ResNet with ABMILX surpasses SOTA foundation models under the two-stage paradigm across multiple challenging benchmarks, while remaining computationally efficient (<10 RTX3090 hours). We show the potential of E2E learning in CPath and calls for greater research focus in this area. The code is https://github.com/DearCaat/E2E-WSI-ABMILX.

Related Papers

GNN-ViTCap: GNN-Enhanced Multiple Instance Learning with Vision Transformers for Whole Slide Image Classification and Captioning2025-07-09The Trilemma of Truth in Large Language Models2025-06-30OTSurv: A Novel Multiple Instance Learning Framework for Survival Prediction with Heterogeneity-aware Optimal Transport2025-06-25Benchmarking histopathology foundation models in a multi-center dataset for skin cancer subtyping2025-06-23MiCo: Multiple Instance Learning with Context-Aware Clustering for Whole Slide Image Analysis2025-06-22HyperPath: Knowledge-Guided Hyperbolic Semantic Hierarchy Modeling for WSI Analysis2025-06-19Dual‑detector Re‑optimization for Federated Weakly Supervised Video Anomaly Detection Via Adaptive Dynamic Recursive Mapping2025-06-13BioLangFusion: Multimodal Fusion of DNA, mRNA, and Protein Language Models2025-06-10