TD-Suite: All Batteries Included Framework for Technical Debt Classification

Karthik Shivashankar, Antonio Martini

2025-04-15 | Binary Classification, Natural Language Understanding, Management, All

Abstract

Recognizing that technical debt is a persistent and significant challenge requiring sophisticated management tools, TD-Suite offers a comprehensive software framework specifically engineered to automate the complex task of its classification within software projects. It leverages the advanced natural language understanding of state-of-the-art transformer models to analyze textual artifacts, such as developer discussions in issue reports, where subtle indicators of debt often lie hidden. TD-Suite provides a seamless end-to-end pipeline, managing everything from initial data ingestion and rigorous preprocessing to model training, thorough evaluation, and final inference. This allows it to support both straightforward binary classification (debt or no debt) and the more valuable multi-class classification, which identifies specific categories such as code, design, or documentation debt and thus enables more targeted management strategies. To ensure the generated models are robust and perform reliably on real-world, often imbalanced, datasets, TD-Suite incorporates critical training methodologies: k-fold cross-validation assesses generalization capability, early stopping mechanisms prevent overfitting to the training data, and class weighting strategies address skewed data distributions. Beyond core functionality, and acknowledging the growing importance of sustainability, the framework integrates tracking and reporting of the carbon emissions associated with the computationally intensive model training process. It also features a user-friendly Gradio web interface in a Docker container setup, simplifying model interaction, evaluation, and inference.
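
The three training safeguards named in the abstract (class weighting, k-fold cross-validation, early stopping) can be illustrated with a minimal, standard-library-only Python sketch. This is not TD-Suite's code: the function and class names (`balanced_class_weights`, `k_fold_indices`, `EarlyStopping`), the inverse-frequency weighting formula, and the `patience` parameter are all assumptions chosen for illustration, and the framework may implement each step differently.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Hypothetical helper: weight each class by n / (k * count),
    so rarer classes receive proportionally larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


def k_fold_indices(n_samples, k):
    """Hypothetical helper: yield (train_idx, val_idx) for k contiguous folds.
    Each sample appears in exactly one validation fold."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        val = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, val
        start = stop


class EarlyStopping:
    """Hypothetical helper: report a stop once validation loss has not
    improved for `patience` consecutive epochs."""

    def __init__(self, patience=2):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


if __name__ == "__main__":
    # Made-up, imbalanced toy labels: 2 "debt" issues vs 8 "no_debt" issues.
    labels = ["debt"] * 2 + ["no_debt"] * 8
    print(balanced_class_weights(labels))
    for train_idx, val_idx in k_fold_indices(len(labels), k=3):
        print(len(train_idx), val_idx)
```

In a real pipeline the weights would typically be passed to a weighted loss function, and stratified folds would usually be preferred over contiguous ones for imbalanced labels so that every fold contains examples of the minority class.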
