Curvature-based Feature Selection with Application in Classifying Electronic Health Records

Zheming Zuo, Jie Li, Han Xu, Noura Al Moubayed

2021-01-10
Dimensionality Reduction, Diabetic Retinopathy Detection, Feature Selection, Breast Cancer Detection

Abstract

Disruptive technologies provide unparalleled opportunities to contribute to the identification of many aspects of pervasive healthcare, from the adoption of the Internet of Things through to Machine Learning (ML) techniques. As a powerful tool, ML has been widely applied in patient-centric healthcare solutions. To further improve the quality of patient care, Electronic Health Records (EHRs) are commonly adopted in healthcare facilities for analysis. It is a crucial task to apply AI and ML to analyse those EHRs for prediction and diagnostics due to their highly unstructured, unbalanced, incomplete, and high-dimensional nature. Dimensionality reduction is a common data preprocessing technique for coping with high-dimensional EHR data; it aims to reduce the number of features in the EHR representation while improving the performance of the subsequent data analysis, e.g. classification. In this work, an efficient filter-based feature selection method, namely Curvature-based Feature Selection (CFS), is presented. The proposed CFS applies the concept of Menger Curvature to rank the weights of all features in the given data set. The performance of the proposed CFS has been evaluated on four well-known EHR data sets: Cervical Cancer Risk Factors (CCRFDS), Breast Cancer Coimbra (BCCDS), Breast Tissue (BTDS), and Diabetic Retinopathy Debrecen (DRDDS). The experimental results show that the proposed CFS achieved state-of-the-art performance on the above data sets against conventional PCA and other recent approaches. The source code of the proposed approach is publicly available at https://github.com/zhemingzuo/CFS.
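
The abstract names Menger Curvature as the quantity CFS uses to rank features but does not spell out the computation. As a minimal sketch of the underlying geometric definition only: the Menger curvature of three points is the reciprocal of the radius of the circle through them, which equals four times the triangle area divided by the product of the three side lengths. The function name `menger_curvature` and the choice to return zero for coincident points are assumptions made for illustration; how CFS builds point triplets from feature values and aggregates curvatures into feature weights is not described here and is not reproduced.

```python
import math


def menger_curvature(p, q, r):
    """Menger curvature of three 2-D points: 4 * area / (|pq| * |qr| * |rp|).

    Illustrative helper, not taken from the CFS source code.
    """
    (x1, y1), (x2, y2), (x3, y3) = p, q, r
    # Twice the (unsigned) triangle area, from the 2-D cross product.
    twice_area = abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
    a = math.dist(p, q)
    b = math.dist(q, r)
    c = math.dist(r, p)
    if a == 0 or b == 0 or c == 0:
        # Assumption: coincident points are treated as zero curvature.
        return 0.0
    # 4 * area == 2 * twice_area.
    return 2.0 * twice_area / (a * b * c)


# Three points on a circle of radius 2 have curvature 1 / 2.
curvature_on_circle = menger_curvature((2, 0), (0, 2), (-2, 0))
# Collinear points lie on a straight line, so the curvature is 0.
curvature_on_line = menger_curvature((0, 0), (1, 1), (2, 2))
```

Collinear triplets give zero curvature, while tightly bending triplets give large values, which is what makes the quantity usable as a per-triplet score.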

Results

Task | Dataset | Metric | Value | Model
Cancer | Breast Cancer Coimbra Data Set | Mean Accuracy | 79.17 | CFS-TSK+
Diabetic Retinopathy Detection | Diabetic Retinopathy Debrecen Data Set | Mean Accuracy | 74.72 | CFS-BPNN
Cervical cancer biopsy identification | Cervical Cancer (Risk Factors) Data Set | Mean Accuracy | 97.09 | CFS-TSK+
Breast Tissue Identification | Breast Tissue Data Set | Mean Accuracy | 100 | CFS-QDA
Breast Cancer Histology Image Classification | Breast Cancer Coimbra Data Set | Mean Accuracy | 79.17 | CFS-TSK+

Related Papers

mNARX+: A surrogate model for complex dynamical systems using manifold-NARX and automatic feature selection (2025-07-17)
Interpretable Bayesian Tensor Network Kernel Machines with Automatic Rank and Feature Selection (2025-07-15)
Lightweight Model for Poultry Disease Detection from Fecal Images Using Multi-Color Space Feature Optimization and Machine Learning (2025-07-14)
Hierarchical Interaction Summarization and Contrastive Prompting for Explainable Recommendations (2025-07-08)
From Motion to Meaning: Biomechanics-Informed Neural Network for Explainable Cardiovascular Disease Identification (2025-07-08)
Active Learning for Manifold Gaussian Process Regression (2025-06-26)
Empowering Digital Agriculture: A Privacy-Preserving Framework for Data Sharing and Collaborative Research (2025-06-25)
Distributed Lyapunov Functions for Nonlinear Networks (2025-06-25)