Papers With Code 2

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Disentangled Attribution Curves for Interpreting Random Forests and Boosted Trees

Summer Devlin, Chandan Singh, W. James Murdoch, Bin Yu

2019-05-18 | Feature Engineering, Interpretable Machine Learning, Feature Importance

Abstract

Tree ensembles, such as random forests and AdaBoost, are ubiquitous machine learning models known for achieving strong predictive performance across a wide variety of domains. However, this strong performance comes at the cost of interpretability (i.e., users are unable to understand the relationships a trained random forest has learned and why it is making its predictions). In particular, it is challenging to understand how the contribution of a particular feature, or group of features, varies as its value changes. To address this, we introduce Disentangled Attribution Curves (DAC), a method that provides interpretations of tree ensemble methods in the form of (multivariate) feature importance curves. For a given variable, or group of variables, DAC plots the importance of the variable(s) as their values change. We validate DAC on real data by showing that including the curves as an additional feature increases the accuracy of logistic regression while maintaining interpretability. In simulation studies, DAC is shown to outperform competing methods in the recovery of conditional expectations. Finally, through a case study on the bike-sharing dataset, we demonstrate the use of DAC to uncover novel insights into a dataset.
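
The simulation claim in the abstract concerns recovering conditional expectations. As a rough illustration of that target quantity only (this is not the DAC algorithm, which is not specified on this page), the sketch below estimates E[y | x in bin] from raw data with a simple binned average. The function name `binned_conditional_mean`, the bin edges, and the toy data are all assumptions made for the example.

```python
from bisect import bisect_right
from statistics import mean


def binned_conditional_mean(x, y, edges):
    """Estimate E[y | x in bin] for bins defined by the sorted list `edges`.

    Returns one value per bin (len(edges) - 1 values); empty bins yield None.
    Hypothetical helper for illustration; not part of any DAC implementation.
    """
    n_bins = len(edges) - 1
    buckets = [[] for _ in range(n_bins)]
    for xi, yi in zip(x, y):
        if xi < edges[0] or xi > edges[-1]:
            continue  # ignore points outside the grid
        # bisect_right returns bin index + 1; clamp so xi == edges[-1]
        # lands in the last bin instead of running off the end.
        idx = min(bisect_right(edges, xi) - 1, n_bins - 1)
        buckets[idx].append(yi)
    return [mean(b) if b else None for b in buckets]


# Toy data (made up): the response steps from 1.0 to 3.0 at x = 0.5.
x = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
y = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0]
curve = binned_conditional_mean(x, y, edges=[0.0, 0.5, 1.0])
print(curve)  # [1.0, 3.0]
```

A curve of this kind is the sort of reference that a feature importance curve could be compared against in a simulation; how the paper actually constructs and scores its curves is described in the paper itself, not here.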
