Description
Network Dissection is an interpretability method for CNNs that evaluates the alignment between individual hidden units and a set of visual semantic concepts. By identifying the best alignments, units are given human-interpretable labels spanning objects, parts, scenes, textures, materials, and colors.
The measurement of interpretability proceeds in three steps:
- Identify a broad set of human-labeled visual concepts.
- Gather the response of the hidden variables to known concepts.
- Quantify alignment of hidden variable−concept pairs.
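The alignment step can be sketched as follows. In the original method, a unit's activation maps are upsampled to the annotation resolution, binarized at a top-quantile threshold, and scored against each concept's segmentation masks by intersection-over-union (IoU); a unit receives a concept label only if its best IoU clears a cutoff. The function name, array shapes, and the exact quantile and cutoff values below are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def dissect_unit(activations, concept_masks, quantile=0.995, iou_threshold=0.04):
    """Assign a concept label to one hidden unit via IoU alignment (sketch).

    activations: (N, H, W) array of the unit's activation maps, assumed
        already upsampled to the resolution of the concept annotations.
    concept_masks: dict mapping concept name -> (N, H, W) boolean masks.
    """
    # Binarize the unit: keep activations above its top-quantile threshold.
    threshold = np.quantile(activations, quantile)
    unit_mask = activations > threshold

    best_concept, best_iou = None, 0.0
    for concept, masks in concept_masks.items():
        # IoU between the unit's binary mask and the concept's segmentation.
        intersection = np.logical_and(unit_mask, masks).sum()
        union = np.logical_or(unit_mask, masks).sum()
        iou = intersection / union if union > 0 else 0.0
        if iou > best_iou:
            best_concept, best_iou = concept, iou

    # A unit counts as interpretable only if its best IoU clears the cutoff.
    if best_iou > iou_threshold:
        return best_concept, best_iou
    return None, best_iou
```

Running this over every unit in a layer, against a broadly labeled concept dataset, yields the per-unit labels and an interpretability count for the layer.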
Papers Using This Method
- Labeling Neural Representations with Inverse Recognition (2023-11-22)
- DISCOVER: Making Vision Networks Interpretable via Competition and Dissection (2023-09-21)
- On the Impact of Knowledge Distillation for Model Interpretability (2023-05-25)
- Detection Accuracy for Evaluating Compositional Explanations of Units (2021-09-16)
- Interpreting Face Inference Models using Hierarchical Network Dissection (2021-08-23)
- Gated Convolutional Networks with Hybrid Connectivity for Image Classification (2019-08-26)
- Interpreting Adversarial Examples by Activation Promotion and Suppression (2019-04-03)
- On the Units of GANs (Extended Abstract) (2019-01-29)
- GAN Dissection: Visualizing and Understanding Generative Adversarial Networks (2018-11-26)
- How convolutional neural network see the world - A survey of convolutional neural network visualization methods (2018-04-30)
- Interpreting Deep Visual Representations via Network Dissection (2017-11-15)