Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Sparse Autoencoder

Computer Vision | Introduced 2000 | 57 papers

Description

A Sparse Autoencoder is a type of autoencoder that uses sparsity to create an information bottleneck. Specifically, the loss function is constructed so that activations within a layer are penalized. The sparsity constraint can be imposed with L1 regularization on the activations, or with a KL divergence between the expected average activation of each neuron and an ideal distribution p.
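The two penalties described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names, the default target p = 0.05, the penalty weight, and the assumption that activations lie in (0, 1) (for example, sigmoid outputs) for the KL variant are all choices made here for the example. The KL term treats p and each neuron's batch-average activation as Bernoulli parameters, which is one common reading of the description.

```python
import math


def l1_penalty(activations):
    """Batch mean of the summed absolute hidden activations.

    `activations` is a list of rows, one row of hidden activations per example.
    """
    return sum(sum(abs(a) for a in row) for row in activations) / len(activations)


def kl_sparsity_penalty(activations, p=0.05, eps=1e-8):
    """Sum over hidden units of KL(Bernoulli(p) || Bernoulli(p_hat_j)).

    p_hat_j is the batch-average activation of unit j. Activations are
    assumed to lie in (0, 1); p is the (hypothetical) target sparsity level.
    """
    n_examples = len(activations)
    n_units = len(activations[0])
    total = 0.0
    for j in range(n_units):
        p_hat = sum(row[j] for row in activations) / n_examples
        p_hat = min(max(p_hat, eps), 1.0 - eps)  # clamp so the logs stay finite
        total += p * math.log(p / p_hat)
        total += (1.0 - p) * math.log((1.0 - p) / (1.0 - p_hat))
    return total


def sparse_loss(reconstruction_error, activations, weight=1e-3, kind="l1", p=0.05):
    """Reconstruction error plus a weighted sparsity penalty ("l1" or "kl")."""
    if kind == "l1":
        penalty = l1_penalty(activations)
    elif kind == "kl":
        penalty = kl_sparsity_penalty(activations, p)
    else:
        raise ValueError("kind must be 'l1' or 'kl'")
    return reconstruction_error + weight * penalty
```

In this sketch the KL penalty is zero when every neuron's average activation equals p and grows as the averages drift away from it, while the L1 penalty pushes individual activations toward zero directly.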

Image credit: Jeff Jordan. His blog post gives a detailed summary of autoencoders.

Papers Using This Method

Bridging Compositional and Distributional Semantics: A Survey on Latent Semantic Geometry via AutoEncoder (2025-06-25)
CWGAN-GP Augmented CAE for Jamming Detection in 5G-NR in Non-IID Datasets (2025-06-18)
Resa: Transparent Reasoning Models via SAEs (2025-06-11)
Model Unlearning via Sparse Autoencoder Subspace Guided Projections (2025-05-30)
SAE-FiRE: Enhancing Earnings Surprise Predictions Through Sparse Autoencoder Feature Selection (2025-05-20)
Are Sparse Autoencoders Useful for Java Function Bug Detection? (2025-05-15)
Interpretable Risk Mitigation in LLM Agent Systems (2025-05-15)
Beyond Input Activations: Identifying Influential Latents by Gradient Sparse Autoencoders (2025-05-12)
Decoding Futures Price Dynamics: A Regularized Sparse Autoencoder for Interpretable Multi-Horizon Forecasting and Factor Discovery (2025-05-11)
Geospatial Mechanistic Interpretability of Large Language Models (2025-05-06)
FineScope : Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation (2025-05-01)
Towards Understanding the Nature of Attention with Low-Rank Sparse Decomposition (2025-04-29)
Prisma: An Open Source Toolkit for Mechanistic Interpretability in Vision and Video (2025-04-28)
A real-time anomaly detection method for robots based on a flexible and sparse latent space (2025-04-15)
Dissecting and Mitigating Diffusion Bias via Mechanistic Interpretability (2025-03-26)
Sparse Autoencoder as a Zero-Shot Classifier for Concept Erasing in Text-to-Image Diffusion Models (2025-03-12)
Route Sparse Autoencoder to Interpret Large Language Models (2025-03-11)
Self-Regularization with Latent Space Explanations for Controllable LLM-based Classification (2025-02-19)
LLM Pretraining with Continuous Concepts (2025-02-12)
Sparse Autoencoders for Hypothesis Generation (2025-02-05)