Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Soft Actor Critic

Reinforcement Learning · Introduced 2018 · 58 papers
Source Paper

Description

Soft Actor Critic, or SAC, is an off-policy actor-critic deep RL algorithm based on the maximum entropy reinforcement learning framework. In this framework, the actor aims to maximize expected reward while also maximizing entropy; that is, it tries to succeed at the task while acting as randomly as possible. Prior deep RL methods based on this framework were formulated as Q-learning methods. SAC combines off-policy updates with a stable stochastic actor-critic formulation.
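The maximum entropy objective described above can be written as in the source paper, where the temperature parameter $\alpha$ controls the trade-off between reward and entropy:

$$
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\!\left[\, r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]
$$

Setting $\alpha = 0$ recovers the conventional expected-reward RL objective.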

The SAC objective has a number of advantages. First, the policy is incentivized to explore more widely, while giving up on clearly unpromising avenues. Second, the policy can capture multiple modes of near-optimal behavior: in problem settings where multiple actions seem equally attractive, the policy will assign equal probability mass to those actions. Lastly, the authors present evidence that it improves learning speed over state-of-the-art methods that optimize the conventional RL objective function.
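The multi-modality claim above can be illustrated with a toy single-state example. The sketch below (a minimal illustration, not part of SAC's actual training loop; the function name is ours) evaluates the entropy-regularized objective for a discrete policy: when two actions have identical Q-values, the entropy bonus makes the uniform policy strictly better than a near-deterministic one.

```python
import math

def soft_policy_objective(pi, q, alpha):
    """Entropy-regularized objective for one state:
    J(pi) = E_{a~pi}[Q(s, a)] + alpha * H(pi)."""
    expected_q = sum(p * qa for p, qa in zip(pi, q))
    entropy = -sum(p * math.log(p) for p in pi if p > 0)
    return expected_q + alpha * entropy

# Two actions with equal Q-values: with alpha > 0, the uniform
# policy scores higher than a near-greedy one, so the optimal
# soft policy spreads probability mass across both modes.
q = [1.0, 1.0]
uniform = [0.5, 0.5]
greedy = [0.999, 0.001]
assert soft_policy_objective(uniform, q, alpha=0.1) > soft_policy_objective(greedy, q, alpha=0.1)
```

With `alpha = 0` the two policies score identically here, matching the conventional RL objective, which is indifferent between them.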

Papers Using This Method

Moderate Actor-Critic Methods: Controlling Overestimation Bias via Expectile Loss (2025-04-14)
Closing the Intent-to-Behavior Gap via Fulfillment Priority Logic (2025-03-04)
IL-SOAR : Imitation Learning with Soft Optimistic Actor cRitic (2025-02-27)
Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning (2025-01-29)
Reinforcement Learning Controlled Adaptive PSO for Task Offloading in IIoT Edge Computing (2025-01-25)
Average Reward Reinforcement Learning for Wireless Radio Resource Management (2025-01-12)
Learn 2 Rage: Experiencing The Emotional Roller Coaster That Is Reinforcement Learning (2024-10-24)
Augmented Lagrangian-Based Safe Reinforcement Learning Approach for Distribution System Volt/VAR Control (2024-10-19)
Solving The Dynamic Volatility Fitting Problem: A Deep Reinforcement Learning Approach (2024-10-15)
Deep Attention Driven Reinforcement Learning (DAD-RL) for Autonomous Decision-Making in Dynamic Environment (2024-07-12)
Real-time system optimal traffic routing under uncertainties -- Can physics models boost reinforcement learning? (2024-07-10)
Enhanced Safety in Autonomous Driving: Integrating Latent State Diffusion Model for End-to-End Navigation (2024-07-08)
A fast balance optimization approach for charging enhancement of lithium-ion battery packs through deep reinforcement learning (2024-04-24)
Imitation Game: A Model-based and Imitation Learning Deep Reinforcement Learning Hybrid (2024-04-02)
K-percent Evaluation for Lifelong RL (2024-04-02)
Deep Reinforcement Learning for Local Path Following of an Autonomous Formula SAE Vehicle (2024-01-05)
Dynamic Fairness-Aware Spectrum Auction for Enhanced Licensed Shared Access in 6G Networks (2023-12-20)
On Designing Multi-UAV aided Wireless Powered Dynamic Communication via Hierarchical Deep Reinforcement Learning (2023-12-13)
Joint Sensing and Communication Optimization in Target-Mounted STARS-Assisted Vehicular Networks: A MADRL Approach (2023-11-17)
Belief Projection-Based Reinforcement Learning for Environments with Delayed Feedback (2023-09-21)