Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


A2C

Reinforcement Learning | Introduced 2000 | 82 papers

Description

A2C, or Advantage Actor Critic, is a synchronous version of the A3C policy gradient method. Where A3C lets each actor update the shared parameters asynchronously, A2C is a synchronous, deterministic implementation: it waits for every actor to finish its segment of experience, then performs a single update averaged over all of the actors. Because that update is computed on one larger batch, it uses GPUs more effectively.

Image Credit: OpenAI Baselines
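
The synchronous update in the description can be sketched in a few lines of NumPy. This is a minimal illustration, not the OpenAI Baselines code: the function names (`discounted_returns`, `a2c_batch`, `a2c_loss`), the segment dictionary layout, and the `value_coef` weight are assumptions made for the example, and the entropy bonus commonly added in practice is omitted. It shows how segments from all actors are collected first, turned into returns and advantages, and then combined into one loss averaged over the whole batch.

```python
import numpy as np


def discounted_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns for one actor's segment.

    bootstrap_value is the critic's estimate for the state after the
    segment (use 0.0 if the episode ended there).
    """
    returns = np.zeros(len(rewards))
    running = bootstrap_value
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def a2c_batch(segments, gamma=0.99):
    """Combine finished segments from every actor into one batch.

    Each segment is a dict (hypothetical layout) with keys
    'rewards', 'values', and 'bootstrap_value'.
    """
    all_returns, all_advantages = [], []
    for seg in segments:  # synchronous: every actor has finished its segment
        ret = discounted_returns(seg["rewards"], seg["bootstrap_value"], gamma)
        all_returns.append(ret)
        # Advantage = return minus the critic's value estimate.
        all_advantages.append(ret - np.asarray(seg["values"], dtype=float))
    return np.concatenate(all_returns), np.concatenate(all_advantages)


def a2c_loss(log_probs, advantages, value_coef=0.5):
    """Scalar loss averaged over the batch from all actors.

    In a real implementation the advantages in the policy term are
    treated as constants (no gradient flows through them).
    """
    policy_loss = -np.mean(np.asarray(log_probs) * advantages)
    value_loss = np.mean(advantages ** 2)  # (return - value)^2
    return policy_loss + value_coef * value_loss


# Two actors, each contributing a two-step segment.
segments = [
    {"rewards": [1.0, 1.0], "values": [0.5, 0.5], "bootstrap_value": 0.0},
    {"rewards": [0.0, 2.0], "values": [1.0, 1.0], "bootstrap_value": 2.0},
]
returns, advantages = a2c_batch(segments, gamma=0.5)
loss = a2c_loss(np.full(4, -1.0), advantages)
```

In a full training loop the loss would be differentiated with an autodiff library and applied as a single gradient step, after which all actors resume with the updated parameters.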

Papers Using This Method

Monte Carlo Beam Search for Actor-Critic Reinforcement Learning in Continuous Control (2025-05-13)
Ensemble RL through Classifier Models: Enhancing Risk-Return Trade-offs in Trading Strategies (2025-02-23)
Continuous Learning Conversational AI: A Personalized Agent Framework via A2C Reinforcement Learning (2025-02-18)
EvoRL: A GPU-accelerated Framework for Evolutionary Reinforcement Learning (2025-01-25)
Innate-Values-driven Reinforcement Learning based Cognitive Modeling (2024-11-14)
Deep Reinforcement Learning Strategies in Finance: Insights into Asset Holding, Trading Behavior, and Purchase Diversity (2024-06-29)
Multistep Criticality Search and Power Shaping in Microreactors with Reinforcement Learning (2024-06-22)
Biological Neurons Compete with Deep Reinforcement Learning in Sample Efficiency in a Simulated Gameworld (2024-05-27)
Symmetric Reinforcement Learning Loss for Robust Learning on Diverse Tasks and Model Scales (2024-05-27)
Extracting Heuristics from Large Language Models for Reward Shaping in Reinforcement Learning (2024-05-24)
Portfolio Management using Deep Reinforcement Learning (2024-05-01)
Breaching the Bottleneck: Evolutionary Transition from Reward-Driven Learning to Reward-Agnostic Domain-Adapted Learning in Neuromodulated Neural Nets (2024-04-19)
A2C: A Modular Multi-stage Collaborative Decision Framework for Human-AI Teams (2024-01-25)
Epidemic Decision-making System Based Federated Reinforcement Learning (2023-11-03)
Diagnosis-oriented Medical Image Compression with Efficient Transfer Learning (2023-10-20)
Deep Reinforcement Learning-based Intelligent Traffic Signal Controls with Optimized CO2 emissions (2023-10-19)
Solving the Quadratic Assignment Problem using Deep Reinforcement Learning (2023-10-02)
Raijū: Reinforcement Learning-Guided Post-Exploitation for Automating Security Assessment of Network Systems (2023-09-27)
SAF-Net: Self-Attention Fusion Network for Myocardial Infarction Detection using Multi-View Echocardiography (2023-09-27)
Career Path Recommendations for Long-term Income Maximization: A Reinforcement Learning Approach (2023-09-11)