Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Experience Replay

Reinforcement Learning | Introduced 1993 | 865 papers

Description

Experience Replay is a replay memory technique used in reinforcement learning in which the agent's experience at each time step, $e_t = (s_t, a_t, r_t, s_{t+1})$, is stored in a dataset $D = \{e_1, \cdots, e_N\}$, pooled over many episodes into a replay memory. A minibatch of experiences is then usually sampled at random from the memory and used to learn off-policy, as in Deep Q-Networks. Because consecutive experiences are strongly correlated, training on them in order can be unstable; random sampling breaks this autocorrelation and makes the problem more like a supervised learning problem.
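The store-then-sample loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `Transition` and `ReplayMemory`, the fixed `capacity` with oldest-first eviction, and the toy integer states are all assumptions made for the example and do not come from the description.

```python
import random
from collections import deque, namedtuple

# One experience e_t = (s_t, a_t, r_t, s_{t+1}); the field names are illustrative.
Transition = namedtuple("Transition", ["state", "action", "reward", "next_state"])


class ReplayMemory:
    """Hypothetical fixed-size replay memory D with uniform random sampling."""

    def __init__(self, capacity):
        # Assumption: a bounded buffer that drops the oldest experience when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        # Store the experience observed at the current time step.
        self.buffer.append(Transition(state, action, reward, next_state))

    def sample(self, batch_size):
        # Draw a minibatch uniformly at random, without replacement.
        # Copying to a list keeps indexing cheap for large buffers.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


if __name__ == "__main__":
    memory = ReplayMemory(capacity=1000)
    # Toy transitions: integer states, a random binary action, constant reward.
    for t in range(50):
        memory.push(state=t, action=random.randint(0, 1), reward=1.0, next_state=t + 1)
    minibatch = memory.sample(batch_size=8)
    # The sampled time steps are generally not consecutive.
    print([e.state for e in minibatch])
```

In a Deep Q-Network style training loop, a minibatch like this would be passed to the off-policy update in place of the most recent transition. The sketch samples uniformly; other sampling schemes would replace the body of `sample`.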

Image Credit: Hands-On Reinforcement Learning with Python, Sudharsan Ravichandiran

Papers Using This Method

Turning Sand to Gold: Recycling Data to Bridge On-Policy and Off-Policy Learning via Causal Bound (2025-07-15)
Deep Reinforcement Learning with Gradient Eligibility Traces (2025-07-12)
Multi-Objective Reinforcement Learning for Cognitive Radar Resource Management (2025-06-25)
CAWR: Corruption-Averse Advantage-Weighted Regression for Robust Policy Optimization (2025-06-18)
Reliable Critics: Monotonic Improvement and Convergence Guarantees for Reinforcement Learning (2025-06-08)
Contextual Experience Replay for Self-Improvement of Language Agents (2025-06-07)
Gradual Transition from Bellman Optimality Operator to Bellman Operator in Online Reinforcement Learning (2025-06-06)
A Novel Deep Reinforcement Learning Method for Computation Offloading in Multi-User Mobile Edge Computing with Decentralization (2025-06-03)
FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control (2025-05-28)
Analyzing Mitigation Strategies for Catastrophic Forgetting in End-to-End Training of Spoken Language Models (2025-05-23)
LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven by Large Language Models (2025-05-21)
LifelongAgentBench: Evaluating LLM Agents as Lifelong Learners (2025-05-17)
Unveiling the Black Box: A Multi-Layer Framework for Explaining Reinforcement Learning-Based Cyber Agents (2025-05-16)
Electric Bus Charging Schedules Relying on Real Data-Driven Targets Based on Hierarchical Deep Reinforcement Learning (2025-05-15)
GradMix: Gradient-based Selective Mixup for Robust Data Augmentation in Class-Incremental Learning (2025-05-13)
Monte Carlo Beam Search for Actor-Critic Reinforcement Learning in Continuous Control (2025-05-13)
Deep reinforcement learning-based longitudinal control strategy for automated vehicles at signalised intersections (2025-05-13)
Online Learning-based Adaptive Beam Switching for 6G Networks: Enhancing Efficiency and Resilience (2025-05-12)
Energy Efficient RSMA-Based LEO Satellite Communications Assisted by UAV-Mounted BD-Active RIS: A DRL Approach (2025-05-07)
A Goal-Oriented Reinforcement Learning-Based Path Planning Algorithm for Modular Self-Reconfigurable Satellites (2025-05-04)