Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


N-step Returns

Reinforcement Learning · Introduced 2000 · 29 papers

Description

$n$-step returns are used for value function estimation in reinforcement learning. Specifically, for $n$ steps we can write the complete return as:

$$R_{t}^{(n)} = r_{t+1} + \gamma r_{t+2} + \cdots + \gamma^{n-1} r_{t+n} + \gamma^{n} V_{t}\left(s_{t+n}\right)$$

We can then write an $n$-step backup, in the style of TD learning, as:

$$\Delta V_{t}\left(s_{t}\right) = \alpha\left[R_{t}^{(n)} - V_{t}\left(s_{t}\right)\right]$$

Multi-step returns often lead to faster learning with a suitably tuned $n$, trading off the lower bias of longer reward sequences against the higher variance they introduce.
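The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's API: `rewards[t + k]` is assumed to hold the reward $r_{t+k+1}$, and `values` is a tabular value estimate indexed by time step.

```python
def n_step_return(rewards, values, t, n, gamma):
    """R_t^{(n)}: n discounted rewards plus a bootstrapped value estimate.

    Assumes rewards[t + k] stores r_{t+k+1} and values[t + n] stores
    the current estimate V_t(s_{t+n}).
    """
    discounted_rewards = sum(gamma ** k * rewards[t + k] for k in range(n))
    bootstrap = gamma ** n * values[t + n]
    return discounted_rewards + bootstrap


def n_step_td_update(values, rewards, t, n, gamma, alpha):
    """In-place n-step TD backup: V(s_t) += alpha * (R_t^{(n)} - V(s_t))."""
    target = n_step_return(rewards, values, t, n, gamma)
    values[t] += alpha * (target - values[t])
    return values[t]
```

For example, with `gamma = 0.5`, `n = 3`, unit rewards, and a bootstrap value of 10, the target is `1 + 0.5 + 0.25 + 0.125 * 10 = 3.0`; setting `n = 1` recovers the standard one-step TD(0) target.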

Credit: Sutton and Barto, Reinforcement Learning: An Introduction

Papers Using This Method

- Shapley Machine: A Game-Theoretic Framework for N-Agent Ad Hoc Teamwork (2025-06-12)
- Chunking the Critic: A Transformer-based Soft Actor-Critic with N-Step Returns (2025-03-05)
- Beyond The Rainbow: High Performance Deep Reinforcement Learning on a Desktop PC (2024-11-06)
- Learning in complex action spaces without policy gradients (2024-10-08)
- Mitigating Estimation Errors by Twin TD-Regularized Actor and Critic for Deep Reinforcement Learning (2023-11-07)
- SDGym: Low-Code Reinforcement Learning Environments using System Dynamics Models (2023-10-19)
- A Long $N$-step Surrogate Stage Reward for Deep Reinforcement Learning (2023-09-21)
- Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks (2022-09-16)
- DNA: Proximal Policy Optimization with a Dual Network Architecture (2022-06-20)
- Gamma and Vega Hedging Using Deep Distributional Reinforcement Learning (2022-05-10)
- Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach (2022-04-21)
- Deep Reinforcement Learning at the Edge of the Statistical Precipice (2021-08-30)
- A coevolutionary approach to deep multi-agent reinforcement learning (2021-04-12)
- Weighted Bellman Backups for Improved Signal-to-Noise in Q-Updates (2021-01-01)
- Adaptive N-step Bootstrapping with Off-policy Data (2021-01-01)
- Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking (2020-11-15)
- A New Approach for Tactical Decision Making in Lane Changing: Sample Efficient Deep Q Learning with a Safety Feedback Reward (2020-09-24)
- Munchausen Reinforcement Learning (2020-07-28)
- Revisiting Fundamentals of Experience Replay (2020-07-13)
- SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning (2020-07-09)