Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Replacing Eligibility Trace

Reinforcement Learning · Introduced 2000

Description

In a Replacing Eligibility Trace, each time a state is revisited, its trace is reset to $1$ regardless of any prior trace. For the memory vector $\mathbf{e}_t \in \mathbb{R}^{b}$, with $\mathbf{e}_t \geq \mathbf{0}$ componentwise:

$$\mathbf{e}_0 = \mathbf{0}$$

$$e_t(s) = \gamma\lambda\, e_{t-1}(s) \quad \text{if } s \neq s_t$$

$$e_t(s) = 1 \quad \text{if } s = s_t$$
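The update rule can be sketched in a few lines of Python. This is a minimal illustration for the tabular case, assuming states are indexed by integers `0..n_states-1`; the function names `replacing_trace_update` and `run_traces` and the example trajectory are hypothetical and are not taken from the source.

```python
def replacing_trace_update(e, s_t, gamma, lam):
    """Return the next trace vector after visiting state index s_t."""
    # Decay every component by gamma * lambda (the s != s_t case).
    new_e = [gamma * lam * v for v in e]
    # Overwrite, rather than increment, the visited state's trace (the s == s_t case).
    new_e[s_t] = 1.0
    return new_e


def run_traces(states, n_states, gamma, lam):
    """Apply the update along a trajectory, starting from e_0 = 0."""
    e = [0.0] * n_states
    history = []
    for s_t in states:
        e = replacing_trace_update(e, s_t, gamma, lam)
        history.append(e)
    return history


# Hypothetical trajectory in which state 0 is revisited several times.
history = run_traces([0, 1, 0, 0], n_states=3, gamma=1.0, lam=0.5)
for step, e in enumerate(history, start=1):
    print(step, e)
```

Because the visited component is overwritten instead of incremented, no trace ever exceeds 1, however often a state is revisited.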

Replacing traces can be seen as a crude approximation to dutch traces, which have largely superseded them: dutch traces usually perform better than replacing traces and have a clearer theoretical basis. Accumulating traces remain of interest for nonlinear function approximation, where dutch traces are not available.
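To make the relationship between the three trace types concrete, the sketch below compares how each updates the trace of a state that is visited repeatedly. It assumes one-hot (tabular) features, under which the dutch trace update from Sutton and Barto, $\mathbf{z}_t = \gamma\lambda\mathbf{z}_{t-1} + (1 - \alpha\gamma\lambda\,\mathbf{z}_{t-1}^{\top}\mathbf{x}_t)\,\mathbf{x}_t$, reduces for the visited component to $(1-\alpha)\gamma\lambda z + 1$. The step size `alpha`, the function names, and the parameter values are illustrative assumptions.

```python
def accumulating(z, gl):
    # Accumulating trace: decay, then add 1 on each visit.
    return gl * z + 1.0


def replacing(z, gl):
    # Replacing trace: reset to 1 on each visit, ignoring the prior value.
    return 1.0


def dutch(z, gl, alpha):
    # Dutch trace for the visited component, assuming one-hot features.
    return (1.0 - alpha) * gl * z + 1.0


def visit_repeatedly(update, n_visits, *args):
    """Trace of a single state after n_visits consecutive visits, starting at 0."""
    z = 0.0
    for _ in range(n_visits):
        z = update(z, *args)
    return z


gl = 0.9      # gamma * lambda (illustrative value)
alpha = 0.5   # step size (illustrative value)
z_acc = visit_repeatedly(accumulating, 5, gl)
z_rep = visit_repeatedly(replacing, 5, gl)
z_dut = visit_repeatedly(dutch, 5, gl, alpha)
print(z_rep, z_dut, z_acc)
```

Under these assumptions the dutch trace lies between the replacing and accumulating traces, and it coincides with the replacing trace when `alpha` is 1.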

Source: Sutton and Barto, Reinforcement Learning, 2nd Edition