Stochastic Dueling Network

Reinforcement Learning · Introduced 2000 · 12 papers

Description

A Stochastic Dueling Network, or SDN, is an architecture for learning a value function V. The SDN learns both V and Q off-policy while maintaining consistency between the two estimates. At each time step it outputs a stochastic estimate of Q and a deterministic estimate of V.
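The relationship between the two outputs can be illustrated with a minimal sketch. It assumes one commonly cited formulation, in which the stochastic estimate is Q~(x, a) = V(x) + A(x, a) - (1/n) * sum_i A(x, u_i), with the actions u_i sampled from the current policy. The description above does not spell out this formula, so treat it as an assumption. The toy value, advantage, and policy functions are hypothetical stand-ins for network heads, not part of any library.

```python
import random


def sdn_estimates(value_fn, advantage_fn, sample_action, state, action,
                  n_samples=5, rng=None):
    """Return (q_tilde, v) for one state-action pair.

    Assumed form: q_tilde = V(x) + A(x, a) - mean_i A(x, u_i), u_i ~ policy.
    """
    if rng is None:
        rng = random.Random()
    v = value_fn(state)  # deterministic estimate of V
    # Sample actions from the policy to build a Monte Carlo advantage baseline.
    sampled = [sample_action(state, rng) for _ in range(n_samples)]
    baseline = sum(advantage_fn(state, u) for u in sampled) / n_samples
    # Stochastic estimate of Q: it varies with the sampled actions.
    q_tilde = v + advantage_fn(state, action) - baseline
    return q_tilde, v


# Hypothetical toy "heads" standing in for learned network outputs.
def toy_value(state):
    return 2.0 * state


def toy_advantage(state, action):
    return -(action - state) ** 2


def toy_policy(state, rng):
    # Gaussian policy centred on the state (illustrative assumption).
    return rng.gauss(state, 1.0)


rng = random.Random(0)
q1, v1 = sdn_estimates(toy_value, toy_advantage, toy_policy, 0.5, 0.7, rng=rng)
q2, v2 = sdn_estimates(toy_value, toy_advantage, toy_policy, 0.5, 0.7, rng=rng)
print(v1, v2)  # same V on both calls
print(q1, q2)  # Q estimates differ because the sampled actions differ
```

Subtracting the sampled advantage mean is what ties the two estimates together in this sketch: averaged over actions drawn from the policy, the advantage terms cancel in expectation, leaving V. If the advantage head is constant in the action, the Q estimate collapses to V exactly.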
