Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Dueling Network

Reinforcement Learning · Introduced 2015 · 23 papers
Source Paper: Dueling Network Architectures for Deep Reinforcement Learning (Wang et al., 2015)

Description

A Dueling Network is a type of Q-Network with two streams that separately estimate the (scalar) state-value and the advantage of each action. Both streams share a common convolutional feature-learning module and are combined by a special aggregating layer to produce an estimate of the state-action value function Q.

The last module uses the following mapping:

$$Q(s, a; \theta, \alpha, \beta) = V(s; \theta, \beta) + \left(A(s, a; \theta, \alpha) - \frac{1}{|\mathcal{A}|}\sum_{a'} A(s, a'; \theta, \alpha)\right)$$

This formulation is chosen for identifiability: subtracting the mean forces the advantage estimates to have zero mean across actions, so V and A can be recovered separately from Q. An alternative formulation subtracts the maximum advantage, which forces zero advantage at the chosen action, but using the average operator instead increases the stability of the optimization.
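As a minimal NumPy sketch of the aggregating layer (the function name `dueling_aggregate` and the toy numbers are illustrative assumptions, not from the source paper):

```python
import numpy as np

def dueling_aggregate(value, advantages):
    # Combine the scalar state-value V(s) with the per-action advantages
    # A(s, a) by subtracting the mean advantage, per the equation above.
    return value + (advantages - advantages.mean())

# Toy example with |A| = 4 actions:
V = 1.5                              # output of the state-value stream
A = np.array([0.0, 1.0, -1.0, 2.0])  # output of the advantage stream (mean 0.5)
Q = dueling_aggregate(V, A)
print(Q)         # [1. 2. 0. 3.]
print(Q.mean())  # 1.5 -- equals V, illustrating identifiability
```

Because the mean-subtracted advantages sum to zero, the mean of the resulting Q-values equals the state-value V, which is what makes the two streams identifiable from Q alone.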

Papers Using This Method

- Beyond The Rainbow: High Performance Deep Reinforcement Learning on a Desktop PC (2024-11-06)
- Active search and coverage using point-cloud reinforcement learning (2023-12-18)
- Deep Reinforcement Learning for Artificial Upwelling Energy Management (2023-08-20)
- Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks (2022-09-16)
- DNA: Proximal Policy Optimization with a Dual Network Architecture (2022-06-20)
- Deep Reinforcement Learning at the Edge of the Statistical Precipice (2021-08-30)
- A coevolutionary approach to deep multi-agent reinforcement learning (2021-04-12)
- Weighted Bellman Backups for Improved Signal-to-Noise in Q-Updates (2021-01-01)
- A State Representation Dueling Network for Deep Reinforcement Learning (2020-12-24)
- Resolving Implicit Coordination in Multi-Agent Deep Reinforcement Learning with Deep Q-Networks & Game Theory (2020-12-08)
- A New Approach for Tactical Decision Making in Lane Changing: Sample Efficient Deep Q Learning with a Safety Feedback Reward (2020-09-24)
- QPLEX: Duplex Dueling Multi-Agent Q-Learning (2020-08-03)
- SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning (2020-07-09)
- Balancing a CartPole System with Reinforcement Learning -- A Tutorial (2020-06-08)
- To Combine or Not To Combine? A Rainbow Deep Reinforcement Learning Agent for Dialog Policies (2019-09-01)
- Generative Adversarial Imagination for Sample Efficient Deep Reinforcement Learning (2019-04-30)
- Macro action selection with deep reinforcement learning in StarCraft (2018-12-02)
- Distributed Prioritized Experience Replay (2018-03-02)
- Rainbow: Combining Improvements in Deep Reinforcement Learning (2017-10-06)
- Noisy Networks for Exploration (2017-06-30)