Description
D4PG, or Distributed Distributional DDPG, is a policy gradient algorithm that extends DDPG. Its improvements include a distributional critic update, combined with the use of multiple distributed workers all writing into the same replay table. Among the other, simpler changes, the largest performance gain came from the use of N-step returns. The authors found that prioritized experience replay was less crucial to the overall D4PG algorithm, especially on harder problems.
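As a rough illustration of the N-step return the description highlights, here is a minimal sketch of computing a bootstrapped N-step target. The function name and signature are illustrative only, not from the paper; in D4PG the bootstrap term would be the critic's (distributional) value estimate at the N-th successor state.

```python
def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """Compute the N-step return:
        G = r_0 + gamma * r_1 + ... + gamma^(N-1) * r_{N-1} + gamma^N * V(s_N)

    rewards         -- list of the N rewards observed along the trajectory
    bootstrap_value -- value estimate V(s_N) at the state N steps ahead
    gamma           -- discount factor
    """
    g = bootstrap_value
    # Accumulate backwards so each reward picks up the right power of gamma.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Example: three unit rewards, no bootstrap, gamma = 0.5
# G = 1 + 0.5 * 1 + 0.25 * 1 = 1.75
print(n_step_return([1.0, 1.0, 1.0], bootstrap_value=0.0, gamma=0.5))
```

With N = 1 this reduces to the standard one-step TD target used by vanilla DDPG; larger N trades off bias (from the bootstrapped value) against variance (from the longer reward sum).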
Papers Using This Method
- Learning in complex action spaces without policy gradients (2024-10-08)
- Mitigating Estimation Errors by Twin TD-Regularized Actor and Critic for Deep Reinforcement Learning (2023-11-07)
- SDGym: Low-Code Reinforcement Learning Environments using System Dynamics Models (2023-10-19)
- A Long $N$-step Surrogate Stage Reward for Deep Reinforcement Learning (2023-09-21)
- Gamma and Vega Hedging Using Deep Distributional Reinforcement Learning (2022-05-10)
- Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach (2022-04-21)
- Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking (2020-11-15)
- Distributed Uplink Beamforming in Cell-Free Networks Using Deep Reinforcement Learning (2020-06-26)
- Sample-based Distributional Policy Gradient (2020-01-08)
- TF-Replicator: Distributed Machine Learning for Researchers (2019-02-01)
- Distributed Distributional Deterministic Policy Gradients (2018-04-23)