Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

DPG

Deterministic Policy Gradient

Reinforcement Learning | Introduced 2014 | 20 papers

Description

Deterministic Policy Gradient, or DPG, is a policy gradient method for reinforcement learning. Instead of modeling the policy function $\pi\left(\cdot \mid s\right)$ as a probability distribution over actions, DPG considers and calculates gradients for a deterministic policy $a = \mu_{\theta}\left(s\right)$.
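As a rough illustration of what "calculating gradients for a deterministic policy" can look like, the sketch below applies the chain rule $\nabla_{\theta} \mu_{\theta}(s) \, \nabla_{a} Q(s, a)$ evaluated at $a = \mu_{\theta}(s)$ and averages it over a batch of states. Everything concrete here is an assumption made for the example and does not come from this page: the linear policy, the hand-written quadratic critic `q_value`, the fixed batch of sampled states, and the step size. In practice the critic is usually learned and the states come from the agent's own experience.

```python
import numpy as np

def policy(theta, s):
    """Deterministic linear policy (assumed form): a = theta . s."""
    return float(theta @ s)

def q_value(s, a, w):
    """Toy critic (an assumption): Q(s, a) = -(a - w . s)^2."""
    return -(a - float(w @ s)) ** 2

def dq_da(s, a, w):
    """Analytic gradient of the toy critic with respect to the action."""
    return -2.0 * (a - float(w @ s))

def dpg_gradient(theta, states, w):
    """Batch average of grad_theta mu(s) * dQ/da evaluated at a = mu(s)."""
    grad = np.zeros_like(theta)
    for s in states:
        a = policy(theta, s)
        # For a linear policy, grad_theta mu_theta(s) is simply s.
        grad += s * dq_da(s, a, w)
    return grad / len(states)

def objective(theta, states, w):
    """Average critic value of the policy's actions over the batch."""
    return np.mean([q_value(s, policy(theta, s), w) for s in states])

rng = np.random.default_rng(0)
states = rng.normal(size=(256, 3))   # hypothetical fixed batch of states
w = np.array([1.0, -2.0, 0.5])       # hypothetical optimum of the toy critic
theta = np.zeros(3)

# Plain gradient ascent on the policy parameters.
for _ in range(200):
    theta = theta + 0.1 * dpg_gradient(theta, states, w)
```

Because the toy critic is maximized when the action equals `w . s`, gradient ascent should drive `theta` toward `w`. No action sampling or log-probability term appears anywhere, which is the practical difference from a stochastic policy gradient.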

Papers Using This Method

DPG loss functions for learning parameter-to-solution maps by neural networks (2025-06-23)
ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL (2025-05-30)
Harnessing Caption Detailness for Data-Efficient Text-to-Image Generation (2025-05-21)
DIP-Watermark: A Double Identity Protection Method Based on Robust Adversarial Watermark (2024-04-23)
Dynamic Generation of Personalities with Large Language Models (2024-04-10)
Decision Predicate Graphs: Enhancing Interpretability in Tree Ensembles (2024-04-03)
BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback (2024-02-04)
Dual-based Online Learning of Dynamic Network Topologies (2022-11-14)
Dynamic Sparse R-CNN (2022-05-04)
Controlling Conditional Language Models without Catastrophic Forgetting (2021-12-01)
MPC-based Reinforcement Learning for a Simplified Freight Mission of Autonomous Surface Vehicles (2021-06-16)
Deep Reinforcement Agent for Scheduling in HPC (2021-02-11)
A review of motion planning algorithms for intelligent robotics (2021-02-04)
OffCon$^3$: What is state of the art anyway? (2021-01-27)
Zeroth-order Deterministic Policy Gradient (2020-06-12)
Investigation on the generalization of the Sampled Policy Gradient algorithm (2019-10-09)
Sampled Policy Gradient for Learning to Play the Game Agar.io (2018-09-15)
Directed Policy Gradient for Safe Reinforcement Learning with Human Advice (2018-08-13)
Deterministic Policy Gradients With General State Transitions (2018-07-10)
Expected Policy Gradients (2017-06-15)