
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

DPG

Deterministic Policy Gradient

Reinforcement Learning · Introduced 2014 · 20 papers

Description

Deterministic Policy Gradient, or DPG, is a policy gradient method for reinforcement learning. Instead of modeling the policy as a probability distribution over actions, $\pi\left(\cdot \mid s\right)$, DPG uses a deterministic policy $a = \mu_{\theta}\left(s\right)$ and computes the gradient of the expected return with respect to the policy parameters $\theta$.
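To make the idea concrete, the deterministic policy gradient is commonly written as $\nabla_{\theta} J = \mathbb{E}_{s}\left[\nabla_{\theta}\mu_{\theta}(s)\,\nabla_{a}Q(s,a)\big|_{a=\mu_{\theta}(s)}\right]$: the critic's gradient with respect to the action is chained through the policy. The sketch below is only an illustration under strong assumptions, not a reference implementation. It assumes a scalar linear policy $\mu_{\theta}(s) = \theta s$ and a hypothetical, hand-written critic $Q(s,a) = -(a - 2s)^2$ in place of a learned one. The function names (`mu`, `dq_da`, `dpg_gradient`, `train`) are made up for this example.

```python
import numpy as np

def mu(theta, s):
    """Deterministic linear policy (assumed for illustration): a = theta * s."""
    return theta * s

def dq_da(s, a):
    """Action-gradient of the hypothetical critic Q(s, a) = -(a - 2*s)**2."""
    return -2.0 * (a - 2.0 * s)

def dpg_gradient(theta, states):
    """Sample estimate of E_s[ dmu/dtheta * dQ/da evaluated at a = mu(s) ]."""
    actions = mu(theta, states)
    dmu_dtheta = states  # derivative of theta * s with respect to theta
    return np.mean(dmu_dtheta * dq_da(states, actions))

def train(theta=0.0, lr=0.1, steps=200, seed=0):
    """Gradient ascent on theta using the deterministic policy gradient."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        # States drawn uniformly here; a real agent would collect them
        # by interacting with an environment.
        states = rng.uniform(-1.0, 1.0, size=64)
        theta += lr * dpg_gradient(theta, states)
    return theta

theta_star = train()
```

Under this toy critic the best action is $a = 2s$, so gradient ascent should drive $\theta$ toward 2. In practice the critic is itself learned (for example with temporal-difference updates), and both the policy and the critic are usually neural networks.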

Papers Using This Method

- DPG loss functions for learning parameter-to-solution maps by neural networks (2025-06-23)
- ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL (2025-05-30)
- Harnessing Caption Detailness for Data-Efficient Text-to-Image Generation (2025-05-21)
- DIP-Watermark: A Double Identity Protection Method Based on Robust Adversarial Watermark (2024-04-23)
- Dynamic Generation of Personalities with Large Language Models (2024-04-10)
- Decision Predicate Graphs: Enhancing Interpretability in Tree Ensembles (2024-04-03)
- BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback (2024-02-04)
- Dual-based Online Learning of Dynamic Network Topologies (2022-11-14)
- Dynamic Sparse R-CNN (2022-05-04)
- Controlling Conditional Language Models without Catastrophic Forgetting (2021-12-01)
- MPC-based Reinforcement Learning for a Simplified Freight Mission of Autonomous Surface Vehicles (2021-06-16)
- Deep Reinforcement Agent for Scheduling in HPC (2021-02-11)
- A review of motion planning algorithms for intelligent robotics (2021-02-04)
- OffCon$^3$: What is state of the art anyway? (2021-01-27)
- Zeroth-order Deterministic Policy Gradient (2020-06-12)
- Investigation on the generalization of the Sampled Policy Gradient algorithm (2019-10-09)
- Sampled Policy Gradient for Learning to Play the Game Agar.io (2018-09-15)
- Directed Policy Gradient for Safe Reinforcement Learning with Human Advice (2018-08-13)
- Deterministic Policy Gradients With General State Transitions (2018-07-10)
- Expected Policy Gradients (2017-06-15)