Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Count-Based Exploration with Neural Density Models

Georg Ostrovski, Marc G. Bellemare, Aaron van den Oord, Remi Munos

2017-03-03 · ICML 2017 · Reinforcement Learning · Atari Games · Montezuma's Revenge

Abstract

Bellemare et al. (2016) introduced the notion of a pseudo-count, derived from a density model, to generalize count-based exploration to non-tabular reinforcement learning. This pseudo-count was used to generate an exploration bonus for a DQN agent and combined with a mixed Monte Carlo update was sufficient to achieve state of the art on the Atari 2600 game Montezuma's Revenge. We consider two questions left open by their work: First, how important is the quality of the density model for exploration? Second, what role does the Monte Carlo update play in exploration? We answer the first question by demonstrating the use of PixelCNN, an advanced neural density model for images, to supply a pseudo-count. In particular, we examine the intrinsic difficulties in adapting Bellemare et al.'s approach when assumptions about the model are violated. The result is a more practical and general algorithm requiring no special apparatus. We combine PixelCNN pseudo-counts with different agent architectures to dramatically improve the state of the art on several hard Atari games. One surprising finding is that the mixed Monte Carlo update is a powerful facilitator of exploration in the sparsest of settings, including Montezuma's Revenge.
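The pseudo-count the abstract refers to can be computed from two probabilities the density model assigns to an observation: its probability before the model is updated on it, and its "recoding" probability after. A minimal sketch of that computation and the resulting exploration bonus is below; the function names and the bonus scale are illustrative assumptions, not taken from the paper's implementation.

```python
def pseudo_count(rho, rho_prime):
    """Pseudo-count N_hat of an observation, given the density model's
    probability rho before training on it and recoding probability
    rho_prime after. Setting rho = N/n and rho_prime = (N+1)/(n+1)
    and solving gives: N_hat = rho * (1 - rho_prime) / (rho_prime - rho)."""
    return rho * (1.0 - rho_prime) / (rho_prime - rho)

def exploration_bonus(rho, rho_prime, beta=0.01):
    """Intrinsic reward added to the environment reward; one common form
    is beta / sqrt(N_hat + 1), which shrinks as a state becomes familiar.
    beta is an assumed scale hyperparameter."""
    n_hat = pseudo_count(rho, rho_prime)
    return beta / ((n_hat + 1.0) ** 0.5)
```

For example, if the model's probability of a frame rises from 0.5 to 0.6 after one update, the frame behaves as if it had been seen about twice (`pseudo_count(0.5, 0.6)` is 2.0), and the bonus is correspondingly small; a novel frame whose probability jumps sharply yields a pseudo-count near zero and a large bonus.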

Results

Task         Dataset                            Metric   Value    Model
Atari Games  Atari 2600 Freeway                 Score    33       DQN-CTS
Atari Games  Atari 2600 Freeway                 Score    31.7     DQN-PixelCNN
Atari Games  Atari 2600 Montezuma's Revenge     Score    3705.5   DQN-PixelCNN
Atari Games  Atari 2600 Gravitar                Score    498.3    DQN-PixelCNN
Atari Games  Atari 2600 Gravitar                Score    238      DQN-CTS
Atari Games  Atari 2600 Venture                 Score    82.2     DQN-PixelCNN
Atari Games  Atari 2600 Venture                 Score    48       DQN-CTS
Atari Games  Atari 2600 Private Eye             Score    8358.7   DQN-PixelCNN
Atari Games  Atari 2600 Private Eye             Score    206      DQN-CTS

Related Papers

CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning (2025-07-18)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback (2025-07-17)
VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks (2025-07-17)
QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation (2025-07-17)
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)
Autonomous Resource Management in Microservice Systems via Reinforcement Learning (2025-07-17)