Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Exploration by Random Network Distillation

Yuri Burda, Harrison Edwards, Amos Storkey, Oleg Klimov

Published: 2018-10-30
Venue: ICLR 2019
Tasks: Unsupervised Reinforcement Learning, Reinforcement Learning, Atari Games, Montezuma's Revenge
Tags: reinforcement-learning

Abstract

We introduce an exploration bonus for deep reinforcement learning methods that is easy to implement and adds minimal overhead to the computation performed. The bonus is the error of a neural network predicting features of the observations given by a fixed randomly initialized neural network. We also introduce a method to flexibly combine intrinsic and extrinsic rewards. We find that the random network distillation (RND) bonus combined with this increased flexibility enables significant progress on several hard exploration Atari games. In particular we establish state of the art performance on Montezuma's Revenge, a game famously difficult for deep reinforcement learning methods. To the best of our knowledge, this is the first method that achieves better than average human performance on this game without using demonstrations or having access to the underlying state of the game, and occasionally completes the first level.
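
The bonus described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the class name `RNDBonus`, the layer sizes, the linear predictor, the learning rate, and the synthetic vector observations are all assumptions chosen for brevity, and a real agent would use deeper networks on image observations. The sketch only shows the core idea: a frozen random target network, a predictor trained to match it, and the prediction error used as the intrinsic reward.

```python
import numpy as np


class RNDBonus:
    """Toy random network distillation bonus on vector observations."""

    def __init__(self, obs_dim, feat_dim=16, hidden=32, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        # Target network: randomly initialised once, then never trained.
        self.t1 = rng.normal(size=(obs_dim, hidden)) / np.sqrt(obs_dim)
        self.t2 = rng.normal(size=(hidden, feat_dim)) / np.sqrt(hidden)
        # Predictor: a trainable linear map (an assumption made for brevity).
        self.w = np.zeros((obs_dim, feat_dim))
        self.b = np.zeros(feat_dim)
        self.lr = lr

    def _target(self, obs):
        # Features of the observations given by the fixed random network.
        return np.tanh(obs @ self.t1) @ self.t2

    def _error(self, obs):
        # Predictor output minus target features, shape (batch, feat_dim).
        return obs @ self.w + self.b - self._target(obs)

    def bonus(self, obs):
        """Per-observation intrinsic reward: mean squared prediction error."""
        return np.mean(self._error(obs) ** 2, axis=1)

    def update(self, obs):
        """One gradient step on half the batch-mean squared error."""
        err = self._error(obs)
        self.w -= self.lr * obs.T @ err / len(obs)
        self.b -= self.lr * err.mean(axis=0)


rng = np.random.default_rng(1)
familiar = rng.normal(size=(4, 8))   # observations the agent keeps revisiting
novel = rng.normal(size=(100, 8))    # observations the predictor never trained on

rnd = RNDBonus(obs_dim=8)
bonus_before = rnd.bonus(familiar).mean()
for _ in range(2000):
    rnd.update(familiar)
bonus_familiar = rnd.bonus(familiar).mean()
bonus_novel = rnd.bonus(novel).mean()
print(bonus_before, bonus_familiar, bonus_novel)
```

After training on the revisited observations, their bonus falls well below the bonus on unseen observations. That gap is what steers an agent toward states it has not visited.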

Results

Task         Dataset                         Metric  Value  Model
Atari Games  Atari 2600 Montezuma's Revenge  Score   8152   RND
Atari Games  Atari 2600 Gravitar             Score   3906   RND
Atari Games  Atari 2600 Pitfall!             Score   -3     RND
Atari Games  Atari 2600 Solaris              Score   3282   RND
Atari Games  Atari 2600 Venture              Score   1859   RND
Atari Games  Atari 2600 Private Eye          Score   8666   RND
Video Games  Atari 2600 Montezuma's Revenge  Score   8152   RND
Video Games  Atari 2600 Gravitar             Score   3906   RND
Video Games  Atari 2600 Pitfall!             Score   -3     RND
Video Games  Atari 2600 Solaris              Score   3282   RND
Video Games  Atari 2600 Venture              Score   1859   RND
Video Games  Atari 2600 Private Eye          Score   8666   RND

Related Papers

CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning (2025-07-18)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback (2025-07-17)
VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks (2025-07-17)
QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation (2025-07-17)
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)
Autonomous Resource Management in Microservice Systems via Reinforcement Learning (2025-07-17)