Exploration by Random Network Distillation

Yuri Burda, Harrison Edwards, Amos Storkey, Oleg Klimov

2018-10-30 · ICLR 2019
Unsupervised Reinforcement Learning · Reinforcement Learning · Atari Games · Montezuma's Revenge · reinforcement-learning

Abstract

We introduce an exploration bonus for deep reinforcement learning methods that is easy to implement and adds minimal overhead to the computation performed. The bonus is the error of a neural network predicting features of the observations given by a fixed randomly initialized neural network. We also introduce a method to flexibly combine intrinsic and extrinsic rewards. We find that the random network distillation (RND) bonus combined with this increased flexibility enables significant progress on several hard exploration Atari games. In particular, we establish state-of-the-art performance on Montezuma's Revenge, a game famously difficult for deep reinforcement learning methods. To the best of our knowledge, this is the first method that achieves better than average human performance on this game without using demonstrations or having access to the underlying state of the game, and occasionally completes the first level.
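
The bonus described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the NumPy MLPs, the layer sizes, the tanh activation, the learning rate, and the names `init_mlp`, `intrinsic_bonus`, and `train_predictor` are all assumptions chosen for illustration. It shows only the core idea: a fixed random target network, a predictor trained to match it, and the prediction error used as the bonus.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not taken from the paper).
OBS_DIM, HIDDEN, FEAT_DIM = 8, 32, 16


def init_mlp(rng):
    """Randomly initialise a one-hidden-layer MLP."""
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(OBS_DIM), (OBS_DIM, HIDDEN)),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(HIDDEN), (HIDDEN, FEAT_DIM)),
    }


def forward(params, obs):
    """Return hidden activations and output features for a batch of observations."""
    hidden = np.tanh(obs @ params["W1"])
    return hidden, hidden @ params["W2"]


target = init_mlp(rng)     # fixed random network: never updated
predictor = init_mlp(rng)  # trained to reproduce the target's features


def intrinsic_bonus(obs):
    """Per-observation bonus: mean squared error between predictor and target features."""
    _, target_feat = forward(target, obs)
    _, pred_feat = forward(predictor, obs)
    return np.mean((pred_feat - target_feat) ** 2, axis=-1)


def train_predictor(obs, lr=0.2):
    """One gradient-descent step on the mean squared error (updates the predictor only)."""
    hidden, pred_feat = forward(predictor, obs)
    _, target_feat = forward(target, obs)
    grad_out = 2.0 * (pred_feat - target_feat) / pred_feat.size
    grad_w2 = hidden.T @ grad_out
    grad_hidden = grad_out @ predictor["W2"].T
    grad_w1 = obs.T @ (grad_hidden * (1.0 - hidden ** 2))
    predictor["W2"] -= lr * grad_w2
    predictor["W1"] -= lr * grad_w1


familiar = rng.normal(size=(16, OBS_DIM))     # observations the agent keeps revisiting
novel = rng.normal(size=(16, OBS_DIM)) + 3.0  # observations the predictor never trains on

bonus_before = intrinsic_bonus(familiar).mean()
for _ in range(2000):
    train_predictor(familiar)
bonus_after = intrinsic_bonus(familiar).mean()
bonus_novel = intrinsic_bonus(novel).mean()

print(f"familiar, before training: {bonus_before:.4f}")
print(f"familiar, after training:  {bonus_after:.4f}")
print(f"novel, after training:     {bonus_novel:.4f}")
```

In this toy run the bonus on the repeatedly visited observations should shrink as the predictor fits them, while the shifted, unseen observations should keep a larger bonus. That gap is what makes the prediction error usable as a novelty signal.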

Results

Task	Dataset	Metric	Value	Model
Atari Games	Atari 2600 Montezuma's Revenge	Score	8152	RND
Atari Games	Atari 2600 Gravitar	Score	3906	RND
Atari Games	Atari 2600 Pitfall!	Score	-3	RND
Atari Games	Atari 2600 Solaris	Score	3282	RND
Atari Games	Atari 2600 Venture	Score	1859	RND
Atari Games	Atari 2600 Private Eye	Score	8666	RND
Video Games	Atari 2600 Montezuma's Revenge	Score	8152	RND
Video Games	Atari 2600 Gravitar	Score	3906	RND
Video Games	Atari 2600 Pitfall!	Score	-3	RND
Video Games	Atari 2600 Solaris	Score	3282	RND
Video Games	Atari 2600 Venture	Score	1859	RND
Video Games	Atari 2600 Private Eye	Score	8666	RND
