Atari Games on Atari 2600 Bank Heist

Metric: Score (higher is better)

Results

#	Model	Score	SOTA	Extra Data	Paper	Date	Code
1	MuZero (Res2 Adam)	27219.8	Yes	No	Online and Offline Reinforcement Learning by Planning with a Learned Model	2021-04-13	Code
2	R2D2	24235.9	No	No	-	-	Code
3	Agent57	23071.5	Yes	No	Agent57: Outperforming the Atari Human Benchmark	2020-03-30	Code
4	Ape-X	1716.4	Yes	No	Distributed Prioritized Experience Replay	2018-03-02	Code
5	Duel noop	1611.9	Yes	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
6	Prior+Duel noop	1503.1	No	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
7	IQN	1416	No	No	Implicit Quantile Networks for Distributional Reinforcement Learning	2018-06-14	Code
8	GDI-I3	1401	No	No	Generalized Data Distribution Iteration	2022-06-07	-
9	GDI-H3	1380	No	No	Generalized Data Distribution Iteration	2022-06-07	-
10	ASL DDQN	1340.9	No	No	Train a Real-world Local Path Planner in One Hour via Partially Decoupled Reinforcement Learning and Vectorized Diversity	2023-05-07	Code
11	NoisyNet-Dueling	1318	No	No	Noisy Networks for Exploration	2017-06-30	Code
12	DNA	1286	No	No	DNA: Proximal Policy Optimization with a Dual Network Architecture	2022-06-20	Code
13	MuZero	1278.98	No	No	Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model	2019-11-19	Code
14	Reactor 500M	1259.7	No	No	The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning	2017-04-15	-
15	QR-DQN-1	1249	No	No	Distributional Reinforcement Learning with Quantile Regression	2017-10-27	Code
16	IMPALA (deep)	1223.15	No	No	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures	2018-02-05	Code
17	POP3D	1212.23	No	No	Policy Optimization With Penalized Point Probability Distance: An Alternative To Proximal Policy Optimization	2018-07-02	Code
18	Bootstrapped DQN	1208	No	No	Deep Exploration via Bootstrapped DQN	2016-02-15	Code
19	A2C + SIL	1137.8	No	No	Self-Imitation Learning	2018-06-14	Code
20	Duel hs	1129.3	No	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
21	DreamerV2	1126	No	No	Mastering Atari with Discrete World Models	2020-10-05	Code
22	DDQN+Pop-Art noop	1103.3	No	No	Learning values across many orders of magnitude	2016-02-24	-
23	Prior noop	1054.6	Yes	No	Prioritized Experience Replay	2015-11-18	Code
24	DDQN (tuned) noop	1030.6	No	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
25	Prior+Duel hs	1004.6	Yes	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
26	C51 noop	976	No	No	A Distributional Perspective on Reinforcement Learning	2017-07-21	Code
27	A3C FF hs	970.1	No	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
28	A3C FF (1 day) hs	946	No	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
29	A3C LSTM hs	932.8	No	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
30	DDQN (tuned) hs	886	No	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
31	Prior hs	876.6	No	No	Prioritized Experience Replay	2015-11-18	Code
32	Persistent AL	874.99	No	No	Increasing the Action Gap: New Operators for Reinforcement Learning	2015-12-15	Code
33	Advantage Learning	633.63	No	No	Increasing the Action Gap: New Operators for Reinforcement Learning	2015-12-15	Code
34	UCT	497.8	Yes	No	The Arcade Learning Environment: An Evaluation Platform for General Agents	2012-07-19	Code
35	DQN noop	455	No	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
36	Nature DQN	429.7	No	No	-	-	Code
37	Gorila	399.4	No	No	Massively Parallel Methods for Deep Reinforcement Learning	2015-07-15	Code
38	DQN hs	312.7	No	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
39	Rainbow+SEER	276.6	No	No	Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings	2021-03-04	Code
40	ES FF (1 hour) noop	225	No	No	Evolution Strategies as a Scalable Alternative to Reinforcement Learning	2017-03-10	Code
41	CURL	193.7	No	No	CURL: Contrastive Unsupervised Representations for Reinforcement Learning	2020-04-08	Code
42	Best Learner	190.8	No	No	The Arcade Learning Environment: An Evaluation Platform for General Agents	2012-07-19	Code
43	CGP	148	No	No	Evolving simple programs for playing Atari games	2018-06-14	Code
44	Discrete Latent Space World Model (VQ-VAE)	121.6	No	No	Smaller World Models for Reinforcement Learning	2020-10-12	-
45	SARSA	67.4	No	No	-	-	-
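
The ranking above follows directly from the metric: entries are ordered by score, highest first. As a minimal sketch, the snippet below re-derives that order for a handful of rows copied from the results. The `rows` list and the `rank_by_score` helper are illustrative names, not part of the leaderboard or any published tooling.

```python
# A few (model, score) pairs copied from the results above.
# The list is deliberately incomplete and out of order.
rows = [
    ("Agent57", 23071.5),
    ("MuZero (Res2 Adam)", 27219.8),
    ("SARSA", 67.4),
    ("R2D2", 24235.9),
    ("Ape-X", 1716.4),
]


def rank_by_score(entries):
    """Return (rank, model, score) tuples, best first (higher is better)."""
    ordered = sorted(entries, key=lambda entry: entry[1], reverse=True)
    return [(i + 1, model, score) for i, (model, score) in enumerate(ordered)]


ranked = rank_by_score(rows)
for rank, model, score in ranked:
    print(f"{rank}\t{model}\t{score}")
```

Because only five rows are included, the ranks printed here are relative to this subset, not to the full 45-entry leaderboard.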