Atari Games on Atari 2600 Frostbite

Metric: Score (higher is better)

Results

#	Model	Score	Extra Data	Paper	Date	Code
1	MuZero	631378.53	No	Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model	2019-11-19	Code
2	Agent57	541280.88	No	Agent57: Outperforming the Atari Human Benchmark	2020-03-30	Code
3	MuZero (Res2 Adam)	374769.76	No	Online and Offline Reinforcement Learning by Planning with a Learned Model	2021-04-13	Code
4	R2D2	315456.4	No	-	-	Code
5	Fearlessmrx	214060	No	Fully Parameterized Quantile Function for Distributional Reinforcement Learning	2019-11-05	Code
6	DreamerV2	11384	No	Mastering Atari with Discrete World Models	2020-10-05	Code
7	GDI-H3(200M frames)	11330	No	Generalized Data Distribution Iteration	2022-06-07	-
8	GDI-H3	11330	No	Generalized Data Distribution Iteration	2022-06-07	-
9	GDI-I3	10485	No	GDI: Rethinking What Makes Reinforcement Learning Different From Supervised Learning	2021-06-11	-
10	GDI-I3	10485	No	GDI: Rethinking What Makes Reinforcement Learning Different From Supervised Learning	2021-06-11	-
11	Ape-X	9328.6	No	Distributed Prioritized Experience Replay	2018-03-02	Code
12	ASL DDQN	8616.4	No	Train a Real-world Local Path Planner in One Hour via Partially Decoupled Reinforcement Learning and Vectorized Diversity	2023-05-07	Code
13	Prior+Duel noop	7413	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
14	A2C + SIL	6289.8	No	Self-Imitation Learning	2018-06-14	Code
15	TRPO-hash	5214	No	#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning	2016-11-15	Code
16	Duel noop	4672.8	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
17	QR-DQN-1	4384	No	Distributional Reinforcement Learning with Quantile Regression	2017-10-27	Code
18	Prior noop	4380.1	No	Prioritized Experience Replay	2015-11-18	Code
19	IQN	4324	No	Implicit Quantile Networks for Distributional Reinforcement Learning	2018-06-14	Code
20	Prior+Duel hs	4038.4	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
21	C51 noop	3965	No	A Distributional Perspective on Reinforcement Learning	2017-07-21	Code
22	VPN	3811	No	Value Prediction Network	2017-07-11	Code
23	Prior hs	3510	No	Prioritized Experience Replay	2015-11-18	Code
24	DDQN+Pop-Art noop	3469.6	No	Learning values across many orders of magnitude	2016-02-24	-
25	Persistent AL	3248.96	No	Increasing the Action Gap: New Operators for Reinforcement Learning	2015-12-15	Code
26	NoisyNet-Dueling	2923	No	Noisy Networks for Exploration	2017-06-30	Code
27	Sarsa-φ-EB	2770.1	No	Count-Based Exploration in Feature Space for Reinforcement Learning	2017-06-25	Code
28	MFEC	2394	No	Model-Free Episodic Control with State Aggregation	2020-08-21	-
29	Duel hs	2332.4	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
30	Advantage Learning	2305.82	No	Increasing the Action Gap: New Operators for Reinforcement Learning	2015-12-15	Code
31	Bootstrapped DQN	2181.4	No	Deep Exploration via Bootstrapped DQN	2016-02-15	Code
32	DDQN (tuned) noop	1683.3	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
33	DDQN (tuned) hs	1448.1	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
34	Sarsa-ε	1394.3	No	Count-Based Exploration in Feature Space for Reinforcement Learning	2017-06-25	Code
35	CURL	924	No	CURL: Contrastive Unsupervised Representations for Reinforcement Learning	2020-04-08	Code
36	DQN noop	797.4	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
37	CGP	782	No	Evolving simple programs for playing Atari games	2018-06-14	Code
38	MP-EB	507	No	Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models	2015-07-03	Code
39	DQN hs	496.1	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
40	Gorila	426.6	No	Massively Parallel Methods for Deep Reinforcement Learning	2015-07-15	Code
41	ES FF (1 hour) noop	370	No	Evolution Strategies as a Scalable Alternative to Reinforcement Learning	2017-03-10	Code
42	Nature DQN	328.3	No	-	-	Code
43	DNA	320	No	DNA: Proximal Policy Optimization with a Dual Network Architecture	2022-06-20	Code
44	IMPALA (deep)	317.75	No	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures	2018-02-05	Code
45	POP3D	316.87	No	Policy Optimization With Penalized Point Probability Distance: An Alternative To Proximal Policy Optimization	2018-07-02	Code
46	IDVQ + DRSC + XNES	300	No	Playing Atari with Six Neurons	2018-06-04	Code
47	UCT	270.5	No	The Arcade Learning Environment: An Evaluation Platform for General Agents	2012-07-19	Code
48	Best Learner	216.9	No	The Arcade Learning Environment: An Evaluation Platform for General Agents	2012-07-19	Code
49	A3C LSTM hs	197.6	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
50	A3C FF hs	190.5	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
51	SARSA	180.9	No	-	-	-
52	A3C FF (1 day) hs	180.1	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
53	SAC	59.4	No	Soft Actor-Critic for Discrete Action Settings	2019-10-16	Code
