Atari Games on Atari 2600 Seaquest

Metric: Score (higher is better)

Results

#	Model	Score	Extra Data	Paper	Date	Code	SOTA
1	GDI-H3(200M frames)	1000000	No	Generalized Data Distribution Iteration	2022-06-07	-	Yes
2	GDI-H3	1000000	No	Generalized Data Distribution Iteration	2022-06-07	-	-
3	Agent57	999997.63	No	Agent57: Outperforming the Atari Human Benchmark	2020-03-30	Code	Yes
4	R2D2	999996.7	No	-	-	Code	-
5	MuZero	999976.52	No	Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model	2019-11-19	Code	Yes
6	MuZero (Res2 Adam)	999659.18	No	Online and Offline Reinforcement Learning by Planning with a Learned Model	2021-04-13	Code	-
7	GDI-I3	943910	No	GDI: Rethinking What Makes Reinforcement Learning Different From Supervised Learning	2021-06-11	-	-
8	GDI-I3	943910	No	GDI: Rethinking What Makes Reinforcement Learning Different From Supervised Learning	2021-06-11	-	-
9	Ape-X	392952.3	No	Distributed Prioritized Experience Replay	2018-03-02	Code	Yes
10	C51 noop	266434	No	A Distributional Perspective on Reinforcement Learning	2017-07-21	Code	Yes
11	Duel noop	50254.2	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code	Yes
12	Duel hs	37361.6	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code	-
13	IQN	30140	No	Implicit Quantile Networks for Distributional Reinforcement Learning	2018-06-14	Code	-
14	ASL DDQN	29278.6	No	Train a Real-world Local Path Planner in One Hour via Partially Decoupled Reinforcement Learning and Vectorized Diversity	2023-05-07	Code	-
15	Prior noop	26357.8	No	Prioritized Experience Replay	2015-11-18	Code	Yes
16	Prior hs	25463.7	No	Prioritized Experience Replay	2015-11-18	Code	-
17	NoisyNet-Dueling	16754	No	Noisy Networks for Exploration	2017-06-30	Code	-
18	DDQN (tuned) noop	16452.7	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code	-
19	DDQN (tuned) hs	14498	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code	Yes
20	Persistent AL	13230.74	No	Increasing the Action Gap: New Operators for Reinforcement Learning	2015-12-15	Code	-
21	DDQN+Pop-Art noop	10932.3	No	Learning values across many orders of magnitude	2016-02-24	-	-
22	Gorila	10145.9	No	Massively Parallel Methods for Deep Reinforcement Learning	2015-07-15	Code	Yes
23	Bootstrapped DQN	9083.1	No	Deep Exploration via Bootstrapped DQN	2016-02-15	Code	-
24	Advantage Learning	8670.5	No	Increasing the Action Gap: New Operators for Reinforcement Learning	2015-12-15	Code	-
25	QR-DQN-1	8268	No	Distributional Reinforcement Learning with Quantile Regression	2017-10-27	Code	-
26	DreamerV2	7480	No	Mastering Atari with Discrete World Models	2020-10-05	Code	-
27	Recurrent Rational DQN Average	7460	No	Adaptive Rational Activations to Boost Deep Reinforcement Learning	2021-02-18	Code	-
28	DARQN soft	7263	No	Deep Attention Recurrent Q-Network	2015-12-05	Code	-
29	Rational DQN Average	6603	No	Adaptive Rational Activations to Boost Deep Reinforcement Learning	2021-02-18	Code	-
30	DQN noop	5860.6	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code	-
31	VPN	5628	No	Value Prediction Network	2017-07-11	Code	-
32	Nature DQN	5286	No	-	-	Code	-
33	UCT	5132.4	No	The Arcade Learning Environment: An Evaluation Platform for General Agents	2012-07-19	Code	Yes
34	DQN hs	4216.7	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code	-
35	DNA	4146	No	DNA: Proximal Policy Optimization with a Dual Network Architecture	2022-06-20	Code	-
36	A2C + SIL	2456.5	No	Self-Imitation Learning	2018-06-14	Code	-
37	A3C FF hs	2355.4	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code	-
38	A3C FF (1 day) hs	2300.2	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code	-
39	DDRL A3C	1832	No	Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes	2018-01-09	Code	-
40	POP3D	1807.47	No	Policy Optimization With Penalized Point Probability Distance: An Alternative To Proximal Policy Optimization	2018-07-02	Code	-
41	IMPALA (deep)	1753.2	No	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures	2018-02-05	Code	-
42	DQN Best	1740	No	Playing Atari with Deep Reinforcement Learning	2013-12-19	Code	-
43	MAC	1703.4	No	Mean Actor Critic	2017-09-01	Code	-
44	Prior+Duel hs	1431.2	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code	-
45	ES FF (1 hour) noop	1390	No	Evolution Strategies as a Scalable Alternative to Reinforcement Learning	2017-03-10	Code	-
46	A3C LSTM hs	1326.1	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code	-
47	Prior+Duel noop	931.6	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code	-
48	CGP	724	No	Evolving simple programs for playing Atari games	2018-06-14	Code	-
49	SARSA	675.5	No	-	-	-	-
50	Best Learner	664.8	No	The Arcade Learning Environment: An Evaluation Platform for General Agents	2012-07-19	Code	-
51	Discrete Latent Space World Model (VQ-VAE)	635	No	Smaller World Models for Reinforcement Learning	2020-10-12	-	-
52	Rainbow+SEER	561.2	No	Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings	2021-03-04	Code	-
53	CURL	408	No	CURL: Contrastive Unsupervised Representations for Reinforcement Learning	2020-04-08	Code	-
54	IDVQ + DRSC + XNES	320	No	Playing Atari with Six Neurons	2018-06-04	Code	-
55	SAC	211.6	No	Soft Actor-Critic for Discrete Action Settings	2019-10-16	Code	-
56	DT	2.4	No	Decision Transformer: Reinforcement Learning via Sequence Modeling	2021-06-02	Code	-