Atari Games on Atari 2600 Q*Bert

Metric: Score (higher is better)

Results

#	Model	Score	Extra Data	Paper	Date	Code
1	Agent57	580328.14	No	Agent57: Outperforming the Atari Human Benchmark	2020-03-30	Code
2	QR-DQN-1	572510	No	Distributional Reinforcement Learning with Quantile Regression	2017-10-27	Code
3	R2D2	408850	No	-	-	Code
4	IMPALA (deep)	351200.12	No	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures	2018-02-05	Code
5	Ape-X	302391.3	No	Distributed Prioritized Experience Replay	2018-03-02	Code
6	A2C + SIL	104975.6	No	Self-Imitation Learning	2018-06-14	Code
7	MuZero (Res2 Adam)	94906.25	No	Online and Offline Reinforcement Learning by Planning with a Learned Model	2021-04-13	Code
8	DreamerV2	94688	No	Mastering Atari with Discrete World Models	2020-10-05	Code
9	MuZero	72276	No	Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model	2019-11-19	Code
10	DNA	52398	No	DNA: Proximal Policy Optimization with a Dual Network Architecture	2022-06-20	Code
11	GDI-H3(200M frames)	28657	No	Generalized Data Distribution Iteration	2022-06-07	-
12	GDI-H3	28657	No	Generalized Data Distribution Iteration	2022-06-07	-
13	GDI-I3	27800	No	GDI: Rethinking What Makes Reinforcement Learning Different From Supervised Learning	2021-06-11	-
14	GDI-I3	27800	No	GDI: Rethinking What Makes Reinforcement Learning Different From Supervised Learning	2021-06-11	-
15	NoisyNet-Dueling	27121	No	Noisy Networks for Exploration	2017-06-30	Code
16	IQN	25750	No	Implicit Quantile Networks for Distributional Reinforcement Learning	2018-06-14	Code
17	ASL DDQN	24548.8	No	Train a Real-world Local Path Planner in One Hour via Partially Decoupled Reinforcement Learning and Vectorized Diversity	2023-05-07	Code
18	C51 noop	23784	No	A Distributional Perspective on Reinforcement Learning	2017-07-21	Code
19	A3C LSTM hs	21307.5	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
20	Duel noop	19220.3	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
21	Prior+Duel noop	18760.3	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
22	UCT	17343.4	No	The Arcade Learning Environment: An Evaluation Platform for General Agents	2012-07-19	Code
23	Prior noop	16256.5	No	Prioritized Experience Replay	2015-11-18	Code
24	MP-EB	15805	No	Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models	2015-07-03	Code
25	POP3D	15396.67	No	Policy Optimization With Penalized Point Probability Distance: An Alternative To Proximal Policy Optimization	2018-07-02	Code
26	A3C FF hs	15148.8	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
27	Bootstrapped DQN	15092.7	No	Deep Exploration via Bootstrapped DQN	2016-02-15	Code
28	DDQN (tuned) noop	15088.5	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
29	VPN	14517	No	Value Prediction Network	2017-07-11	Code
30	Rational DQN Average	14436	No	Adaptive Rational Activations to Boost Deep Reinforcement Learning	2021-02-18	Code
31	Advantage Learning	14368.03	No	Increasing the Action Gap: New Operators for Reinforcement Learning	2015-12-15	Code
32	Duel hs	14175.8	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
33	MFEC	14135	No	Model-Free Episodic Control with State Aggregation	2020-08-21	-
34	Recurrent Rational DQN Average	14080	No	Adaptive Rational Activations to Boost Deep Reinforcement Learning	2021-02-18	Code
35	Prior+Duel hs	14063	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
36	A3C FF (1 day) hs	13752.3	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
37	DQN noop	13117.3	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
38	DDQN (tuned) hs	11020.8	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
39	Nature DQN	10596	No	-	-	Code
40	Prior hs	9944	No	Prioritized Experience Replay	2015-11-18	Code
41	DQN hs	9271.5	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
42	Gorila	7089.8	No	Massively Parallel Methods for Deep Reinforcement Learning	2015-07-15	Code
43	DDQN+Pop-Art noop	5236.8	No	Learning values across many orders of magnitude	2016-02-24	-
44	DQN Best	4500	No	Playing Atari with Deep Reinforcement Learning	2013-12-19	Code
45	Qbert Rainbow+SEER	4123.5	No	Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings	2021-03-04	Code
46	Sarsa-φ-EB	4111.8	No	Count-Based Exploration in Feature Space for Reinforcement Learning	2017-06-25	Code
47	Sarsa-ε	3895.3	No	Count-Based Exploration in Feature Space for Reinforcement Learning	2017-06-25	Code
48	IDVQ + DRSC + XNES	1250	No	Playing Atari with Six Neurons	2018-06-04	Code
49	CURL	1225.6	No	CURL: Contrastive Unsupervised Representations for Reinforcement Learning	2020-04-08	Code
50	SARSA	960.3	No	-	-	-
51	CGP	770	No	Evolving simple programs for playing Atari games	2018-06-14	Code
52	Best Learner	613.5	No	The Arcade Learning Environment: An Evaluation Platform for General Agents	2012-07-19	Code
53	SAC	280.5	No	Soft Actor-Critic for Discrete Action Settings	2019-10-16	Code
54	MAC	243.4	No	Mean Actor Critic	2017-09-01	Code
55	ES FF (1 hour) noop	147.5	No	Evolution Strategies as a Scalable Alternative to Reinforcement Learning	2017-03-10	Code
56	DT	25.1	No	Decision Transformer: Reinforcement Learning via Sequence Modeling	2021-06-02	Code

Entries flagged SOTA: Agent57, QR-DQN-1, NoisyNet-Dueling, A3C LSTM hs, Duel noop, UCT.