Atari Games on Atari 2600 Pong

Metric: Score (higher is better)

Results

#	Model	Score	Extra Data	Paper	Date	Code
1	Duel noop	21	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
2	ES FF (1 hour) noop	21	No	Evolution Strategies as a Scalable Alternative to Reinforcement Learning	2017-03-10	Code
3	IQN	21	No	Implicit Quantile Networks for Distributional Reinforcement Learning	2018-06-14	Code
4	MuZero	21	No	Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model	2019-11-19	Code
5	R2D2	21	No	-	-	Code
6	NoisyNet-Dueling	21	No	Noisy Networks for Exploration	2017-06-30	Code
7	DQN Best	21	No	Playing Atari with Deep Reinforcement Learning	2013-12-19	Code
8	QR-DQN-1	21	No	Distributional Reinforcement Learning with Quantile Regression	2017-10-27	Code
9	UCT	21	No	The Arcade Learning Environment: An Evaluation Platform for General Agents	2012-07-19	Code
10	GDI-H3(200M frames)	21	No	Generalized Data Distribution Iteration	2022-06-07	-
11	GDI-I3(200M frames)	21	No	Generalized Data Distribution Iteration	2022-06-07	-
12	GDI-I3	21	No	Generalized Data Distribution Iteration	2022-06-07	-
13	GDI-H3	21	No	Generalized Data Distribution Iteration	2022-06-07	-
14	ASL DDQN	21	No	Train a Real-world Local Path Planner in One Hour via Partially Decoupled Reinforcement Learning and Vectorized Diversity	2023-05-07	Code
15	IMPALA (deep)	20.98	No	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures	2018-02-05	Code
16	MuZero (Res2 Adam)	20.95	No	Online and Offline Reinforcement Learning by Planning with a Learned Model	2021-04-13	Code
17	DDQN (tuned) noop	20.9	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
18	Prior+Duel noop	20.9	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
19	C51 noop	20.9	No	A Distributional Perspective on Reinforcement Learning	2017-07-21	Code
20	Bootstrapped DQN	20.9	No	Deep Exploration via Bootstrapped DQN	2016-02-15	Code
21	Ape-X	20.9	No	Distributed Prioritized Experience Replay	2018-03-02	Code
22	A2C + SIL	20.9	No	Self-Imitation Learning	2018-06-14	Code
23	Agent57	20.67	No	Agent57: Outperforming the Atari Human Benchmark	2020-03-30	Code
24	Prior noop	20.6	No	Prioritized Experience Replay	2015-11-18	Code
25	DDQN+Pop-Art noop	20.6	No	Learning values across many orders of magnitude	2016-02-24	-
26	POP3D	20.5	No	Policy Optimization With Penalized Point Probability Distance: An Alternative To Proximal Policy Optimization	2018-07-02	Code
27	Discrete Latent Space World Model (VQ-VAE)	20.2	No	Smaller World Models for Reinforcement Learning	2020-10-12	-
28	DDRL A3C	20	No	Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes	2018-01-09	Code
29	CGP	20	No	Evolving simple programs for playing Atari games	2018-06-14	Code
30	DreamerV2	20	No	Mastering Atari with Discrete World Models	2020-10-05	Code
31	Persistent AL	19.76	No	Increasing the Action Gap: New Operators for Reinforcement Learning	2015-12-15	Code
32	DNA	19.7	No	DNA: Proximal Policy Optimization with a Dual Network Architecture	2022-06-20	Code
33	Advantage Learning	19.66	No	Increasing the Action Gap: New Operators for Reinforcement Learning	2015-12-15	Code
34	DQN noop	19.5	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
35	DDQN (tuned) hs	19.1	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
36	Nature DQN	18.9	No	-	-	Code
37	Prior hs	18.9	No	Prioritized Experience Replay	2015-11-18	Code
38	Duel hs	18.8	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
39	Prior+Duel hs	18.4	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
40	Recurrent Rational DQN Average	18.13	No	Adaptive Rational Activations to Boost Deep Reinforcement Learning	2021-02-18	Code
41	Rational DQN Average	18.04	No	Adaptive Rational Activations to Boost Deep Reinforcement Learning	2021-02-18	Code
42	DQN hs	18	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
43	DT	17.1	Yes	Decision Transformer: Reinforcement Learning via Sequence Modeling	2021-06-02	Code
44	Gorila	16.7	No	Massively Parallel Methods for Deep Reinforcement Learning	2015-07-15	Code
45	A3C FF (1 day) hs	11.4	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
46	A3C LSTM hs	10.7	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
47	MAC	10.6	No	Mean Actor Critic	2017-09-01	Code
48	A3C FF hs	5.6	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
49	CURL	2.1	No	CURL: Contrastive Unsupervised Representations for Reinforcement Learning	2020-04-08	Code
50	SARSA	-17.4	No	-	-	-
51	Best Learner	-19	No	The Arcade Learning Environment: An Evaluation Platform for General Agents	2012-07-19	Code
52	SAC	-20.98	No	Soft Actor-Critic for Discrete Action Settings	2019-10-16	Code
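
Many entries above tie at the top score, so a rank alone says little. A minimal Python sketch of reading tab-separated leaderboard rows and listing every model that shares the best score follows. The `SAMPLE` text holds only four rows copied from the results above, and its column names (`Model`, `Score`) and the helper names `load_rows` and `top_scorers` are assumptions made for illustration, not part of the leaderboard itself.

```python
import csv
import io

# Four rows copied from the results above, tab-separated.
# The header names are an assumption chosen for this sketch.
SAMPLE = (
    "Model\tScore\n"
    "Duel noop\t21\n"
    "MuZero\t21\n"
    "IMPALA (deep)\t20.98\n"
    "SAC\t-20.98\n"
)


def load_rows(text):
    """Parse tab-separated text into dicts with a numeric score."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [{"model": r["Model"], "score": float(r["Score"])} for r in reader]


def top_scorers(rows):
    """Return every model tied at the highest score (higher is better)."""
    best = max(r["score"] for r in rows)
    return [r["model"] for r in rows if r["score"] == best]


rows = load_rows(SAMPLE)
print(top_scorers(rows))
```

Pasting the full table in place of `SAMPLE`, with matching column names, would list every entry tied at the maximum.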