Atari Games on Atari 2600 Boxing

Metric: Score (higher is better)

Results

#	Model	Score	Extra Data	Paper	Date	Code
1	MuZero	100	No	Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model	2019-11-19	Code
2	Ape-X	100	No	Distributed Prioritized Experience Replay	2018-03-02	Code
3	NoisyNet-Dueling	100	No	Noisy Networks for Exploration	2017-06-30	Code
4	UCT	100	No	The Arcade Learning Environment: An Evaluation Platform for General Agents	2012-07-19	Code
5	Agent57	100	No	Agent57: Outperforming the Atari Human Benchmark	2020-03-30	Code
6	MuZero (Res2 Adam)	100	No	Online and Offline Reinforcement Learning by Planning with a Learned Model	2021-04-13	Code
7	GDI-H3	100	No	Generalized Data Distribution Iteration	2022-06-07	-
8	GDI-I3	100	No	Generalized Data Distribution Iteration	2022-06-07	-
9	GDI-H3	100	No	Generalized Data Distribution Iteration	2022-06-07	-
10	IMPALA (deep)	99.96	No	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures	2018-02-05	Code
11	QR-DQN-1	99.9	No	Distributional Reinforcement Learning with Quantile Regression	2017-10-27	Code
12	DNA	99.9	No	DNA: Proximal Policy Optimization with a Dual Network Architecture	2022-06-20	Code
13	IQN	99.8	No	Implicit Quantile Networks for Distributional Reinforcement Learning	2018-06-14	Code
14	A2C + SIL	99.6	No	Self-Imitation Learning	2018-06-14	Code
15	ASL DDQN	99.6	No	Train a Real-world Local Path Planner in One Hour via Partially Decoupled Reinforcement Learning and Vectorized Diversity	2023-05-07	Code
16	Duel noop	99.4	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
17	Reactor 500M	99.4	No	The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning	2017-04-15	-
18	DDQN+Pop-Art noop	99.3	No	Learning values across many orders of magnitude	2016-02-24	-
19	Prior+Duel noop	98.9	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
20	R2D2	98.5	No	-	-	Code
21	DDRL A3C	98	No	Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes	2018-01-09	Code
22	C51 noop	97.8	No	A Distributional Perspective on Reinforcement Learning	2017-07-21	Code
23	POP3D	97.23	No	Policy Optimization With Penalized Point Probability Distance: An Alternative To Proximal Policy Optimization	2018-07-02	Code
24	Prior noop	95.6	No	Prioritized Experience Replay	2015-11-18	Code
25	Persistent AL	94.3	No	Increasing the Action Gap: New Operators for Reinforcement Learning	2015-12-15	Code
26	Advantage Learning	93.94	No	Increasing the Action Gap: New Operators for Reinforcement Learning	2015-12-15	Code
27	Bootstrapped DQN	93.2	No	Deep Exploration via Bootstrapped DQN	2016-02-15	Code
28	DreamerV2	92	No	Mastering Atari with Discrete World Models	2020-10-05	Code
29	DDQN (tuned) noop	91.6	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
30	DQN noop	88	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
31	Prior+Duel hs	79.2	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
32	Duel hs	77.3	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
33	Gorila	74.2	No	Massively Parallel Methods for Deep Reinforcement Learning	2015-07-15	Code
34	DDQN (tuned) hs	73.5	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
35	Prior hs	72.3	No	Prioritized Experience Replay	2015-11-18	Code
36	Nature DQN	71.8	No	-	-	Code
37	DQN hs	70.3	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
38	A3C FF hs	59.8	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
39	ES FF (1 hour) noop	49.8	No	Evolution Strategies as a Scalable Alternative to Reinforcement Learning	2017-03-10	Code
40	Best Learner	44	No	The Arcade Learning Environment: An Evaluation Platform for General Agents	2012-07-19	Code
41	CGP	38.4	No	Evolving simple programs for playing Atari games	2018-06-14	Code
42	A3C LSTM hs	37.3	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
43	A3C FF (1 day) hs	33.7	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
44	SARSA	9.8	No	-	-	-
45	CURL	4.8	No	CURL: Contrastive Unsupervised Representations for Reinforcement Learning	2020-04-08	Code