Video Games on Atari 2600 Battle Zone

Metric: Score (higher is better)

Results

#	Model	Score	Extra Data	Paper	Date	Code
1	Agent57	934134.88	No	Agent57: Outperforming the Atari Human Benchmark	2020-03-30	Code
2	MuZero	848623	No	Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model	2019-11-19	Code
3	GDI-H3	824360	No	Generalized Data Distribution Iteration	2022-06-07	-
4	R2D2	751880	No	-	-	Code
5	GDI-I3	478830	No	Generalized Data Distribution Iteration	2022-06-07	-
6	MuZero (Res2 Adam)	178716.9	No	Online and Offline Reinforcement Learning by Planning with a Learned Model	2021-04-13	Code
7	Ape-X	98895	No	Distributed Prioritized Experience Replay	2018-03-02	Code
8	FQF	87928.6	No	Fully Parameterized Quantile Function for Distributional Reinforcement Learning	2019-11-05	Code
9	DNA	71003	No	DNA: Proximal Policy Optimization with a Dual Network Architecture	2022-06-20	Code
10	UCT	70333.3	No	The Arcade Learning Environment: An Evaluation Platform for General Agents	2012-07-19	Code
11	Reactor 500M	64070	No	The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning	2017-04-15	-
12	NoisyNet-Dueling	52262	No	Noisy Networks for Exploration	2017-06-30	Code
13	IQN	42244	No	Implicit Quantile Networks for Distributional Reinforcement Learning	2018-06-14	Code
14	DreamerV2	40325	No	Mastering Atari with Discrete World Models	2020-10-05	Code
15	QR-DQN-1	39268	No	Distributional Reinforcement Learning with Quantile Regression	2017-10-27	Code
16	ASL DDQN	38986	No	Train a Real-world Local Path Planner in One Hour via Partially Decoupled Reinforcement Learning and Vectorized Diversity	2023-05-07	Code
17	Bootstrapped DQN	38666.7	No	Deep Exploration via Bootstrapped DQN	2016-02-15	Code
18	Duel noop	37150	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
19	Prior+Duel noop	35520	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
20	Persistent AL	34583.07	No	Increasing the Action Gap: New Operators for Reinforcement Learning	2015-12-15	Code
21	CGP	34200	No	Evolving simple programs for playing Atari games	2018-06-14	Code
22	DDQN (tuned) noop	31700	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
23	Prior noop	31530	No	Prioritized Experience Replay	2015-11-18	Code
24	Duel hs	31320	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
25	Prior+Duel hs	30650	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
26	DQN noop	29900	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
27	Advantage Learning	28789.29	No	Increasing the Action Gap: New Operators for Reinforcement Learning	2015-12-15	Code
28	C51 noop	28742	No	A Distributional Perspective on Reinforcement Learning	2017-07-21	Code
29	Nature DQN	26300	No	-	-	Code
30	Recurrent Rational DQN Average	25749	No	Adaptive Rational Activations to Boost Deep Reinforcement Learning	2021-02-18	Code
31	Prior hs	25520	No	Prioritized Experience Replay	2015-11-18	Code
32	A2C + SIL	25075	No	Self-Imitation Learning	2018-06-14	Code
33	DDQN (tuned) hs	24740	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
34	DQN hs	23750	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
35	Rational DQN Average	23403	No	Adaptive Rational Activations to Boost Deep Reinforcement Learning	2021-02-18	Code
36	IMPALA (deep)	20885	No	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures	2018-02-05	Code
37	A3C LSTM hs	20760	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
38	Gorila	19938	No	Massively Parallel Methods for Deep Reinforcement Learning	2015-07-15	Code
39	ES FF (1 hour) noop	16600	No	Evolution Strategies as a Scalable Alternative to Reinforcement Learning	2017-03-10	Code
40	Best Learner	15819.7	No	The Arcade Learning Environment: An Evaluation Platform for General Agents	2012-07-19	Code
41	POP3D	15466.67	No	Policy Optimization With Penalized Point Probability Distance: An Alternative To Proximal Policy Optimization	2018-07-02	Code
42	A3C FF hs	12950	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
43	A3C FF (1 day) hs	11340	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
44	CURL	11208	No	CURL: Contrastive Unsupervised Representations for Reinforcement Learning	2020-04-08	Code
45	DDQN+Pop-Art noop	8220	No	Learning values across many orders of magnitude	2016-02-24	-
46	SAC	4386.7	No	Soft Actor-Critic for Discrete Action Settings	2019-10-16	Code
47	SARSA	16.2	No	-	-	-

Entries flagged SOTA on the leaderboard: Agent57, MuZero, Ape-X, UCT.