Video Games on Atari 2600 Ms. Pacman

Metric: Score (higher is better)

Results

#	Model	Score	Extra Data	Paper	Date	Code
1	MuZero	243401.1	No	Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model	2019-11-19	Code
2	MuZero (Res2 Adam)	70659.76	No	Online and Offline Reinforcement Learning by Planning with a Learned Model	2021-04-13	Code
3	Agent57	63994.44	No	Agent57: Outperforming the Atari Human Benchmark	2020-03-30	Code
4	R2D2	42281.7	No	-	-	Code
5	UCT	22336	No	The Arcade Learning Environment: An Evaluation Platform for General Agents	2012-07-19	Code
6	GDI-H3	11573	No	Generalized Data Distribution Iteration	2022-06-07	-
7	GDI-I3	11536	No	Generalized Data Distribution Iteration	2022-06-07	-
8	GDI-I3	11536	No	Generalized Data Distribution Iteration	2022-06-07	-
9	Ape-X	11255.2	No	Distributed Prioritized Experience Replay	2018-03-02	Code
10	MFEC	8530.4004	No	Model-Free Episodic Control with State Aggregation	2020-08-21	-
11	FQF	7631.9	No	Fully Parameterized Quantile Function for Distributional Reinforcement Learning	2019-11-05	Code
12	IMPALA (deep)	7342.32	No	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures	2018-02-05	Code
13	Prior noop	6518.7	No	Prioritized Experience Replay	2015-11-18	Code
14	IQN	6349	No	Implicit Quantile Networks for Distributional Reinforcement Learning	2018-06-14	Code
15	Duel noop	6283.5	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
16	DNA	5894	No	DNA: Proximal Policy Optimization with a Dual Network Architecture	2022-06-20	Code
17	QR-DQN-1	5821	No	Distributional Reinforcement Learning with Quantile Regression	2017-10-27	Code
18	DreamerV2	5652	No	Mastering Atari with Discrete World Models	2020-10-05	Code
19	NoisyNet-Dueling	5546	No	Noisy Networks for Exploration	2017-06-30	Code
20	DDQN+Pop-Art noop	4963.8	No	Learning values across many orders of magnitude	2016-02-24	-
21	ASL DDQN	4416	No	Train a Real-world Local Path Planner in One Hour via Partially Decoupled Reinforcement Learning and Vectorized Diversity	2023-05-07	Code
22	Advantage Learning	4065.8	No	Increasing the Action Gap: New Operators for Reinforcement Learning	2015-12-15	Code
23	A2C + SIL	4025.1	No	Self-Imitation Learning	2018-06-14	Code
24	Persistent AL	3917.55	No	Increasing the Action Gap: New Operators for Reinforcement Learning	2015-12-15	Code
25	C51 noop	3415	No	A Distributional Perspective on Reinforcement Learning	2017-07-21	Code
26	Prior+Duel noop	3327.3	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
27	DQN noop	3085.6	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
28	Bootstrapped DQN	2983.3	No	Deep Exploration via Bootstrapped DQN	2016-02-15	Code
29	DDQN (tuned) noop	2711.4	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
30	VPN	2689	No	Value Prediction Network	2017-07-11	Code
31	Rainbow	2570.2	No	Rainbow: Combining Improvements in Deep Reinforcement Learning	2017-10-06	Code
32	CGP	2568	No	Evolving simple programs for playing Atari games	2018-06-14	Code
33	Nature DQN	2311	No	-	-	Code
34	Duel hs	2250.6	No	Dueling Network Architectures for Deep Reinforcement Learning	2015-11-20	Code
35	Prior hs	1865.9	No	Prioritized Experience Replay	2015-11-18	Code
36	Best Learner	1691.8	No	The Arcade Learning Environment: An Evaluation Platform for General Agents	2012-07-19	Code
37	POP3D	1683.87	No	Policy Optimization With Penalized Point Probability Distance: An Alternative To Proximal Policy Optimization	2018-07-02	Code
38	CURL	1492.8	No	CURL: Contrastive Unsupervised Representations for Reinforcement Learning	2020-04-08	Code
39	Gorila	1263	No	Massively Parallel Methods for Deep Reinforcement Learning	2015-07-15	Code
40	DDQN (tuned) hs	1241.3	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
41	SARSA	1227	No	-	-	-
42	DQN hs	1092.3	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
43	Prior+Duel hs	1007.8	No	Deep Reinforcement Learning with Double Q-learning	2015-09-22	Code
44	A3C LSTM hs	850.7	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
45	SAC	690.9	No	Soft Actor-Critic for Discrete Action Settings	2019-10-16	Code
46	A3C FF hs	653.7	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code
47	A3C FF (1 day) hs	594.4	No	Asynchronous Methods for Deep Reinforcement Learning	2016-02-04	Code