Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Atari Games on Atari 2600 Battle Zone

Metric: Score (higher is better)

Results

| # | Model | Score | Extra Data | Paper | Date | Code |
|---|-------|-------|------------|-------|------|------|
| 1 | Agent57 | 934134.88 | No | Agent57: Outperforming the Atari Human Benchmark | 2020-03-30 | Code |
| 2 | MuZero | 848623 | No | Mastering Atari, Go, Chess and Shogi by Planning... | 2019-11-19 | Code |
| 3 | GDI-H3 | 824360 | No | Generalized Data Distribution Iteration | 2022-06-07 | - |
| 4 | R2D2 | 751880 | No | - | - | Code |
| 5 | GDI-I3 | 478830 | No | Generalized Data Distribution Iteration | 2022-06-07 | - |
| 6 | MuZero (Res2 Adam) | 178716.9 | No | Online and Offline Reinforcement Learning by Pla... | 2021-04-13 | Code |
| 7 | Ape-X | 98895 | No | Distributed Prioritized Experience Replay | 2018-03-02 | Code |
| 8 | FQF | 87928.6 | No | Fully Parameterized Quantile Function for Distri... | 2019-11-05 | Code |
| 9 | DNA | 71003 | No | DNA: Proximal Policy Optimization with a Dual Ne... | 2022-06-20 | Code |
| 10 | UCT | 70333.3 | No | The Arcade Learning Environment: An Evaluation P... | 2012-07-19 | Code |
| 11 | Reactor 500M | 64070 | No | The Reactor: A fast and sample-efficient Actor-C... | 2017-04-15 | - |
| 12 | NoisyNet-Dueling | 52262 | No | Noisy Networks for Exploration | 2017-06-30 | Code |
| 13 | IQN | 42244 | No | Implicit Quantile Networks for Distributional Re... | 2018-06-14 | Code |
| 14 | DreamerV2 | 40325 | No | Mastering Atari with Discrete World Models | 2020-10-05 | Code |
| 15 | QR-DQN-1 | 39268 | No | Distributional Reinforcement Learning with Quant... | 2017-10-27 | Code |
| 16 | ASL DDQN | 38986 | No | Train a Real-world Local Path Planner in One Hou... | 2023-05-07 | Code |
| 17 | Bootstrapped DQN | 38666.7 | No | Deep Exploration via Bootstrapped DQN | 2016-02-15 | Code |
| 18 | Duel noop | 37150 | No | Dueling Network Architectures for Deep Reinforce... | 2015-11-20 | Code |
| 19 | Prior+Duel noop | 35520 | No | Dueling Network Architectures for Deep Reinforce... | 2015-11-20 | Code |
| 20 | Persistent AL | 34583.07 | No | Increasing the Action Gap: New Operators for Rei... | 2015-12-15 | Code |
| 21 | CGP | 34200 | No | Evolving simple programs for playing Atari games | 2018-06-14 | Code |
| 22 | DDQN (tuned) noop | 31700 | No | Dueling Network Architectures for Deep Reinforce... | 2015-11-20 | Code |
| 23 | Prior noop | 31530 | No | Prioritized Experience Replay | 2015-11-18 | Code |
| 24 | Duel hs | 31320 | No | Dueling Network Architectures for Deep Reinforce... | 2015-11-20 | Code |
| 25 | Prior+Duel hs | 30650 | No | Deep Reinforcement Learning with Double Q-learning | 2015-09-22 | Code |
| 26 | DQN noop | 29900 | No | Deep Reinforcement Learning with Double Q-learning | 2015-09-22 | Code |
| 27 | Advantage Learning | 28789.29 | No | Increasing the Action Gap: New Operators for Rei... | 2015-12-15 | Code |
| 28 | C51 noop | 28742 | No | A Distributional Perspective on Reinforcement Le... | 2017-07-21 | Code |
| 29 | Nature DQN | 26300 | No | - | - | Code |
| 30 | Recurrent Rational DQN Average | 25749 | No | Adaptive Rational Activations to Boost Deep Rein... | 2021-02-18 | Code |
| 31 | Prior hs | 25520 | No | Prioritized Experience Replay | 2015-11-18 | Code |
| 32 | A2C + SIL | 25075 | No | Self-Imitation Learning | 2018-06-14 | Code |
| 33 | DDQN (tuned) hs | 24740 | No | Deep Reinforcement Learning with Double Q-learning | 2015-09-22 | Code |
| 34 | DQN hs | 23750 | No | Deep Reinforcement Learning with Double Q-learning | 2015-09-22 | Code |
| 35 | Rational DQN Average | 23403 | No | Adaptive Rational Activations to Boost Deep Rein... | 2021-02-18 | Code |
| 36 | IMPALA (deep) | 20885 | No | IMPALA: Scalable Distributed Deep-RL with Import... | 2018-02-05 | Code |
| 37 | A3C LSTM hs | 20760 | No | Asynchronous Methods for Deep Reinforcement Lear... | 2016-02-04 | Code |
| 38 | Gorila | 19938 | No | Massively Parallel Methods for Deep Reinforcemen... | 2015-07-15 | Code |
| 39 | ES FF (1 hour) noop | 16600 | No | Evolution Strategies as a Scalable Alternative t... | 2017-03-10 | Code |
| 40 | Best Learner | 15819.7 | No | The Arcade Learning Environment: An Evaluation P... | 2012-07-19 | Code |
| 41 | POP3D | 15466.67 | No | Policy Optimization With Penalized Point Probabi... | 2018-07-02 | Code |
| 42 | A3C FF hs | 12950 | No | Asynchronous Methods for Deep Reinforcement Lear... | 2016-02-04 | Code |
| 43 | A3C FF (1 day) hs | 11340 | No | Asynchronous Methods for Deep Reinforcement Lear... | 2016-02-04 | Code |
| 44 | CURL | 11208 | No | CURL: Contrastive Unsupervised Representations f... | 2020-04-08 | Code |
| 45 | DDQN+Pop-Art noop | 8220 | No | Learning values across many orders of magnitude | 2016-02-24 | - |
| 46 | SAC | 4386.7 | No | Soft Actor-Critic for Discrete Action Settings | 2019-10-16 | Code |
| 47 | SARSA | 16.2 | No | - | - | - |
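Since the metric is Score with higher being better, the ranking above is simply a descending sort on the score column. A minimal sketch of reproducing that ordering, using a few rows copied from the table (the variable names are illustrative, not part of any site API):

```python
# A handful of (model, score) pairs taken from the leaderboard above.
results = [
    ("MuZero", 848623.0),
    ("Agent57", 934134.88),
    ("Ape-X", 98895.0),
    ("SARSA", 16.2),
]

# "Score (higher is better)": rank by descending score.
leaderboard = sorted(results, key=lambda row: row[1], reverse=True)

for rank, (model, score) in enumerate(leaderboard, start=1):
    print(f"{rank}. {model}: {score}")
```

Running this prints Agent57 first and SARSA last, matching ranks 1 and 47 in the table.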