Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes

Igor Adamski, Robert Adamski, Tomasz Grel, Adam Jędrych, Kamil Kaczmarek, Henryk Michalewski

2018-01-09 | Reinforcement Learning | Atari Games | Playing the Game of 2048 | reinforcement-learning

Abstract

We present a study in Distributed Deep Reinforcement Learning (DDRL) focused on the scalability of a state-of-the-art Deep Reinforcement Learning algorithm known as Batch Asynchronous Advantage Actor-Critic (BA3C). We show that using the Adam optimization algorithm with a batch size of up to 2048 is a viable choice for carrying out large-scale machine learning computations. This, combined with a careful reexamination of the optimizer's hyperparameters, synchronous training at the node level (while keeping the local, single-node part of the algorithm asynchronous), and minimizing the memory footprint of the model, allowed us to achieve linear scaling for up to 64 CPU nodes. This corresponds to a training time of 21 minutes on 768 CPU cores, compared with 10 hours for a baseline single-node implementation running on 24 cores.
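
The headline figures in the abstract can be checked with a little arithmetic. The sketch below uses only the numbers quoted above (10 hours on 24 cores, 21 minutes on 768 cores across 64 nodes). The function names are illustrative, and measuring efficiency relative to core count is an assumption made here; the paper may define its scaling metric differently, for example per node.

```python
def speedup(baseline_minutes: float, scaled_minutes: float) -> float:
    """Wall-clock speedup of the scaled run over the baseline run."""
    return baseline_minutes / scaled_minutes


def scaling_efficiency(speedup_factor: float, baseline_cores: int, scaled_cores: int) -> float:
    """Speedup divided by the increase in core count (1.0 = perfectly linear)."""
    return speedup_factor / (scaled_cores / baseline_cores)


# Figures quoted in the abstract.
BASELINE_MINUTES = 10 * 60   # 10 hours on a single node
SCALED_MINUTES = 21          # 21 minutes on the cluster
BASELINE_CORES = 24
SCALED_CORES = 768
NODES = 64

cores_per_node = SCALED_CORES / NODES
factor = speedup(BASELINE_MINUTES, SCALED_MINUTES)
efficiency = scaling_efficiency(factor, BASELINE_CORES, SCALED_CORES)

print(f"cores per node: {cores_per_node:.0f}")             # 12
print(f"speedup: {factor:.1f}x")                           # 28.6x
print(f"efficiency vs. core count: {efficiency:.0%}")      # 89%
```

Under these assumptions, 32 times as many cores yield roughly a 28.6-fold reduction in training time. Note also that 768 cores over 64 nodes works out to 12 cores per node, whereas the single-node baseline uses 24 cores, so a per-core and a per-node comparison are not interchangeable.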

Results

Task         Dataset                    Metric  Value  Model
Atari Games  Atari 2600 Boxing          Score   98     DDRL A3C
Atari Games  Atari 2600 Pong            Score   20     DDRL A3C
Atari Games  Atari 2600 Breakout        Score   350    DDRL A3C
Atari Games  Atari 2600 Space Invaders  Score   650    DDRL A3C
Atari Games  Atari 2600 Beam Rider      Score   14900  DDRL A3C
Atari Games  Atari 2600 Seaquest        Score   1832   DDRL A3C
Video Games  Atari 2600 Boxing          Score   98     DDRL A3C
Video Games  Atari 2600 Pong            Score   20     DDRL A3C
Video Games  Atari 2600 Breakout        Score   350    DDRL A3C
Video Games  Atari 2600 Space Invaders  Score   650    DDRL A3C
Video Games  Atari 2600 Beam Rider      Score   14900  DDRL A3C
Video Games  Atari 2600 Seaquest        Score   1832   DDRL A3C

Related Papers

CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning (2025-07-18)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback (2025-07-17)
VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks (2025-07-17)
QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation (2025-07-17)
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)
Autonomous Resource Management in Microservice Systems via Reinforcement Learning (2025-07-17)