Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Smaller World Models for Reinforcement Learning

Jan Robine, Tobias Uelwer, Stefan Harmeling

2020-10-12 | Reinforcement Learning | Atari Games | reinforcement-learning

Abstract

Sample efficiency remains a fundamental issue of reinforcement learning. Model-based algorithms try to make better use of data by simulating the environment with a model. We propose a new neural network architecture for world models based on a vector quantized-variational autoencoder (VQ-VAE) to encode observations and a convolutional LSTM to predict the next embedding indices. A model-free PPO agent is trained purely on simulated experience from the world model. We adopt the setup introduced by Kaiser et al. (2020), which only allows 100K interactions with the real environment. We apply our method to 36 Atari environments and show that we reach performance comparable to their SimPLe algorithm, while our model is significantly smaller.
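
The abstract describes encoding observations into discrete embedding indices with a VQ-VAE. As a rough illustration of that quantization step only, the sketch below maps each latent vector to the index of its nearest codebook entry. It is not the authors' implementation: the function names (`quantize`, `lookup`), the two-dimensional toy codebook, and the sample latents are all hypothetical, and a real model would learn both the encoder and the codebook.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(latents: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Return, for each latent vector, the index of the nearest codebook entry."""
    indices = []
    for z in latents:
        distances = [squared_distance(z, e) for e in codebook]
        indices.append(distances.index(min(distances)))
    return indices


def lookup(indices: Sequence[int],
           codebook: Sequence[Sequence[float]]) -> List[List[float]]:
    """Replace each index with its codebook embedding (the quantized latent)."""
    return [list(codebook[i]) for i in indices]


# Toy codebook with three embeddings (hypothetical values, not learned).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]

# Toy encoder outputs, standing in for the latents of one observation.
latents = [[0.1, -0.2], [0.9, 1.2], [-0.8, 0.7]]

indices = quantize(latents, codebook)   # discrete embedding indices
quantized = lookup(indices, codebook)   # embeddings passed on to a decoder
print(indices)
```

In the architecture the abstract outlines, index grids of this kind would be the targets that the convolutional LSTM predicts for the next time step.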

Results

Task | Dataset | Metric | Value | Model
Atari Games | Atari 2600 Freeway | Score | 29 | Discrete Latent Space World Model (VQ-VAE)
Atari Games | Atari 2600 Pong | Score | 20.2 | Discrete Latent Space World Model (VQ-VAE)
Atari Games | Atari 2600 Breakout | Score | 11.6 | Discrete Latent Space World Model (VQ-VAE)
Atari Games | Atari 2600 Crazy Climber | Score | 59609.4 | Discrete Latent Space World Model (VQ-VAE)
Atari Games | Atari 2600 Seaquest | Score | 635 | Discrete Latent Space World Model (VQ-VAE)
Atari Games | Atari 2600 Bank Heist | Score | 121.6 | Discrete Latent Space World Model (VQ-VAE)
Video Games | Atari 2600 Freeway | Score | 29 | Discrete Latent Space World Model (VQ-VAE)
Video Games | Atari 2600 Pong | Score | 20.2 | Discrete Latent Space World Model (VQ-VAE)
Video Games | Atari 2600 Breakout | Score | 11.6 | Discrete Latent Space World Model (VQ-VAE)
Video Games | Atari 2600 Crazy Climber | Score | 59609.4 | Discrete Latent Space World Model (VQ-VAE)
Video Games | Atari 2600 Seaquest | Score | 635 | Discrete Latent Space World Model (VQ-VAE)
Video Games | Atari 2600 Bank Heist | Score | 121.6 | Discrete Latent Space World Model (VQ-VAE)
