Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Muesli: Combining Improvements in Policy Optimization

Matteo Hessel, Ivo Danihelka, Fabio Viola, Arthur Guez, Simon Schmitt, Laurent Sifre, Theophane Weber, David Silver, Hado van Hasselt

2021-04-13 | Atari Games | Continuous Control

Abstract

We propose a novel policy update that combines regularized policy optimization with model learning as an auxiliary loss. The update (henceforth Muesli) matches MuZero's state-of-the-art performance on Atari. Notably, Muesli does so without using deep search: it acts directly with a policy network and has computation speed comparable to model-free baselines. The Atari results are complemented by extensive ablations, and by additional results on continuous control and 9x9 Go.

Results

Task | Dataset | Metric | Value | Model
Atari Games | atari game | Human World Record Breakthrough | 5 | Muesli
Video Games | atari game | Human World Record Breakthrough | 5 | Muesli

Related Papers

Supervised Fine Tuning on Curated Data is Reinforcement Learning (and can be improved) | 2025-07-17
Generalized Adaptive Transfer Network: Enhancing Transfer Learning in Reinforcement Learning Across Domains | 2025-07-02
rQdia: Regularizing Q-Value Distributions With Image Augmentation | 2025-06-26
A Principled Path to Fitted Distributional Evaluation | 2025-06-24
Sparse-Reg: Improving Sample Complexity in Offline Reinforcement Learning using Sparsity | 2025-06-20
Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute | 2025-06-18
Adaptive Action Duration with Contextual Bandits for Deep Reinforcement Learning in Dynamic Environments | 2025-06-17
Meta-learning how to Share Credit among Macro-Actions | 2025-06-16