Recurrent Independent Mechanisms

Anirudh Goyal, Alex Lamb, Jordan Hoffmann, Shagun Sodhani, Sergey Levine, Yoshua Bengio, Bernhard Schölkopf

2019-09-24 · ICLR 2021 · Atari Games

Abstract

Learning modular structures which reflect the dynamics of the environment can lead to better generalization and robustness to changes which only affect a few of the underlying causes. We propose Recurrent Independent Mechanisms (RIMs), a new recurrent architecture in which multiple groups of recurrent cells operate with nearly independent transition dynamics, communicate only sparingly through the bottleneck of attention, and are only updated at time steps where they are most relevant. We show that this leads to specialization amongst the RIMs, which in turn allows for dramatically improved generalization on tasks where some factors of variation differ systematically between training and evaluation.
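The selective-update idea in the abstract can be illustrated with a small sketch. This is a toy illustration under stated assumptions, not the authors' implementation: the function names (`select_active`, `rim_step`), the fixed per-module query vectors, the all-zero "null" key, and the `tanh` transition are all hypothetical stand-ins chosen for brevity. The sketch scores each module by how much attention it places on the real input versus a null input, updates only the top `k` modules with their own transition, and leaves the rest unchanged. The attention-based communication between modules is omitted.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def select_active(queries, input_key, null_key, k):
    """Return the sorted indices of the k modules that attend most to the real input.

    Each module's query is compared against the input key and a null key;
    the softmax weight on the input serves as that module's relevance score.
    """
    scores = []
    for q in queries:
        scale = math.sqrt(len(q))
        w_input, _w_null = softmax(
            [dot(q, input_key) / scale, dot(q, null_key) / scale]
        )
        scores.append(w_input)
    ranked = sorted(range(len(queries)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def rim_step(states, queries, input_key, input_value, null_key, k):
    """One toy time step: only the k most relevant modules change state.

    The tanh update is a placeholder (assumption) for a per-module
    recurrent cell; inactive modules copy their previous state.
    """
    active = select_active(queries, input_key, null_key, k)
    new_states = [list(s) for s in states]  # copy so the input is not mutated
    for i in active:
        new_states[i] = [math.tanh(h + v) for h, v in zip(states[i], input_value)]
    return new_states, active
```

Calling `rim_step` with three module states and `k=2` returns the new states together with the indices of the two modules that were updated; the third module's state is carried over unchanged, which is the behavior the abstract describes as updating modules only "at time steps where they are most relevant".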

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Atari Games | Atari 2600 Beam Rider | Score | 5320 | RIMs-PPO |
| Atari Games | Atari 2600 Zaxxon | Score | 15000 | RIMs-PPO |
| Atari Games | Atari 2600 Up and Down | Score | 390000 | RIMs-PPO |
| Video Games | Atari 2600 Beam Rider | Score | 5320 | RIMs-PPO |
| Video Games | Atari 2600 Zaxxon | Score | 15000 | RIMs-PPO |
| Video Games | Atari 2600 Up and Down | Score | 390000 | RIMs-PPO |
