Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch
We introduce a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem. This allows us to draw upon the simplicity and scalability of the Transformer architecture, and associated advances in language modeling such as GPT-x and BERT. In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling. Unlike prior approaches to RL that fit value functions or compute policy gradients, Decision Transformer simply outputs the optimal actions by leveraging a causally masked Transformer. By conditioning an autoregressive model on the desired return (reward), past states, and actions, our Decision Transformer model can generate future actions that achieve the desired return. Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art model-free offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks.
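As a concrete illustration of the conditional sequence modeling the abstract describes, the sketch below embeds each trajectory as interleaved (return-to-go, state, action) tokens and passes them through a causally masked Transformer that predicts the next action from each state token. This is a minimal sketch, assuming continuous states and actions and a generic PyTorch encoder standing in for the GPT-style backbone; the module names and hyperparameters (`DecisionTransformer`, `embed_dim`, `max_len`, and so on) are illustrative, not the authors' reference implementation.

```python
import torch
import torch.nn as nn

class DecisionTransformer(nn.Module):
    """Sketch: predict actions from (return-to-go, state, action) token sequences."""

    def __init__(self, state_dim, act_dim, embed_dim=128, n_layers=3,
                 n_heads=1, max_len=1024):
        super().__init__()
        # One linear embedding per modality, plus a learned timestep embedding.
        self.embed_rtg = nn.Linear(1, embed_dim)
        self.embed_state = nn.Linear(state_dim, embed_dim)
        self.embed_action = nn.Linear(act_dim, embed_dim)
        self.embed_timestep = nn.Embedding(max_len, embed_dim)
        layer = nn.TransformerEncoderLayer(
            embed_dim, n_heads, dim_feedforward=4 * embed_dim, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)
        self.predict_action = nn.Linear(embed_dim, act_dim)

    def forward(self, rtg, states, actions, timesteps):
        # rtg: (B, T, 1); states: (B, T, state_dim);
        # actions: (B, T, act_dim); timesteps: (B, T) integer indices.
        B, T = states.shape[:2]
        t_emb = self.embed_timestep(timesteps)
        r = self.embed_rtg(rtg) + t_emb
        s = self.embed_state(states) + t_emb
        a = self.embed_action(actions) + t_emb
        # Interleave tokens as (R_1, s_1, a_1, R_2, s_2, a_2, ...): shape (B, 3T, D).
        tokens = torch.stack([r, s, a], dim=2).reshape(B, 3 * T, -1)
        # Causal mask: each token may attend only to itself and earlier tokens.
        mask = torch.triu(torch.full((3 * T, 3 * T), float("-inf")), diagonal=1)
        h = self.backbone(tokens, mask=mask)
        # Read out the action prediction from each state token (positions 1, 4, 7, ...).
        return self.predict_action(h[:, 1::3])
```

Training then reduces to supervised action prediction over offline trajectories. At evaluation time, the model is conditioned on a desired target return: the target is supplied as the first return-to-go token, and after each environment step the reward actually received is subtracted from it, so later tokens reflect the return still to be achieved.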
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Atari Games | Atari 2600 Pong | Score | 17.1 | DT |
| Atari Games | Atari 2600 Breakout | Score | 267.5 | DT |
| Atari Games | Atari 2600 Seaquest | Score | 2.4 | DT |
| Atari Games | Atari 2600 Q*Bert | Score | 25.1 | DT |
| Offline RL | D4RL | Average Reward | 73.5 | DT |
| MuJoCo Locomotion | D4RL | Average Reward | 72.2 | DT |