URLB

Unsupervised Reinforcement Learning Benchmark

Introduced 2021-10-28

URLB evaluates agents in two phases: reward-free pre-training, followed by adaptation to a downstream task with extrinsic rewards. Built on the DeepMind Control Suite, it provides twelve continuous control tasks across three domains (Walker, Quadruped, and Jaco Arm) for evaluation.
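
The two-phase protocol can be illustrated with a minimal, self-contained sketch. This is not URLB's code or API and it does not use the DeepMind Control Suite: the chain environment, the tabular agent, the count-based novelty bonus, and all names (`ToyChainEnv`, `TabularQAgent`, `run_phase`) are hypothetical stand-ins chosen only to show how the reward source switches between phases.

```python
import random


class ToyChainEnv:
    """Hypothetical 1-D chain standing in for a control task (not a URLB task)."""

    def __init__(self, length=10):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the state is clipped to the chain.
        self.state = min(max(self.state + action, 0), self.length - 1)
        # Extrinsic reward: 1 only at the right end of the chain.
        extrinsic = 1.0 if self.state == self.length - 1 else 0.0
        return self.state, extrinsic


class TabularQAgent:
    """Epsilon-greedy Q-learning with a count-based novelty bonus (an assumption)."""

    def __init__(self, n_states, actions=(-1, 1), lr=0.5, gamma=0.9, eps=0.2, seed=0):
        self.actions = actions
        self.lr, self.gamma, self.eps = lr, gamma, eps
        self.rng = random.Random(seed)
        self.q = {(s, a): 0.0 for s in range(n_states) for a in actions}
        self.visits = [0] * n_states

    def act(self, state):
        if self.rng.random() < self.eps:
            return self.rng.choice(self.actions)
        best = max(self.q[(state, a)] for a in self.actions)
        # Break ties at random among the greedy actions.
        return self.rng.choice([a for a in self.actions if self.q[(state, a)] == best])

    def intrinsic_reward(self, state):
        # Novelty bonus that shrinks as a state is visited more often.
        self.visits[state] += 1
        return 1.0 / self.visits[state] ** 0.5

    def update(self, s, a, r, s2):
        target = r + self.gamma * max(self.q[(s2, b)] for b in self.actions)
        self.q[(s, a)] += self.lr * (target - self.q[(s, a)])


def run_phase(env, agent, steps, use_extrinsic, episode_len=20):
    """Run one phase and return the extrinsic reward collected (logged, not always learned from)."""
    total_extrinsic = 0.0
    s = env.reset()
    for t in range(steps):
        if t > 0 and t % episode_len == 0:
            s = env.reset()
        a = agent.act(s)
        s2, r_ext = env.step(a)
        # Phase 1 learns from the intrinsic signal only; phase 2 from the task reward.
        r = r_ext if use_extrinsic else agent.intrinsic_reward(s2)
        agent.update(s, a, r, s2)
        total_extrinsic += r_ext
        s = s2
    return total_extrinsic


env = ToyChainEnv(length=10)
agent = TabularQAgent(n_states=10)
run_phase(env, agent, steps=2000, use_extrinsic=False)  # reward-free pre-training
finetune_return = run_phase(env, agent, steps=500, use_extrinsic=True)  # adaptation
print("extrinsic return during adaptation:", finetune_return)
```

The same agent object is carried from the first call to the second, which mirrors the idea that whatever is learned without rewards is then reused when the task reward becomes available. In the benchmark itself the agents, intrinsic objectives, and step budgets differ from this toy setup.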