POPGym

Partially Observable Process Gym

Environment · MIT · Introduced 2022-09-22

POPGym is designed to benchmark memory in deep reinforcement learning. It contains a set of environments and a collection of memory model baselines. All environments are Partially Observable Markov Decision Processes (POMDPs) that follow the OpenAI Gym interface. Our environments follow a few basic tenets:

  1. Painless Setup - POPGym environments require only gym, numpy, and mazelib as dependencies.
  2. Laptop-Sized Tasks - Most tasks can be solved in less than a day on a CPU.
  3. True Generalization - All environments are heavily randomized.

The paper uses 15M environment steps for each trial.
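
To illustrate why partially observable tasks call for memory, here is a minimal sketch of a toy POMDP with a Gym-style interface. It is not taken from POPGym: `RecallCueEnv`, `MemoryPolicy`, `ReactivePolicy`, and the helper functions are hypothetical names, and the sketch assumes the `reset() -> (obs, info)` and `step() -> (obs, reward, terminated, truncated, info)` convention without importing gym itself.

```python
import random


class RecallCueEnv:
    """Hypothetical toy POMDP, not a POPGym environment.

    The cue is visible only in the observation returned by reset(); every
    later observation is 0 (blank). A reward of 1.0 is given on the final
    step if the action equals the cue. Use delay >= 2 so memory matters.
    """

    def __init__(self, num_cues=4, delay=5, seed=None):
        self.num_cues = num_cues
        self.delay = delay
        self.rng = random.Random(seed)

    def reset(self):
        # The hidden cue is re-randomized at the start of every episode.
        self.cue = self.rng.randrange(self.num_cues)
        self.t = 0
        return self.cue + 1, {}  # observations 1..num_cues encode the cue

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.delay
        reward = 1.0 if terminated and action == self.cue else 0.0
        return 0, reward, terminated, False, {}  # blank observation


class MemoryPolicy:
    """Stores the cue when it is visible and repeats it afterwards."""

    def reset(self):
        self.cue = 0

    def act(self, obs):
        if obs != 0:
            self.cue = obs - 1
        return self.cue


class ReactivePolicy:
    """Acts on the current observation only; guesses when it is blank."""

    def __init__(self, num_cues, seed=None):
        self.num_cues = num_cues
        self.rng = random.Random(seed)

    def reset(self):
        pass

    def act(self, obs):
        return obs - 1 if obs != 0 else self.rng.randrange(self.num_cues)


def run_episode(env, policy):
    obs, _ = env.reset()
    policy.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, terminated, truncated, _ = env.step(policy.act(obs))
        total += reward
        done = terminated or truncated
    return total


def average_return(env, policy, episodes=200):
    return sum(run_episode(env, policy) for _ in range(episodes)) / episodes
```

With the default delay, `average_return(RecallCueEnv(seed=0), MemoryPolicy())` is 1.0 because the stored cue is always correct, whereas `ReactivePolicy` sees only a blank observation before the final step and can do no better than guess.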