AndroidWorld
AndroidWorld is an environment for building and benchmarking autonomous computer control agents.
It runs on a live Android emulator and contains a highly reproducible benchmark of 116 hand-crafted tasks across 20 apps. Each task is dynamically instantiated with randomly generated parameters, which yields millions of unique task variations.
In addition to the built-in tasks, AndroidWorld also supports the popular web benchmark MiniWoB++, from Liu et al.
Key features of AndroidWorld include:
📝 116 diverse tasks across 20 real-world apps
🎲 Dynamic task instantiation for millions of unique variations
🏆 Durable reward signals for reliable evaluation
🌐 Open environment with access to millions of Android apps and websites
💾 Lightweight footprint (2 GB memory, 8 GB disk)
🔧 Extensible design to easily add new tasks and benchmarks
🖥️ Integration with MiniWoB++ web-based tasks
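To make "dynamic task instantiation" and "durable reward signals" concrete, here is a minimal sketch in plain Python. It is not AndroidWorld's API: the names `TaskInstance`, `instantiate_task`, `is_successful`, the goal template, and the dictionary standing in for device state are all hypothetical, chosen only to illustrate the idea. A seed selects the random parameters, so the same seed reproduces the same task, and success is judged from the resulting state rather than from the agent's action trace.

```python
import random
from dataclasses import dataclass

# Hypothetical goal template and parameter pool (illustration only,
# not taken from AndroidWorld).
TEMPLATE = "Create a contact named {name} with phone number {phone}."
NAMES = ["Alice", "Bob", "Chen", "Dara"]


@dataclass
class TaskInstance:
    goal: str     # natural-language instruction given to the agent
    params: dict  # the sampled parameters behind this instance


def instantiate_task(seed: int) -> TaskInstance:
    """Sample task parameters from a seeded RNG so the instance is reproducible."""
    rng = random.Random(seed)
    params = {
        "name": rng.choice(NAMES),
        "phone": "".join(str(rng.randint(0, 9)) for _ in range(10)),
    }
    return TaskInstance(goal=TEMPLATE.format(**params), params=params)


def is_successful(task: TaskInstance, contacts: dict) -> bool:
    """Check the final state (here a name -> phone mapping), not the actions taken."""
    return contacts.get(task.params["name"]) == task.params["phone"]


task = instantiate_task(seed=7)
print(task.goal)
```

Because the check reads only the final state, any sequence of actions that produces the right contact counts as success, and rerunning with the same seed evaluates agents on an identical task.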