NeoRL-2
NeoRL-2 introduces new task scenarios that better reflect the properties of real-world tasks and uses traditional control methods for data collection. In summary, our contributions are as follows:
- Tasks in NeoRL-2 cover a wider range of application domains, including robotics, aircraft, industrial pipelines, controllable nuclear fusion, and healthcare, and encompass key features such as delays, external factors, and safety constraints.
- The data-collecting method in NeoRL-2 better aligns with real-world scenarios by employing deterministic methods for sampling. In some specific tasks, classical feedback controllers, such as proportional-integral-derivative (PID) controllers, are introduced.
- We conducted experiments on these tasks using state-of-the-art (SOTA) offline RL algorithms and found that, in most tasks, the policies trained by current offline RL algorithms did not significantly outperform the behavior policy.
By extending the set of near-real-world tasks in NeoRL-2, we hope that the development and deployment of RL in real-world scenarios will take these challenges into account and tackle more realistic domains.
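As an illustration of the deterministic, controller-based data collection mentioned in the contributions above, the following is a minimal sketch of a PID controller acting as the behavior policy and logging transitions into an offline dataset. It is not the NeoRL-2 implementation: the toy first-order plant, the gains, the reward, and the names `PID` and `collect_episode` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller (gains are illustrative assumptions)."""
    kp: float
    ki: float
    kd: float
    dt: float
    integral: float = 0.0
    prev_error: float = 0.0

    def act(self, error: float) -> float:
        # Accumulate the integral term and approximate the derivative
        # with a backward difference of the error.
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def collect_episode(setpoint: float = 1.0, steps: int = 200, dt: float = 0.05):
    """Roll out the PID behavior policy on a hypothetical plant dx/dt = -x + u.

    Returns a list of (state, action, reward, next_state) tuples.
    """
    pid = PID(kp=2.0, ki=2.0, kd=0.1, dt=dt)
    x = 0.0
    transitions = []
    for _ in range(steps):
        u = pid.act(setpoint - x)            # deterministic action, no exploration noise
        next_x = x + dt * (-x + u)           # explicit Euler step of the toy plant
        reward = -abs(setpoint - next_x)     # assumed reward: negative tracking error
        transitions.append((x, u, reward, next_x))
        x = next_x
    return transitions


if __name__ == "__main__":
    data = collect_episode()
    print(len(data), "transitions; final state:", data[-1][3])
```

Because the controller is deterministic and injects no exploration noise, repeated rollouts from the same initial state yield identical trajectories, so the resulting dataset covers only a narrow part of the state-action space. This is one plausible reason why learning from such data is harder than learning from data gathered by stochastic policies.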