N. Mazyavkina, S. Moustafa, I. Trofimov, E. Burnaev
Reinforcement learning (RL) has enjoyed significant progress in recent years. One of the most important steps forward has been the wide adoption of neural networks. However, the architectures of these networks are typically designed manually. In this work, we study recently proposed neural architecture search (NAS) methods for optimizing the architecture of RL agents. We carry out experiments on the Atari benchmark and conclude that modern NAS methods find architectures for RL agents that outperform a manually designed one.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Atari Games | Atari 2600 Freeway | Score | 22 | ENAS |
| Atari Games | Atari 2600 Freeway | Score | 22 | SPOS |
| Atari Games | Atari 2600 Breakout | Score | 180.6 | SPOS |
| Atari Games | Atari 2600 Breakout | Score | 161.1 | ENAS Search space 1 |
| Atari Games | Atari 2600 Breakout | Score | 144.4 | SPOS Search space 1 |
| Atari Games | Atari 2600 Breakout | Score | 91.4 | ENAS |
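To give a flavor of the single-path one-shot (SPOS) approach evaluated above, the sketch below shows its core search loop: sample one operation per layer uniformly from a search space and keep the best-scoring candidate. This is a minimal illustration, not the paper's implementation; the search space, operation names, and the stand-in `evaluate` function (which in practice would train/run the RL agent on Atari) are all assumptions for the sake of a runnable example.

```python
import random

# Illustrative search space: candidate operations per layer of the agent's network.
SEARCH_SPACE = [
    ["conv3x3", "conv5x5", "maxpool"],   # layer 1 choices
    ["conv3x3", "conv5x5", "identity"],  # layer 2 choices
    ["fc256", "fc512"],                  # head choices
]

def sample_path(space, rng):
    """SPOS-style uniform sampling: pick one operation per layer."""
    return tuple(rng.choice(ops) for ops in space)

def evaluate(path):
    """Stand-in for running the RL agent with this architecture.

    In the real setting this would return an Atari score; here it is a
    toy deterministic function so the sketch is self-contained.
    """
    return sum(len(op) for op in path)

def search(space, n_samples=50, seed=0):
    """Random single-path search: sample architectures, keep the best."""
    rng = random.Random(seed)
    best_path, best_score = None, float("-inf")
    for _ in range(n_samples):
        path = sample_path(space, rng)
        score = evaluate(path)
        if score > best_score:
            best_path, best_score = path, score
    return best_path, best_score

best, score = search(SEARCH_SPACE)
print(best, score)
```

In the full SPOS method, all candidate paths share the weights of one supernet, so evaluating a sampled path is cheap; the loop above only captures the sampling-and-ranking step.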