TLA with Hierarchical Reward Functions
Reported on 3 benchmarks across 1 task · 1 paper · 2 SOTA
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Playing Games (3 results)
- Action Repetition · 2024-11-23 · SOTA · 0.8073
- Mean Reward · 2024-11-23 · SOTA · -125.02
- Average Decisions · 2024-11-23 · 38.6 · best: 62.31 (TLA)