Description
About DeepCubeAI
DeepCubeAI is an algorithm that learns a discrete world model and uses deep reinforcement learning to learn a heuristic function that generalizes over start and goal states. We then integrate the learned model and the learned heuristic function with heuristic search, such as Q* search, to solve sequential decision-making problems. [Paper] [Code] [PyPI] [Slides] [Poster]
Key Contributions
DeepCubeAI consists of three key components:
- Discrete World Model
  - Learns a world model that represents states in a discrete latent space.
  - This approach tackles two challenges: model degradation and state re-identification.
  - Prediction errors less than 0.5 are corrected by rounding.
  - Re-identifies states by comparing two binary vectors.
- Generalizable Heuristic Function
  - Utilizes Deep Q-Network (DQN) and hindsight experience replay (HER) to learn a heuristic function that generalizes over start and goal states.
- Optimized Search
  - Integrates the learned model and the learned heuristic function with heuristic search to solve problems. It uses Q* search, a variant of A* search optimized for DQNs, which enables faster and more memory-efficient planning.
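The rounding and re-identification ideas listed under the discrete world model can be illustrated with a minimal NumPy sketch. It is not DeepCubeAI's implementation: the function names `round_latent` and `same_state`, the 6-bit latent vector, and the noise values are all assumptions made up for illustration, and the toy "model" here leaves the state unchanged and only adds prediction noise.

```python
import numpy as np

def round_latent(predicted):
    """Snap a real-valued latent prediction to the nearest binary vector."""
    return (np.asarray(predicted) >= 0.5).astype(np.uint8)

def same_state(a, b):
    """Re-identify a state by exact comparison of two binary latent vectors."""
    return bool(np.array_equal(a, b))

# Hypothetical 6-bit latent state and a noisy prediction of it.
# Every per-bit error is smaller than 0.5 in magnitude.
true_latent = np.array([1, 0, 0, 1, 1, 0], dtype=np.uint8)
noisy_prediction = true_latent + np.array([-0.31, 0.12, 0.44, -0.08, 0.27, -0.40])

recovered = round_latent(noisy_prediction)
print(same_state(recovered, true_latent))  # True

# Multi-step rollout: because each step's error stays below 0.5,
# rounding removes it before the next step, so it cannot accumulate.
rng = np.random.default_rng(0)
state = true_latent.copy()
for _ in range(1000):
    state = round_latent(state + rng.uniform(-0.4, 0.4, size=state.shape))
print(same_state(state, true_latent))  # True
```

Because the rounded latents are exact binary vectors, a search procedure can detect a previously seen state with a plain equality check (or by hashing the vector) instead of a distance threshold.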
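The Q* search mentioned above can be sketched on a toy grid problem. This is a simplified illustration under stated assumptions, not the project's code: `step` is a hand-written stand-in for the learned world model, `q_values` is a hand-written stand-in for a trained DQN, every action costs 1, and all names are hypothetical. The point it tries to show is that the frontier holds (state, action) pairs ranked by g(state) + Q(state, action), so a single Q-function call scores every child and the model is applied only when a pair is popped.

```python
import heapq
from itertools import count

SIZE = 5                                      # toy 5x5 grid
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # up, down, right, left

def step(state, action):
    """Toy transition function (stand-in for the learned world model)."""
    x = min(max(state[0] + action[0], 0), SIZE - 1)
    y = min(max(state[1] + action[1], 0), SIZE - 1)
    return (x, y)

def q_values(state, goal):
    """Stand-in for a trained DQN: one call returns, for every action, an
    estimate of the total cost to reach `goal` if that action is taken first."""
    estimates = []
    for action in ACTIONS:
        nx, ny = step(state, action)
        estimates.append(1 + abs(nx - goal[0]) + abs(ny - goal[1]))
    return estimates

def q_star_search(start, goal):
    """Best-first search over (state, action) pairs with unit action costs."""
    if start == goal:
        return []
    tie = count()        # tie-breaker so the heap never compares states
    frontier = []
    best_g = {start: 0}  # cheapest known path cost to each generated state

    def push_actions(state, g, path):
        # One Q-function call scores all children without generating them.
        for a_idx, q in enumerate(q_values(state, goal)):
            heapq.heappush(frontier, (g + q, next(tie), state, a_idx, g, path))

    push_actions(start, 0, [])
    while frontier:
        _, _, state, a_idx, g, path = heapq.heappop(frontier)
        child = step(state, ACTIONS[a_idx])  # child is generated only now
        child_g = g + 1
        child_path = path + [a_idx]
        if child == goal:
            return child_path
        if child_g < best_g.get(child, float("inf")):
            best_g[child] = child_g
            push_actions(child, child_g, child_path)
    return None

path = q_star_search((0, 0), (3, 2))
print(len(path))  # 5
```

Compared with A* search, which would generate all four children of a node and evaluate a heuristic on each one, this variant defers child generation until a (state, action) pair is actually selected, which is where the savings in time and memory are expected to come from when the action space is large.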
Main Results
- Accurately reconstructed ground-truth images after thousands of timesteps.
- Achieved 100% success on Rubik's Cube (canonical goal), Sokoban, IceSlider, and DigitJump.
- Achieved 99.9% success on Rubik's Cube with reversed start/goal states.
- Demonstrated significant improvement in solving complex planning problems and in generalizing to unseen goals.