Description
CodeSLAM represents the 3D geometry of a scene using the latent space of a variational autoencoder. The depth thus becomes a function of the RGB image and an unknown code. During training, the weights of the network are learnt by training the generator and encoder on a standard autoencoding task. At test time, the code and the camera poses of the images are found by optimizing the reprojection error over multiple images.
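
The test-time step can be illustrated with a toy sketch. This is not the CodeSLAM implementation: the decoder is a random linear stand-in for a trained network, the camera is a hypothetical 1D camera with a known sideways translation, and only the code is optimized (the pose is held fixed for brevity). All names and constants (`decode_inverse_depth`, `FOCAL`, `BASELINE`, `basis`) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

N_PIXELS, CODE_DIM = 64, 4
FOCAL = 100.0     # hypothetical focal length, in pixels
BASELINE = 0.1    # hypothetical, known sideways camera translation


def decode_inverse_depth(image, code, basis):
    """Stand-in decoder: inverse depth as a function of image and code.

    A trained network would supply both terms; here the image-dependent
    part is a toy formula and the code-dependent part is a random basis.
    """
    zero_code_prediction = 0.5 + 0.1 * image
    return zero_code_prediction + basis @ code


def reprojection_residual(code, image, basis, u_ref, u_obs):
    """Predicted minus observed pixel position in the second view."""
    inv_depth = decode_inverse_depth(image, code, basis)
    u_pred = u_ref + FOCAL * BASELINE * inv_depth  # 1D parallax model
    return u_pred - u_obs


def optimize_code(image, basis, u_ref, u_obs, n_iters=3):
    """Gauss-Newton on the code, starting from the zero code."""
    code = np.zeros(basis.shape[1])
    # Jacobian of the residual w.r.t. the code; constant for this toy model.
    jac = FOCAL * BASELINE * basis
    for _ in range(n_iters):
        r = reprojection_residual(code, image, basis, u_ref, u_obs)
        step, *_ = np.linalg.lstsq(jac, -r, rcond=None)
        code = code + step
    return code


# Synthetic data: a random "image", a random decoder basis, and matches in
# the second view generated from a ground-truth code.
image = rng.uniform(0.0, 1.0, N_PIXELS)
basis = 0.02 * rng.standard_normal((N_PIXELS, CODE_DIM))
u_ref = np.arange(N_PIXELS, dtype=float)
true_code = rng.standard_normal(CODE_DIM)
u_obs = u_ref + FOCAL * BASELINE * decode_inverse_depth(image, true_code, basis)

est_code = optimize_code(image, basis, u_ref, u_obs)
final_residual = reprojection_residual(est_code, image, basis, u_ref, u_obs)
```

Because the toy residual is linear in the code, Gauss-Newton converges after a single step; with a nonlinear decoder or with the pose added as an unknown, several iterations and a reasonable initialization would be needed.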