Andrew Brock, Theodore Lim, J. M. Ritchie, Nick Weston
When working with three-dimensional data, choice of representation is key. We explore voxel-based models, and present evidence for the viability of voxellated representations in applications including shape modeling and object classification. Our key contributions are methods for training voxel-based variational autoencoders, a user interface for exploring the latent space learned by the autoencoder, and a deep convolutional neural network architecture for object classification. We address challenges unique to voxel-based representations, and empirically evaluate our models on the ModelNet benchmark, where we demonstrate a 51.5% relative improvement in the state of the art for object classification.
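To illustrate what a voxel-based representation looks like in practice, the following minimal sketch converts a set of 3D points into a binary occupancy grid. It is not the authors' preprocessing pipeline: the function name `voxelize`, the default resolution of 32, and the uniform-scaling normalization are all assumptions made for this example.

```python
import numpy as np


def voxelize(points, resolution=32):
    """Map an (N, 3) array of points to a binary occupancy grid.

    Hypothetical helper for illustration only; the resolution and the
    normalization scheme are assumptions, not taken from the paper.
    """
    points = np.asarray(points, dtype=np.float64)
    mins = points.min(axis=0)
    # Use the longest axis so the shape is scaled uniformly (no distortion).
    extent = (points.max(axis=0) - mins).max()
    if extent == 0:
        extent = 1.0  # degenerate input: all points coincide
    scaled = (points - mins) / extent * (resolution - 1)
    idx = np.clip(np.rint(scaled).astype(int), 0, resolution - 1)
    grid = np.zeros((resolution,) * 3, dtype=np.uint8)
    # Mark every voxel that contains at least one point as occupied.
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return grid
```

A grid produced this way can be fed to a 3D convolutional network as a single-channel volume. Higher resolutions preserve more detail, but memory grows cubically with the grid size.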
| Task | Dataset | Metric | Value (%) | Model |
|---|---|---|---|---|
| Shape Representation Of 3D Point Clouds | ModelNet40 | Mean Accuracy | 91.33 | VRN (multiple views) |
| Shape Representation Of 3D Point Clouds | ModelNet40 | Mean Accuracy | 88.98 | VRN (single view) |
| 3D Point Cloud Classification | ModelNet40 | Mean Accuracy | 91.33 | VRN (multiple views) |
| 3D Point Cloud Classification | ModelNet40 | Mean Accuracy | 88.98 | VRN (single view) |
| 3D Point Cloud Reconstruction | ModelNet40 | Mean Accuracy | 91.33 | VRN (multiple views) |
| 3D Point Cloud Reconstruction | ModelNet40 | Mean Accuracy | 88.98 | VRN (single view) |