James Thewlis, Hakan Bilen, Andrea Vedaldi
Automatically learning the structure of object categories remains an important open problem in computer vision. In this paper, we propose a novel unsupervised approach that can discover and learn landmarks in object categories, thus characterizing their structure. Our approach is based on factorizing image deformations, as induced by a viewpoint change or an object deformation, by learning a deep neural network that detects landmarks consistently with such visual effects. Furthermore, we show that the learned landmarks establish meaningful correspondences between different object instances in a category without having to impose this requirement explicitly. We assess the method qualitatively on a variety of object types, both natural and man-made. We also show that our unsupervised landmarks are highly predictive of manually annotated landmarks in face benchmark datasets, and can be used to regress these with a high degree of accuracy.
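The consistency requirement described in the abstract can be illustrated with a toy example: if an image is deformed by a known warp, a consistent detector should report landmarks that move with that warp. The sketch below is not the paper's network or loss. It assumes a hypothetical intensity-weighted centroid as the "detector" and an integer translation as the deformation, and all function names are illustrative.

```python
import numpy as np

def centroid_detector(image):
    """Toy stand-in for a learned detector: intensity-weighted centroid (row, col)."""
    rows, cols = np.indices(image.shape)
    total = image.sum()
    return np.array([(rows * image).sum() / total, (cols * image).sum() / total])

def constant_detector(image):
    """A detector that ignores the image and always returns the same point."""
    return np.array([16.0, 16.0])

def translate(image, shift):
    """Known deformation g: integer translation by (d_row, d_col), with wrap-around."""
    return np.roll(image, shift=shift, axis=(0, 1))

def equivariance_loss(detector, image, shift):
    """Squared distance between detector(g(image)) and g(detector(image))."""
    predicted = detector(translate(image, shift))
    expected = detector(image) + np.asarray(shift, dtype=float)
    return float(np.sum((predicted - expected) ** 2))

# A 32x32 image with one bright 3x3 blob, kept away from the borders
# so that the wrap-around in np.roll does not affect it.
image = np.zeros((32, 32))
image[10:13, 8:11] = 1.0

consistent_loss = equivariance_loss(centroid_detector, image, shift=(5, 7))
inconsistent_loss = equivariance_loss(constant_detector, image, shift=(5, 7))
print(consistent_loss, inconsistent_loss)
```

The centroid follows the blob, so its loss is zero, while the constant detector is penalized by the squared length of the shift. In the setting the abstract describes, the detector would be a deep network and the deformations would be richer than translations; this sketch only shows the shape of the constraint.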
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Facial Recognition and Modelling | 300W | NME | 7.97 | FSE |
| Facial Recognition and Modelling | MAFL | NME | 6.32 | Thewlis2017unsupervised |
| Facial Recognition and Modelling | MAFL | NME | 6.67 | FSE |
| Facial Recognition and Modelling | AFLW-MTFL | NME | 10.53 | FSE |
| Facial Recognition and Modelling | MAFL Unaligned | NME | 31.3 | ULD |
| Facial Landmark Detection | 300W | NME | 7.97 | FSE |
| Facial Landmark Detection | MAFL | NME | 6.32 | Thewlis2017unsupervised |
| Facial Landmark Detection | MAFL | NME | 6.67 | FSE |
| Facial Landmark Detection | AFLW-MTFL | NME | 10.53 | FSE |
| Facial Landmark Detection | MAFL Unaligned | NME | 31.3 | ULD |
| Face Reconstruction | 300W | NME | 7.97 | FSE |
| Face Reconstruction | MAFL | NME | 6.32 | Thewlis2017unsupervised |
| Face Reconstruction | MAFL | NME | 6.67 | FSE |
| Face Reconstruction | AFLW-MTFL | NME | 10.53 | FSE |
| Face Reconstruction | MAFL Unaligned | NME | 31.3 | ULD |
| 3D | 300W | NME | 7.97 | FSE |
| 3D | MAFL | NME | 6.32 | Thewlis2017unsupervised |
| 3D | MAFL | NME | 6.67 | FSE |
| 3D | AFLW-MTFL | NME | 10.53 | FSE |
| 3D | MAFL Unaligned | NME | 31.3 | ULD |
| 3D Face Modelling | 300W | NME | 7.97 | FSE |
| 3D Face Modelling | MAFL | NME | 6.32 | Thewlis2017unsupervised |
| 3D Face Modelling | MAFL | NME | 6.67 | FSE |
| 3D Face Modelling | AFLW-MTFL | NME | 10.53 | FSE |
| 3D Face Modelling | MAFL Unaligned | NME | 31.3 | ULD |
| 3D Face Reconstruction | 300W | NME | 7.97 | FSE |
| 3D Face Reconstruction | MAFL | NME | 6.32 | Thewlis2017unsupervised |
| 3D Face Reconstruction | MAFL | NME | 6.67 | FSE |
| 3D Face Reconstruction | AFLW-MTFL | NME | 10.53 | FSE |
| 3D Face Reconstruction | MAFL Unaligned | NME | 31.3 | ULD |
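For readers unfamiliar with the NME column, the following is a minimal sketch of how a normalized mean error is commonly computed for facial landmarks. It assumes normalization by the inter-ocular distance and reporting as a percentage, which is a widespread convention but is not stated in the table itself. The function name, the eye indices, and the sample coordinates are all hypothetical.

```python
import numpy as np

def normalized_mean_error(pred, target, left_eye_idx=0, right_eye_idx=1):
    """Mean landmark error per face, divided by inter-ocular distance, in percent.

    pred, target: arrays of shape (num_faces, num_landmarks, 2).
    The eye indices are an assumption about the landmark ordering.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    # Distance between the two annotated eye landmarks, one value per face.
    inter_ocular = np.linalg.norm(
        target[:, left_eye_idx] - target[:, right_eye_idx], axis=-1
    )
    # Euclidean error of each landmark, shape (num_faces, num_landmarks).
    per_point_error = np.linalg.norm(pred - target, axis=-1)
    per_face = per_point_error.mean(axis=1) / inter_ocular
    return float(per_face.mean() * 100.0)

# One made-up face with five landmarks; the eyes are 40 pixels apart.
target = np.array([[[30, 40], [70, 40], [50, 60], [35, 80], [65, 80]]], dtype=float)
# Every predicted landmark is off by (3, 4), a Euclidean error of 5 pixels.
pred = target + np.array([3.0, 4.0])

nme = normalized_mean_error(pred, target)
print(nme)
```

Here each landmark error is 5 pixels against an inter-ocular distance of 40 pixels, giving 12.5 percent. Lower values are better under this definition.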