Grigorios Chrysos, Stylianos Moschoglou, Giorgos Bouritsas, Jiankang Deng, Yannis Panagakis, Stefanos Zafeiriou
Deep Convolutional Neural Networks (DCNNs) are currently the method of choice for both generative and discriminative learning in computer vision and machine learning. The success of DCNNs can be attributed to the careful selection of their building blocks (e.g., residual blocks, rectifiers, and sophisticated normalization schemes, to mention but a few). In this paper, we propose $\Pi$-Nets, a new class of function approximators based on polynomial expansions. $\Pi$-Nets are polynomial neural networks, i.e., the output is a high-order polynomial of the input. The unknown parameters, which are naturally represented by high-order tensors, are estimated through a collective tensor factorization with factor sharing. We introduce three tensor decompositions that significantly reduce the number of parameters and show how they can be efficiently implemented by hierarchical neural networks. We empirically demonstrate that $\Pi$-Nets are very expressive: they produce good results even without non-linear activation functions across a large battery of tasks and signals, namely images, graphs, and audio. When used in conjunction with activation functions, $\Pi$-Nets produce state-of-the-art results in three challenging tasks, namely image generation, face verification, and 3D mesh representation learning. The source code is available at https://github.com/grigorisg9gr/polynomial_nets.
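To make the idea of "a high-order polynomial of the input implemented by a hierarchical network" concrete, the following is a minimal sketch and not the authors' implementation. It assumes one simple recursion, in which each layer multiplies the running representation elementwise by a fresh linear map of the input and adds a skip term. The function names (`matvec`, `poly_net`), the parameter names (`A`, `C`, `beta`), and the toy values are all hypothetical and chosen only for illustration. No activation function is used, so the output is a polynomial in `z` of degree `len(A)`.

```python
def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def poly_net(z, A, C, beta):
    """Evaluate a degree-len(A) polynomial of z without activations.

    Assumed recursion (illustrative only):
        x_1 = A[0] z
        x_n = (A[n-1] z) * x_{n-1} + x_{n-1}   (elementwise product)
        out = C x_N + beta
    """
    x = matvec(A[0], z)
    for A_n in A[1:]:
        gate = matvec(A_n, z)                  # linear in z
        # Each elementwise product raises the polynomial degree by one;
        # the added x keeps the lower-order terms.
        x = [g * xi + xi for g, xi in zip(gate, x)]
    return [o + b for o, b in zip(matvec(C, x), beta)]


# Toy scalar case: x_1 = 2z, x_2 = (3z)(2z) + 2z, out = 6z^2 + 2z + 0.5
A = [[[2.0]], [[3.0]]]
C = [[1.0]]
beta = [0.5]
print(poly_net([1.0], A, C, beta))  # [8.5]
```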
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Facial Recognition and Modelling | AgeDB-30 | Accuracy | 0.98467 | ProdPoly |
| Facial Recognition and Modelling | LFW | Accuracy | 0.99833 | ProdPoly |
| Facial Recognition and Modelling | CALFW | Accuracy | 0.96233 | ProdPoly |
| Image Generation | CIFAR-10 | FID | 16.79 | ProdPoly |
| Image Generation | CIFAR-10 | FID | 40.45 | ProdPoly no activation functions |
| Image Generation | CIFAR-10 | FID | 36.77 | ProdPoly no activation functions |
| Image Generation | CIFAR-10 | Inception score | 7.5 | ProdPoly no activation functions |
| Image Classification | CIFAR-10 | Percentage correct | 94.9 | ProdPoly |
| Face Reconstruction | AgeDB-30 | Accuracy | 0.98467 | ProdPoly |
| Face Reconstruction | LFW | Accuracy | 0.99833 | ProdPoly |
| Face Reconstruction | CALFW | Accuracy | 0.96233 | ProdPoly |
| Face Recognition | AgeDB-30 | Accuracy | 0.98467 | ProdPoly |
| Face Recognition | LFW | Accuracy | 0.99833 | ProdPoly |
| Face Recognition | CALFW | Accuracy | 0.96233 | ProdPoly |
| 3D | AgeDB-30 | Accuracy | 0.98467 | ProdPoly |
| 3D | LFW | Accuracy | 0.99833 | ProdPoly |
| 3D | CALFW | Accuracy | 0.96233 | ProdPoly |
| 3D Face Modelling | AgeDB-30 | Accuracy | 0.98467 | ProdPoly |
| 3D Face Modelling | LFW | Accuracy | 0.99833 | ProdPoly |
| 3D Face Modelling | CALFW | Accuracy | 0.96233 | ProdPoly |
| 3D Face Reconstruction | AgeDB-30 | Accuracy | 0.98467 | ProdPoly |
| 3D Face Reconstruction | LFW | Accuracy | 0.99833 | ProdPoly |
| 3D Face Reconstruction | CALFW | Accuracy | 0.96233 | ProdPoly |
| Conditional Image Generation | CIFAR-10 | FID | 36.77 | ProdPoly no activation functions |
| Conditional Image Generation | CIFAR-10 | Inception score | 7.5 | ProdPoly no activation functions |