WaveMix: A Resource-efficient Neural Network for Image Analysis

Pranav Jeevan, Kavitha Viswanathan, Anandu A S, Amit Sethi

2022-05-28
Tasks: Scene Classification, Image Classification, Semantic Segmentation

Abstract

We propose a novel neural architecture for computer vision -- WaveMix -- that is resource-efficient and yet generalizable and scalable. While using fewer trainable parameters, GPU RAM, and computations, WaveMix networks achieve comparable or better accuracy than the state-of-the-art convolutional neural networks, vision transformers, and token mixers for several tasks. This efficiency can translate to savings in time, cost, and energy. To achieve these gains we used multi-level two-dimensional discrete wavelet transform (2D-DWT) in WaveMix blocks, which has the following advantages: (1) It reorganizes spatial information based on three strong image priors -- scale-invariance, shift-invariance, and sparseness of edges -- (2) in a lossless manner without adding parameters, (3) while also reducing the spatial sizes of feature maps, which reduces the memory and time required for forward and backward passes, and (4) expanding the receptive field faster than convolutions do. The whole architecture is a stack of self-similar and resolution-preserving WaveMix blocks, which allows architectural flexibility for various tasks and levels of resource availability. WaveMix establishes new benchmarks for segmentation on Cityscapes; and for classification on Galaxy 10 DECals, Places-365, five EMNIST datasets, and iNAT-mini and performs competitively on other benchmarks. Our code and trained models are publicly available.
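
Properties (2) and (3) in the abstract, a lossless transform with no parameters that also halves the spatial size, can be illustrated with a single level of a 2D-DWT. The sketch below is an assumption-laden illustration, not the authors' implementation: it assumes the orthonormal Haar wavelet (this page does not name the wavelet used), works on plain nested lists, and uses the hypothetical function names `haar_dwt2` and `haar_idwt2`. Sub-band naming conventions (LH versus HL) vary between libraries.

```python
def haar_dwt2(image):
    """One level of an orthonormal 2D Haar DWT (assumed wavelet).

    `image` is a 2D list with even height and width. Returns four
    sub-bands, each with half the height and half the width.
    """
    rows, cols = len(image), len(image[0])
    assert rows % 2 == 0 and cols % 2 == 0, "height and width must be even"
    ll, lh, hl, hh = ([[0.0] * (cols // 2) for _ in range(rows // 2)]
                      for _ in range(4))
    for i in range(rows // 2):
        for j in range(cols // 2):
            # Each non-overlapping 2x2 block [[a, b], [c, d]] maps to
            # one coefficient in each of the four sub-bands.
            a, b = image[2 * i][2 * j], image[2 * i][2 * j + 1]
            c, d = image[2 * i + 1][2 * j], image[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2  # local average (approximation)
            lh[i][j] = (a - b + c - d) / 2  # difference between columns
            hl[i][j] = (a + b - c - d) / 2  # difference between rows
            hh[i][j] = (a - b - c + d) / 2  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the full-resolution image."""
    rows, cols = len(ll), len(ll[0])
    image = [[0.0] * (2 * cols) for _ in range(2 * rows)]
    for i in range(rows):
        for j in range(cols):
            s, x, y, z = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            image[2 * i][2 * j] = (s + x + y + z) / 2
            image[2 * i][2 * j + 1] = (s - x + y - z) / 2
            image[2 * i + 1][2 * j] = (s + x - y - z) / 2
            image[2 * i + 1][2 * j + 1] = (s - x - y + z) / 2
    return image


# A 4x4 input yields four 2x2 sub-bands; the inverse recovers the input.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
ll, lh, hl, hh = haar_dwt2(image)
restored = haar_idwt2(ll, lh, hl, hh)
```

The transform has no trainable weights, the four sub-bands together hold exactly as many values as the input, and the inverse recovers the input, which is the sense in which the reorganization is lossless. A multi-level variant would apply `haar_dwt2` again to the `ll` sub-band, halving the resolution once more at each level.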

Results

Task	Dataset	Metric	Value	Model
Semantic Segmentation	Cityscapes val	mIoU	82.7	WaveMix
Semantic Segmentation	Cityscapes val	mIoU	82.6	WaveMix-256/16 (Level-4)
Scene Classification	Places365-Standard	Top 1 Error	43.55	WaveMix
Image Classification	EMNIST-Balanced	Accuracy	91.06	WaveMixLite-128/7
Image Classification	Fashion-MNIST	Percentage error	5.68	WaveMixLite
Image Classification	Caltech-256	Accuracy	54.62	WaveMixLite-256/7
Image Classification	Places365-Standard	Top 1 Accuracy	56.45	WaveMix-240/12 (level 4)
Image Classification	EMNIST-Letters	Accuracy	95.96	WaveMixLite-112/16
Image Classification	CIFAR-10	Percentage correct	97.29	WaveMixLite-144/7
Image Classification	EMNIST-Byclass	Accuracy	88.43	WaveMixLite-128/7
Image Classification	iNat2021-mini	Top 1 Accuracy	61.75	WaveMix-256/16 (level 2)
Image Classification	EMNIST-Bymerge	Accuracy	91.8	WaveMixLite-128/16
Image Classification	EMNIST-Digits	Accuracy (%)	99.82	WaveMixLite-112/16
Image Classification	Galaxy10 DECals	PARAMS (M)	28	WaveMix
Image Classification	Galaxy10 DECals	Top-1 Accuracy (%)	95.42	WaveMix
Image Classification	CIFAR-100	Percentage correct	85.09	WaveMixLite-256/7
Image Classification	CIFAR-100	Percentage correct	70.2	WaveMix-Lite-256/7
Image Classification	STL-10	Percentage correct	70.88	WaveMixLite-256/7
Image Classification	SVHN	Percentage error	1.27	WaveMixLite-144/15
Image Classification	MNIST	Percentage error	0.25	WaveMixLite
