Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

WaveMix: A Resource-efficient Neural Network for Image Analysis

Pranav Jeevan, Kavitha Viswanathan, Anandu A S, Amit Sethi

2022-05-28. Tasks: Scene Classification, Image Classification, Semantic Segmentation

Abstract

We propose a novel neural architecture for computer vision -- WaveMix -- that is resource-efficient and yet generalizable and scalable. While using fewer trainable parameters, less GPU RAM, and fewer computations, WaveMix networks achieve comparable or better accuracy than state-of-the-art convolutional neural networks, vision transformers, and token mixers on several tasks. This efficiency can translate to savings in time, cost, and energy. To achieve these gains we use a multi-level two-dimensional discrete wavelet transform (2D-DWT) in WaveMix blocks, which has the following advantages: (1) it reorganizes spatial information based on three strong image priors -- scale-invariance, shift-invariance, and sparseness of edges; (2) it does so in a lossless manner without adding parameters; (3) it reduces the spatial sizes of feature maps, which reduces the memory and time required for forward and backward passes; and (4) it expands the receptive field faster than convolutions do. The whole architecture is a stack of self-similar and resolution-preserving WaveMix blocks, which allows architectural flexibility for various tasks and levels of resource availability. WaveMix establishes new benchmarks for segmentation on Cityscapes and for classification on Galaxy 10 DECals, Places-365, five EMNIST datasets, and iNAT-mini, and performs competitively on other benchmarks. Our code and trained models are publicly available.
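The properties the abstract attributes to the 2D-DWT (lossless, parameter-free, halving the spatial size) can be illustrated with a minimal NumPy sketch. It assumes the Haar wavelet, which the abstract does not specify, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical; this is not the authors' implementation.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar DWT of a 2D array with even height and width.

    Returns four sub-bands, each half the height and width of the input.
    Assumption: Haar wavelet; naming conventions for the detail bands vary.
    """
    a = x[0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # approximation (local average, scaled)
    lh = (a - b + c - d) / 2.0  # differences between neighbouring columns
    hl = (a + b - c - d) / 2.0  # differences between neighbouring rows
    hh = (a - b - c + d) / 2.0  # diagonal differences
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the full-resolution array exactly."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


# Toy 8x8 "feature map" (illustrative data only).
image = np.arange(64, dtype=np.float64).reshape(8, 8)

bands = haar_dwt2(image)
print([band.shape for band in bands])  # four (4, 4) sub-bands

# Lossless: the four half-size bands reconstruct the input.
print(np.allclose(haar_idwt2(*bands), image))

# A second level applied to the approximation band halves the size again,
# so each level-2 coefficient summarises a 4x4 region of the input.
level2 = haar_dwt2(bands[0])
print(level2[0].shape)  # (2, 2)
```

The transform has no trainable weights: the four output bands together hold exactly as many values as the input, only rearranged into half-resolution maps, and each additional level doubles the input region covered by one coefficient.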

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | Cityscapes val | mIoU | 82.7 | WaveMix |
| Semantic Segmentation | Cityscapes val | mIoU | 82.6 | WaveMix-256/16 (Level-4) |
| Scene Classification | Places365-Standard | Top 1 Error | 43.55 | WaveMix |
| Image Classification | EMNIST-Balanced | Accuracy | 91.06 | WaveMixLite-128/7 |
| Image Classification | Fashion-MNIST | Percentage error | 5.68 | WaveMixLite |
| Image Classification | Caltech-256 | Accuracy | 54.62 | WaveMixLite-256/7 |
| Image Classification | Places365-Standard | Top 1 Accuracy | 56.45 | WaveMix-240/12 (level 4) |
| Image Classification | EMNIST-Letters | Accuracy | 95.96 | WaveMixLite-112/16 |
| Image Classification | CIFAR-10 | Percentage correct | 97.29 | WaveMixLite-144/7 |
| Image Classification | EMNIST-Byclass | Accuracy | 88.43 | WaveMixLite-128/7 |
| Image Classification | iNat2021-mini | Top 1 Accuracy | 61.75 | WaveMix-256/16 (level 2) |
| Image Classification | EMNIST-Bymerge | Accuracy | 91.8 | WaveMixLite-128/16 |
| Image Classification | EMNIST-Digits | Accuracy (%) | 99.82 | WaveMixLite-112/16 |
| Image Classification | Galaxy10 DECals | Params (M) | 28 | WaveMix |
| Image Classification | Galaxy10 DECals | Top-1 Accuracy (%) | 95.42 | WaveMix |
| Image Classification | CIFAR-100 | Percentage correct | 85.09 | WaveMixLite-256/7 |
| Image Classification | CIFAR-100 | Percentage correct | 70.2 | WaveMix-Lite-256/7 |
| Image Classification | STL-10 | Percentage correct | 70.88 | WaveMixLite-256/7 |
| Image Classification | SVHN | Percentage error | 1.27 | WaveMixLite-144/15 |
| Image Classification | MNIST | Percentage error | 0.25 | WaveMixLite |
| 10-shot image generation | Cityscapes val | mIoU | 82.7 | WaveMix |
| 10-shot image generation | Cityscapes val | mIoU | 82.6 | WaveMix-256/16 (Level-4) |

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
Federated Learning for Commercial Image Sources (2025-07-17)
MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)