Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Ghost Module

Computer Vision · Introduced 2020 · 30 papers
Source Paper: GhostNet: More Features from Cheap Operations

Description

A Ghost Module is a building block for convolutional neural networks that aims to generate more feature maps while using fewer parameters. Specifically, an ordinary convolutional layer is split into two parts. The first part involves ordinary convolutions, but their total number is strictly controlled. Given the intrinsic feature maps from the first part, a series of simple linear operations is then applied to generate more feature maps.

Given the widespread redundancy in the intermediate feature maps computed by mainstream CNNs, Ghost modules aim to reduce it. In practice, given the input data $X\in\mathbb{R}^{c\times h\times w}$, where $c$ is the number of input channels and $h$ and $w$ are the height and width of the input data, respectively, the operation of an arbitrary convolutional layer producing $n$ feature maps can be formulated as

$$Y = X * f + b,$$

where $*$ is the convolution operation, $b$ is the bias term, $Y\in\mathbb{R}^{h'\times w'\times n}$ is the output feature map with $n$ channels, and $f\in\mathbb{R}^{c\times k\times k\times n}$ denotes the convolution filters in this layer. In addition, $h'$ and $w'$ are the height and width of the output data, and $k\times k$ is the kernel size of the convolution filters $f$. During this convolution procedure, the required number of FLOPs is $n\cdot h'\cdot w'\cdot c\cdot k\cdot k$, which is often in the hundreds of millions, since the number of filters $n$ and the channel number $c$ are generally very large (e.g. 256 or 512).
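As a quick sanity check of the FLOPs formula above, a short Python sketch can count the multiply-accumulate operations of one such layer (the function name `conv_flops` is illustrative, not from any library):

```python
# Multiply-accumulate count of an ordinary convolution producing n feature
# maps of size h_out x w_out from c input channels with k x k kernels,
# following n * h' * w' * c * k * k from the text (bias term ignored).
def conv_flops(n, h_out, w_out, c, k):
    return n * h_out * w_out * c * k * k

# A typical mid-network layer: 256 -> 256 channels, 3x3 kernels, 32x32 output.
print(conv_flops(n=256, h_out=32, w_out=32, c=256, k=3))  # 603979776
```

Even this single modestly sized layer already costs roughly $6\times 10^8$ FLOPs, which is the redundancy the Ghost module targets.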

Here, the number of parameters (in $f$ and $b$) to be optimized is explicitly determined by the dimensions of the input and output feature maps. The output feature maps of convolutional layers often contain considerable redundancy, and some of them can be similar to each other. It is therefore unnecessary to generate these redundant feature maps one by one with a large number of FLOPs and parameters. Instead, suppose the output feature maps are "ghosts" of a handful of intrinsic feature maps obtained through cheap transformations. These intrinsic feature maps are fewer in number and are produced by ordinary convolution filters. Specifically, $m$ intrinsic feature maps $Y'\in\mathbb{R}^{h'\times w'\times m}$ are generated using a primary convolution:

$$Y' = X * f',$$

where $f'\in\mathbb{R}^{c\times k\times k\times m}$ denotes the utilized filters, $m\leq n$, and the bias term is omitted for simplicity. The hyper-parameters, such as filter size, stride, and padding, are the same as in the ordinary convolution, so that the spatial size (i.e. $h'$ and $w'$) of the output feature maps stays consistent. To obtain the desired $n$ feature maps, a series of cheap linear operations is applied to each intrinsic feature map in $Y'$ to generate $s$ ghost features according to the following function:

$$y_{ij} = \Phi_{i,j}(y'_i), \quad \forall\, i = 1,\dots,m, \;\; j = 1,\dots,s,$$

where $y'_i$ is the $i$-th intrinsic feature map in $Y'$ and $\Phi_{i,j}$ is the $j$-th linear operation (except the last one) for generating the $j$-th ghost feature map $y_{ij}$; that is to say, each $y'_i$ can have one or more ghost feature maps $\{y_{ij}\}_{j=1}^{s}$. The last operation $\Phi_{i,s}$ is the identity mapping, which preserves the intrinsic feature maps. We can thus obtain $n = m\cdot s$ feature maps $Y = [y_{11}, y_{12}, \cdots, y_{ms}]$ as the output data of a Ghost module. Note that the linear operations $\Phi$ operate on each channel, so their computational cost is much lower than that of ordinary convolution. In practice, there can be several different linear operations in a Ghost module, e.g. $3\times 3$ and $5\times 5$ linear kernels, which are analyzed in the experimental part of the paper.
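The procedure above can be sketched in NumPy. This is a minimal illustration under simplifying assumptions, not the reference implementation: the primary convolution is taken to be $1\times 1$, the cheap operations $\Phi_{i,j}$ are single per-channel $3\times 3$ kernels, and the last $\Phi_{i,s}$ is the identity; the names `ghost_module`, `f_primary`, and `f_cheap` are ours.

```python
import numpy as np

def ghost_module(x, f_primary, f_cheap):
    """x: (c, h, w) input; f_primary: (m, c) 1x1 primary kernels;
    f_cheap: (m, s-1, 3, 3) per-channel cheap kernels.
    Returns (m*s, h, w), i.e. n = m*s output feature maps."""
    c, h, w = x.shape
    m, s_minus_1 = f_cheap.shape[:2]
    # Primary convolution: 1x1 kernels mix channels only -> m intrinsic maps Y'
    y_prime = np.tensordot(f_primary, x, axes=([1], [0]))      # (m, h, w)
    outputs = [y_prime]                                        # identity Phi_{i,s}
    xp = np.pad(y_prime, ((0, 0), (1, 1), (1, 1)))             # "same" padding
    for j in range(s_minus_1):                                 # cheap linear ops
        ghost = np.zeros_like(y_prime)
        for i in range(m):                                     # per-channel 3x3
            for a in range(3):
                for b in range(3):
                    ghost[i] += f_cheap[i, j, a, b] * xp[i, a:a + h, b:b + w]
        outputs.append(ghost)
    return np.concatenate(outputs, axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))                   # c = 8 input channels
out = ghost_module(x,
                   rng.standard_normal((4, 8)),        # m = 4 intrinsic maps
                   rng.standard_normal((4, 1, 3, 3)))  # s = 2: one cheap op + identity
print(out.shape)  # (8, 16, 16): n = m*s = 8 output maps
```

Because the cheap operations touch each channel independently, their cost per output pixel is $d\cdot d$ instead of $c\cdot k\cdot k$, which is where the roughly $s\times$ theoretical speedup over an ordinary convolution comes from.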

Papers Using This Method

- GRNN: Recurrent Neural Network based on Ghost Features for Video Super-Resolution (2025-05-14)
- Cross-video Identity Correlating for Person Re-identification Pre-training (2024-09-27)
- A Lightweight Insulator Defect Detection Model Based on Drone Images (2024-08-26)
- IDD-YOLOv5: A Lightweight Insulator Defect Real-time Detection Algorithm (2024-08-19)
- A lightweight YOLOv5-FFM model for occlusion pedestrian detection (2024-08-13)
- LiteYOLO-ID: A Lightweight Object Detection Network for Insulator Defect Detection (2024-06-24)
- Multimodal Emotion Recognition based on Facial Expressions, Speech, and EEG (2024-06-11)
- Ghost-Stereo: GhostNet-based Cost Volume Enhancement and Aggregation for Stereo Matching Networks (2024-05-23)
- GRAN: Ghost Residual Attention Network for Single Image Super Resolution (2023-02-28)
- Short-Term Memory Convolutions (2023-02-08)
- GhostNetV2: Enhance Cheap Operation with Long-Range Attention (2022-11-23)
- RepGhost: A Hardware-Efficient Ghost Module via Re-parameterization (2022-11-11)
- Network Amplification With Efficient MACs Allocation (2022-07-01)
- YOLOv5s-GTB: light-weighted and improved YOLOv5s for bridge crack detection (2022-06-03)
- MoCoViT: Mobile Convolutional Vision Transformer (2022-05-25)
- Efficient Convolutional Neural Networks on Raspberry Pi for Image Classification (2022-04-02)
- ThreshNet: An Efficient DenseNet Using Threshold Mechanism to Reduce Connections (2022-01-09)
- GPU-Net: Lightweight U-Net with more diverse features (2022-01-07)
- Ghost-dil-NetVLAD: A Lightweight Neural Network for Visual Place Recognition (2021-12-22)
- GhostShiftAddNet: More Features from Energy-Efficient Operations (2021-09-20)