Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Generalized Mean Pooling

Computer Vision · Introduced 2000 · 6 papers

Description

Generalized Mean Pooling (GeM) computes the generalized mean of each channel in a tensor. Formally:

$$\textbf{e} = \left[\left(\frac{1}{|\Omega|}\sum_{u\in\Omega} x^{p}_{cu}\right)^{\frac{1}{p}}\right]_{c=1,\cdots,C}$$

where $p > 0$ is a parameter. Setting this exponent to $p > 1$ increases the contrast of the pooled feature map and focuses on the salient features of the image. GeM is a generalization of the average pooling commonly used in classification networks ($p = 1$) and of the spatial max-pooling layer ($p = \infty$).
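The formula above can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation from any particular paper; the function name and the default $p = 3$ are choices made here for the example:

```python
import numpy as np

def gem_pool(x, p=3.0, eps=1e-6):
    """Generalized Mean Pooling over spatial locations.

    x: feature map of shape (C, H, W); returns a vector of shape (C,).
    p=1 recovers average pooling; as p grows, the result approaches
    the spatial maximum of each channel.
    """
    x = np.clip(x, eps, None)  # keep activations positive so x**p is well defined
    return (x ** p).mean(axis=(1, 2)) ** (1.0 / p)

# Example: pool a random 4-channel 7x7 feature map into a 4-dim descriptor
features = np.random.rand(4, 7, 7) + 0.1
descriptor = gem_pool(features, p=3.0)
```

Clamping with a small `eps` mirrors common practice in retrieval code, since the fractional power is only defined for positive activations (e.g. post-ReLU feature maps).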

Source: MultiGrain

Image Source: Eva Mohedano

Papers Using This Method

- Efficient Probabilistic Modeling of Crystallization at Mesoscopic Scale (2024-05-26)
- MinkUNeXt: Point Cloud-based Large-scale Place Recognition using 3D Sparse Convolutions (2024-03-12)
- GaitMM: Multi-Granularity Motion Sequence Learning for Gait Recognition (2022-09-18)
- Deep Learning Based Image Retrieval in the JPEG Compressed Domain (2021-07-08)
- Unifying Deep Local and Global Features for Image Search (2020-01-14)
- MultiGrain: a unified image embedding for classes and instances (2019-02-14)