(2+1)D Convolution

Computer Vision | Introduced 2000 | 17 papers

Description

A (2+1)D convolution is a type of convolution used in convolutional neural networks for action recognition, where the input is a spatiotemporal volume such as a stack of video frames. Rather than applying a 3D convolution over the entire volume, which can be computationally expensive and prone to overfitting, a (2+1)D convolution splits the computation into two successive convolutions: a spatial 2D convolution followed by a temporal 1D convolution.
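The factorization described above can be illustrated with a minimal pure-Python sketch. It makes several simplifying assumptions that are not part of the description: a single input and output channel, no padding or stride, no nonlinearity between the two steps, and cross-correlation (no kernel flipping), which is the convention deep learning libraries usually call "convolution". The function names `conv2d_spatial`, `conv1d_temporal`, and `conv2plus1d` are hypothetical and chosen only for this example.

```python
def conv2d_spatial(video, kernel):
    """Apply one 2D kernel to every frame independently (no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for frame in video:  # video is indexed as [time][row][col]
        h, w = len(frame), len(frame[0])
        out.append([
            [sum(frame[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(w - kw + 1)]
            for i in range(h - kh + 1)
        ])
    return out


def conv1d_temporal(video, kernel):
    """Apply one 1D kernel along the time axis at every pixel (no padding)."""
    kt = len(kernel)
    t, h, w = len(video), len(video[0]), len(video[0][0])
    return [
        [[sum(video[s + c][i][j] * kernel[c] for c in range(kt))
          for j in range(w)]
         for i in range(h)]
        for s in range(t - kt + 1)
    ]


def conv2plus1d(video, spatial_kernel, temporal_kernel):
    """Spatial 2D convolution followed by temporal 1D convolution."""
    return conv1d_temporal(conv2d_spatial(video, spatial_kernel),
                           temporal_kernel)


# Toy input: 4 frames of 4x4 pixels, all ones.
video = [[[1] * 4 for _ in range(4)] for _ in range(4)]
spatial_kernel = [[1] * 3 for _ in range(3)]  # 3x3 spatial kernel
temporal_kernel = [1, 1, 1]                   # length-3 temporal kernel

out = conv2plus1d(video, spatial_kernel, temporal_kernel)
print(len(out), len(out[0]), len(out[0][0]))  # output size as (time, rows, cols)
```

In this single-channel toy case the factorized pair uses 9 + 3 = 12 weights, whereas a full 3×3×3 kernel would use 27. Multi-channel networks typically also insert intermediate channels and a nonlinearity between the two steps, which this sketch leaves out.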

Papers Using This Method

Multi-Microphone and Multi-Modal Emotion Recognition in Reverberant Environment (2024-09-14)
Use of a Multiscale Vision Transformer to predict Nursing Activities Score from Low Resolution Thermal Videos in an Intensive Care Unit (2024-05-30)
Temporal Contrastive Learning with Curriculum (2022-09-02)
Motion-Focused Contrastive Learning of Video Representations (2022-01-11)
ByteTrack: Multi-Object Tracking by Associating Every Detection Box (2021-10-13)
Self-Supervised Video Representation Learning with Meta-Contrastive Network (2021-08-19)
Spatiotemporal Contrastive Learning of Facial Expressions in Videos (2021-08-06)
Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting (2021-06-18)
You Only Learn One Representation: Unified Network for Multiple Tasks (2021-05-10)
The 3TConv: An Intrinsic Approach to Explainable 3D CNNs (2021-01-01)
Self-supervised Video Representation Learning by Uncovering Spatio-temporal Statistics (2020-08-31)
Late Temporal Modeling in 3D CNN Architectures with BERT for Action Recognition (2020-08-03)
RT3D: Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices (2020-07-20)
Neural Graph Collaborative Filtering (2019-05-20)
Spatiotemporal CNNs for Pornography Detection in Videos (2018-10-24)
Multi-Fiber Networks for Video Recognition (2018-07-30)
A Closer Look at Spatiotemporal Convolutions for Action Recognition (2017-11-30)