Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


TAM

Temporal Adaptive Module

General · Introduced 2020 · 32 papers
Source Paper

Description

TAM is designed to capture complex temporal relationships both efficiently and flexibly. It adopts an adaptive kernel instead of self-attention to capture global contextual information, with lower time complexity than GLTR.

TAM has two branches, a local branch and a global branch. Given the input feature map $X \in \mathbb{R}^{C\times T\times H\times W}$, global spatial average pooling $\text{GAP}$ is first applied to the feature map to keep TAM's computational cost low. The local branch then employs several 1D convolutions with ReLU nonlinearity across the temporal domain to produce location-sensitive importance maps for enhancing frame-wise features:

\begin{align} s &= \sigma(\text{Conv1D}(\delta(\text{Conv1D}(\text{GAP}(X))))) \\ X^1 &= s X \end{align}

Unlike the local branch, the global branch is location-invariant and focuses on generating a channel-wise adaptive kernel based on global temporal information in each channel. For the $c$-th channel, the kernel can be written as

\begin{align} \Theta_c = \text{Softmax}(\text{FC}_2(\delta(\text{FC}_1(\text{GAP}(X)_c)))) \end{align}

where $\Theta_c \in \mathbb{R}^{K}$ and $K$ is the adaptive kernel size. Finally, TAM convolves the adaptive kernel $\Theta$ with $X^1$:

\begin{align} Y = \Theta \otimes X^1 \end{align}

With the help of the local branch and global branch, TAM can capture the complex temporal structures in video and enhance per-frame features at low computational cost. Due to its flexibility and lightweight design, TAM can be added to any existing 2D CNNs.
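The two branches described above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the weights are random placeholders standing in for learned parameters, the 1D convolutions of the local branch are modeled as channel-mixing matrices applied per time step, and the hidden-layer widths (`C // 4`, `2 * T`) are illustrative assumptions. The shapes and the adaptive kernel size `K` follow the equations above.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def temporal_adaptive_module(X, K=3, rng=None):
    """Sketch of TAM for a single clip X of shape (C, T, H, W).

    Weights are random placeholders; a real module learns them. Assumes
    C is divisible by 4 (illustrative bottleneck width) and K is odd.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    C, T, H, W = X.shape
    g = X.mean(axis=(2, 3))                     # GAP(X): (C, T)

    # ---- local branch: importance map s, then X^1 = s * X ----
    W1 = rng.standard_normal((C // 4, C)) * 0.1
    W2 = rng.standard_normal((C, C // 4)) * 0.1
    z = W2 @ np.maximum(W1 @ g, 0.0)            # two "convs" with ReLU between
    s = 1.0 / (1.0 + np.exp(-z))                # sigmoid, (C, T)
    X1 = X * s[:, :, None, None]                # frame-wise reweighting

    # ---- global branch: per-channel adaptive kernel Theta_c in R^K ----
    F1 = rng.standard_normal((2 * T, T)) * 0.1
    F2 = rng.standard_normal((K, 2 * T)) * 0.1
    Y = np.empty_like(X1)
    for c in range(C):
        theta = softmax(F2 @ np.maximum(F1 @ g[c], 0.0))   # (K,)
        # depthwise temporal convolution Y_c = Theta_c (*) X^1_c
        pad = np.pad(X1[c], ((K // 2, K // 2), (0, 0), (0, 0)), mode="edge")
        Y[c] = sum(theta[k] * pad[k:k + T] for k in range(K))
    return Y
```

In a real network the global-branch convolution is depthwise (one adaptive kernel per channel, as in the per-channel loop here), which is what keeps the module lightweight enough to drop into an existing 2D CNN.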

Papers Using This Method

- Topology-Aware Modeling for Unsupervised Simulation-to-Reality Point Cloud Recognition (2025-06-26)
- Motion-enhancement to Echocardiography Segmentation via Inserting a Temporal Attention Module: An Efficient, Adaptable, and Scalable Approach (2025-01-24)
- Threshold Attention Network for Semantic Segmentation of Remote Sensing Images (2025-01-14)
- Torque-Aware Momentum (2024-12-25)
- EntityCLIP: Entity-Centric Image-Text Matching via Multimodal Attentive Contrastive Learning (2024-10-23)
- Parameter Estimation in Optimal Tolling for Traffic Networks Under the Markovian Traffic Equilibrium (2024-09-29)
- User Story Tutor (UST) to Support Agile Software Developers (2024-06-24)
- Digital Health and Indoor Air Quality: An IoT-Driven Human-Centred Visualisation Platform for Behavioural Change and Technology Acceptance (2024-05-20)
- Enhancing Multivariate Time Series Forecasting with Mutual Information-driven Cross-Variable and Temporal Modeling (2024-03-01)
- Minimizing Energy Consumption in MU-MIMO via Antenna Muting by Neural Networks with Asymmetric Loss (2023-06-08)
- Truncated Affinity Maximization: One-class Homophily Modeling for Graph Anomaly Detection (2023-05-29)
- Improve Video Representation with Temporal Adversarial Augmentation (2023-04-28)
- Arc-based Traffic Assignment: Equilibrium Characterization and Learning (2023-04-10)
- Spectral Gap Regularization of Neural Networks (2023-04-06)
- FGAHOI: Fine-Grained Anchors for Human-Object Interaction Detection (2023-01-08)
- MangngalApp -- An integrated package of technology for COVID-19 response and rural development: Acceptability and usability using TAM (2023-01-07)
- Consumer acceptance of the use of artificial intelligence in online shopping: evidence from Hungary (2022-12-26)
- Dual Prototype Attention for Unsupervised Video Object Segmentation (2022-11-22)
- Trust in AI and Its Role in the Acceptance of AI Technologies (2022-03-23)
- Motion-driven Visual Tempo Learning for Video-based Action Recognition (2022-02-24)