
CT-Net: Channel Tensorization Network for Video Classification

Kunchang Li, Xianhang Li, Yali Wang, Jun Wang, Yu Qiao

Date: 2021-06-03
Venue: ICLR 2021
Tasks: Action Classification, Video Classification, Action Recognition, Classification

Abstract

3D convolution is powerful for video classification but often computationally expensive, so recent studies mainly focus on decomposing it along the spatial-temporal and/or channel dimensions. Unfortunately, most approaches fail to strike a good balance between convolutional efficiency and sufficient feature interaction. We therefore propose a concise and novel Channel Tensorization Network (CT-Net), which treats the channel dimension of the input feature as a product of K sub-dimensions. On the one hand, this naturally factorizes convolution across multiple dimensions, leading to a light computational burden. On the other hand, it effectively enhances feature interaction across different channels and progressively enlarges the 3D receptive field of that interaction to boost classification accuracy. Furthermore, we equip our CT-Module with a Tensor Excitation (TE) mechanism, which learns to exploit spatial, temporal, and channel attention in a high-dimensional manner to improve the cooperative power of all feature dimensions in the CT-Module. Finally, we flexibly adapt ResNet as our CT-Net. Extensive experiments are conducted on several challenging video benchmarks, e.g., Kinetics-400 and Something-Something V1 and V2. Our CT-Net outperforms a number of recent SOTA approaches in terms of accuracy and/or efficiency. The code and models will be available at https://github.com/Andy1621/CT-Net.
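The efficiency argument in the abstract can be made concrete with a rough weight count. The sketch below is illustrative only: it assumes that each of the K sub-dimensions is handled by a grouped convolution that mixes only the channels along that sub-dimension, which is one plausible reading of "factorizes convolution in a multiple dimension way" and not a confirmed description of the CT-Module. The function names `full_conv_params` and `tensorized_conv_params` are hypothetical, and biases are ignored.

```python
from math import prod


def full_conv_params(channels, kernel):
    """Weights in a dense 3D conv mapping `channels` -> `channels`."""
    return channels * channels * prod(kernel)


def tensorized_conv_params(sub_dims, kernels):
    """Weights when the channel axis is viewed as C = C_1 * ... * C_K.

    Assumption (not taken from the paper): the k-th convolution is a
    grouped conv with C / C_k groups, each mapping C_k -> C_k channels.
    """
    channels = prod(sub_dims)
    total = 0
    for c_k, kernel in zip(sub_dims, kernels):
        groups = channels // c_k
        total += groups * c_k * c_k * prod(kernel)
    return total


if __name__ == "__main__":
    # Hypothetical example: C = 64 split as 8 x 8, all kernels 3x3x3.
    dense = full_conv_params(64, (3, 3, 3))
    factored = tensorized_conv_params((8, 8), [(3, 3, 3), (3, 3, 3)])
    print(dense, factored, dense / factored)
```

Under these assumptions, each factorized convolution costs C · C_k · (kernel volume) weights instead of C² · (kernel volume), so the saving grows as the sub-dimensions get smaller relative to C.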

Results

Task | Dataset | Metric | Value | Model
Video | Kinetics-400 | Acc@1 | 79.8 | CT-Net Ensemble
Activity Recognition | Something-Something V1 | Top 1 Accuracy | 56.6 | CT-Net Ensemble (R50, 8+12+16+24)
Activity Recognition | Something-Something V2 | GFLOPs | 280 | CT-Net Ensemble (R50, 8+12+16+24)
Activity Recognition | Something-Something V2 | Parameters | 83.8 | CT-Net Ensemble (R50, 8+12+16+24)
Activity Recognition | Something-Something V2 | Top-1 Accuracy | 67.8 | CT-Net Ensemble (R50, 8+12+16+24)
Activity Recognition | Something-Something V2 | Top-5 Accuracy | 91.1 | CT-Net Ensemble (R50, 8+12+16+24)
Action Recognition | Something-Something V1 | Top 1 Accuracy | 56.6 | CT-Net Ensemble (R50, 8+12+16+24)
Action Recognition | Something-Something V2 | GFLOPs | 280 | CT-Net Ensemble (R50, 8+12+16+24)
Action Recognition | Something-Something V2 | Parameters | 83.8 | CT-Net Ensemble (R50, 8+12+16+24)
Action Recognition | Something-Something V2 | Top-1 Accuracy | 67.8 | CT-Net Ensemble (R50, 8+12+16+24)
Action Recognition | Something-Something V2 | Top-5 Accuracy | 91.1 | CT-Net Ensemble (R50, 8+12+16+24)
