
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Boundary-sensitive Pre-training for Temporal Localization in Videos

Mengmeng Xu, Juan-Manuel Perez-Rua, Victor Escorcia, Brais Martinez, Xiatian Zhu, Li Zhang, Bernard Ghanem, Tao Xiang

2020-11-21 | ICCV 2021 10
Tasks: Action Classification, Temporal Localization, General Classification, Classification, Temporal Action Localization

Abstract

Many video analysis tasks require temporal localization and thus the detection of content changes. However, most existing models developed for these tasks are pre-trained on general video action classification tasks. This is because large-scale annotation of temporal boundaries in untrimmed videos is expensive. Therefore, no suitable datasets exist for temporal boundary-sensitive pre-training. In this paper, for the first time, we investigate model pre-training for temporal localization by introducing a novel boundary-sensitive pretext (BSP) task. Instead of relying on costly manual annotations of temporal boundaries, we propose to synthesize temporal boundaries in existing video action classification datasets. With the synthesized boundaries, BSP can be conducted simply by classifying the boundary types. This enables the learning of video representations that are much more transferable to downstream temporal localization tasks. Extensive experiments show that the proposed BSP is superior and complementary to the existing action-classification-based pre-training counterpart, and achieves new state-of-the-art performance on several temporal localization tasks.
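
The idea of synthesizing boundaries from trimmed, class-labelled clips can be illustrated with a minimal sketch. This is not the authors' implementation: the two boundary types, the function name `synthesize_sample`, the 50/50 sampling ratio, and the representation of a video as a list of frames are all assumptions made for illustration, since the abstract does not specify them.

```python
import random

# Hypothetical boundary types for illustration only; the abstract does not
# list the types the authors actually use.
NO_BOUNDARY = 0
CLASS_CHANGE = 1


def synthesize_sample(dataset, clip_len, rng):
    """Build one pretext sample from trimmed, class-labelled videos.

    dataset: dict mapping class name -> list of videos (each a list of frames).
             Assumes at least two classes and videos of >= clip_len frames.
    Returns (frames, boundary_type, boundary_index); boundary_index is None
    when no boundary was synthesized.
    """
    classes = sorted(dataset)
    cls_a = rng.choice(classes)
    video_a = rng.choice(dataset[cls_a])

    if rng.random() < 0.5:
        # Negative sample: a contiguous clip taken from a single video.
        start = rng.randrange(len(video_a) - clip_len + 1)
        return video_a[start:start + clip_len], NO_BOUNDARY, None

    # Positive sample: splice in frames from a video of a different class,
    # which creates an artificial content change at index `cut`.
    cls_b = rng.choice([c for c in classes if c != cls_a])
    video_b = rng.choice(dataset[cls_b])
    cut = rng.randrange(1, clip_len)  # number of frames taken from video_a
    frames = video_a[:cut] + video_b[:clip_len - cut]
    return frames, CLASS_CHANGE, cut
```

Under these assumptions, a video encoder would be trained to predict `boundary_type` from `frames`, so the labels come for free from the existing class annotations; the training loop itself is omitted here.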

Results

Task                          Dataset          Metric        Value  Model
Video                         ActivityNet-1.3  mAP           34.75  G-TAD+BSP
Video                         ActivityNet-1.3  mAP IOU@0.5   50.94  G-TAD+BSP
Video                         ActivityNet-1.3  mAP IOU@0.75  35.61  G-TAD+BSP
Video                         ActivityNet-1.3  mAP IOU@0.95  7.98   G-TAD+BSP
Temporal Action Localization  ActivityNet-1.3  mAP           34.75  G-TAD+BSP
Temporal Action Localization  ActivityNet-1.3  mAP IOU@0.5   50.94  G-TAD+BSP
Temporal Action Localization  ActivityNet-1.3  mAP IOU@0.75  35.61  G-TAD+BSP
Temporal Action Localization  ActivityNet-1.3  mAP IOU@0.95  7.98   G-TAD+BSP
Zero-Shot Learning            ActivityNet-1.3  mAP           34.75  G-TAD+BSP
Zero-Shot Learning            ActivityNet-1.3  mAP IOU@0.5   50.94  G-TAD+BSP
Zero-Shot Learning            ActivityNet-1.3  mAP IOU@0.75  35.61  G-TAD+BSP
Zero-Shot Learning            ActivityNet-1.3  mAP IOU@0.95  7.98   G-TAD+BSP
Action Localization           ActivityNet-1.3  mAP           34.75  G-TAD+BSP
Action Localization           ActivityNet-1.3  mAP IOU@0.5   50.94  G-TAD+BSP
Action Localization           ActivityNet-1.3  mAP IOU@0.75  35.61  G-TAD+BSP
Action Localization           ActivityNet-1.3  mAP IOU@0.95  7.98   G-TAD+BSP
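
The "mAP IOU@t" metrics count a predicted segment as correct only when its temporal overlap with a ground-truth segment reaches the threshold t. A minimal sketch of that overlap test follows, assuming segments are (start, end) pairs in seconds. The function names are illustrative, the inclusive `>=` comparison is an assumption, and this is not the official evaluation code, which also handles score ranking and matching of predictions to ground truth.

```python
def temporal_iou(pred, gt):
    """Intersection over union of two temporal segments (start, end)."""
    # Overlap is clamped at zero for disjoint segments.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, gt, threshold):
    """True if the prediction overlaps the ground truth enough at this threshold."""
    return temporal_iou(pred, gt) >= threshold
```

For example, a prediction of (10, 20) against a ground truth of (12, 22) overlaps for 8 seconds out of a 12-second union, an IoU of about 0.67, so it would match at a threshold of 0.5 but not at 0.75 or 0.95. This is why the reported values fall as the threshold tightens.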

Related Papers

2025-07-17  Adversarial attacks to image classification systems using evolutionary algorithms
2025-07-16  Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation
2025-07-16  Safeguarding Federated Learning-based Road Condition Classification
2025-07-16  DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition
2025-07-13  AI-Enhanced Pediatric Pneumonia Detection: A CNN-Based Approach Using Data Augmentation and Generative Adversarial Networks (GANs)
2025-07-06  Fuzzy Classification Aggregation for a Continuum of Agents
2025-07-04  Hybrid-View Attention for csPCa Classification in TRUS
2025-06-26  Devising a solution to the problems of Cancer awareness in Telangana