Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Query Twice: Dual Mixture Attention Meta Learning for Video Summarization

Junyan Wang, Yang Bai, Yang Long, BingZhang Hu, Zhenhua Chai, Yu Guan, Xiaolin Wei

2020-08-19 | Meta-Learning | Supervised Video Summarization | Video Summarization

Abstract

Video summarization aims to select representative frames that retain high-level information, and is usually solved by predicting segment-wise importance scores via a softmax function. However, the softmax function struggles to retain high-rank representations of complex visual or sequential information, which is known as the Softmax Bottleneck problem. In this paper, we propose a novel framework for video summarization, the Dual Mixture Attention model with Meta Learning (DMASum), that tackles the softmax bottleneck problem. Its Mixture of Attention (MoA) layer increases model capacity by applying self-query attention twice, which captures second-order changes in addition to the initial query-key attention. A novel Single Frame Meta Learning rule is then introduced to improve generalization to small datasets with limited training data. Furthermore, DMASum exploits both visual and sequential attention, connecting local key-frame and global attention in an accumulative way. We adopt the new evaluation protocol on two public datasets, SumMe and TVSum. Both qualitative and quantitative experiments demonstrate significant improvements over state-of-the-art methods.
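
The softmax bottleneck mentioned in the abstract can be illustrated with the standard rank argument. The sketch below is not taken from this page: the symbols n (number of segments), d (attention dimension), Q, K, and the mixture weights are assumed notation, and the mixture shown is the generic mixture-of-softmaxes form, not necessarily the exact MoA layer.

```latex
% Assumed notation: Q, K \in \mathbb{R}^{n \times d} are queries and keys
% for n segments with attention dimension d.
\begin{align}
A &= \frac{Q K^{\top}}{\sqrt{d}},
  & \operatorname{rank}(A) &\le d, \\
P &= \operatorname{softmax}(A) \ \text{(row-wise)},
  & \log P &= A - c\,\mathbf{1}^{\top},
  \quad c_i = \log \sum_{j=1}^{n} e^{A_{ij}}, \\
 &
  & \operatorname{rank}(\log P) &\le d + 1 .
\end{align}
% A mixture of M softmaxes is not subject to this bound, because the
% logarithm of a sum is not linear in the individual logits A_m.
\begin{equation}
P_{\mathrm{mix}} = \sum_{m=1}^{M} \pi_m \operatorname{softmax}(A_m),
\qquad \pi_m \ge 0, \quad \sum_{m=1}^{M} \pi_m = 1 .
\end{equation}
```

Under these assumptions, a single softmax attention map cannot represent a log-attention matrix whose rank exceeds d + 1, which is the limitation a mixture is meant to relax. The weights may also vary per row in some formulations.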

Results

Task                 Dataset  Metric                Value  Model
Video                TvSum    F1-score (Canonical)  61.4   DMASum
Video                TvSum    Kendall's Tau         0.203  DMASum
Video                TvSum    Spearman's Rho        0.267  DMASum
Video                SumMe    F1-score (Canonical)  54.3   DMASum
Video                SumMe    Kendall's Tau         0.063  DMASum
Video                SumMe    Spearman's Rho        0.089  DMASum
Video Summarization  TvSum    F1-score (Canonical)  61.4   DMASum
Video Summarization  TvSum    Kendall's Tau         0.203  DMASum
Video Summarization  TvSum    Spearman's Rho        0.267  DMASum
Video Summarization  SumMe    F1-score (Canonical)  54.3   DMASum
Video Summarization  SumMe    Kendall's Tau         0.063  DMASum
Video Summarization  SumMe    Spearman's Rho        0.089  DMASum
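
The rank-correlation metrics in the table compare a predicted importance ordering with a reference ordering. The following is a minimal sketch of how such coefficients can be computed, not the evaluation code behind these results. The function names and the example scores are hypothetical, the tau-b variant is an assumption (the page does not say which variant is reported), and any averaging over annotators or videos is left out.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(pred, ref):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(pred), average_ranks(ref))


def kendall_tau_b(pred, ref):
    """Kendall's tau-b (assumed variant), which corrects for ties."""
    concordant = discordant = ties_pred = ties_ref = 0
    for i, j in combinations(range(len(pred)), 2):
        d_pred = pred[i] - pred[j]
        d_ref = ref[i] - ref[j]
        if d_pred == 0:
            ties_pred += 1
        if d_ref == 0:
            ties_ref += 1
        if d_pred != 0 and d_ref != 0:
            if d_pred * d_ref > 0:
                concordant += 1
            else:
                discordant += 1
    total = len(pred) * (len(pred) - 1) // 2
    return (concordant - discordant) / sqrt((total - ties_pred) * (total - ties_ref))


# Hypothetical per-segment importance scores, for illustration only.
predicted = [0.9, 0.1, 0.4, 0.7, 0.3]
reference = [0.8, 0.2, 0.5, 0.4, 0.3]
print(round(spearman_rho(predicted, reference), 3))   # 0.9
print(round(kendall_tau_b(predicted, reference), 3))  # 0.8
```

Both coefficients lie in [-1, 1], with 0 indicating no rank agreement, so they are on a different scale from the F1 scores in the table.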
