
Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity-Representativeness Reward

Kaiyang Zhou, Yu Qiao, Tao Xiang

2017-12-29

Sequential Decision Making, Reinforcement Learning, Unsupervised Video Summarization, Supervised Video Summarization, Decision Making, Video Summarization, reinforcement-learning

Abstract

Video summarization aims to facilitate large-scale video browsing by producing short, concise summaries that are diverse and representative of the original videos. In this paper, we formulate video summarization as a sequential decision-making process and develop a deep summarization network (DSN) to summarize videos. DSN predicts for each video frame a probability, which indicates how likely the frame is to be selected, and then takes actions based on the probability distributions to select frames, forming video summaries. To train our DSN, we propose an end-to-end, reinforcement learning-based framework, where we design a novel reward function that jointly accounts for the diversity and representativeness of generated summaries and does not rely on labels or user interactions at all. During training, the reward function judges how diverse and representative the generated summaries are, while DSN strives to earn higher rewards by learning to produce more diverse and more representative summaries. Since labels are not required, our method can be fully unsupervised. Extensive experiments on two benchmark datasets show that our unsupervised method not only outperforms other state-of-the-art unsupervised methods, but is also comparable to or even superior to most published supervised approaches.
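The abstract describes the pipeline only in words, so the following is a minimal NumPy sketch of the two ideas it names: sampling frame selections from per-frame probabilities, and scoring a selection with a reward that adds a diversity term to a representativeness term. The concrete formulas are assumptions made for illustration (mean pairwise cosine dissimilarity for diversity, and the exponential of the negative mean distance from each frame to its nearest selected frame for representativeness), as are all function names. It is not the authors' code and omits the network and the policy-gradient update.

```python
import numpy as np


def sample_summary(probs, rng):
    """Sample a binary selection mask, one Bernoulli draw per frame."""
    return rng.random(probs.shape) < probs


def diversity_reward(features, selected):
    """Mean pairwise cosine dissimilarity among selected frames (assumed form)."""
    idx = np.flatnonzero(selected)
    n = idx.size
    if n < 2:
        return 0.0  # no pairs to compare
    x = features[idx]
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    dissimilarity = 1.0 - x @ x.T  # diagonal entries are (numerically) zero
    return float(dissimilarity.sum() / (n * (n - 1)))


def representativeness_reward(features, selected):
    """exp(-mean distance from every frame to its nearest selected frame) (assumed form)."""
    idx = np.flatnonzero(selected)
    if idx.size == 0:
        return 0.0  # an empty summary represents nothing
    diffs = features[:, None, :] - features[None, idx, :]
    dists = np.linalg.norm(diffs, axis=2)  # shape: (num_frames, num_selected)
    return float(np.exp(-dists.min(axis=1).mean()))


def summary_reward(features, selected):
    """Label-free reward: diversity plus representativeness."""
    return diversity_reward(features, selected) + representativeness_reward(features, selected)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(20, 8))      # stand-in for per-frame features
    probs = np.full(20, 0.3)                 # stand-in for the network's output
    mask = sample_summary(probs, rng)
    print(int(mask.sum()), summary_reward(features, mask))
```

In a training loop, a reward of this kind would be computed for sampled selections and used to weight the log-probabilities of those selections, so that no ground-truth summaries are needed.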

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Video | TvSum | F1-score | 57.6 | DR-DSN |
| Video | TvSum | Kendall's Tau | 0.02 | DR-DSN |
| Video | TvSum | Parameters (M) | 2.63 | DR-DSN |
| Video | TvSum | Spearman's Rho | 0.026 | DR-DSN |
| Video | TvSum | training time (s) | 58.8 | DR-DSN |
| Video | SumMe | F1-score | 41.4 | DR-DSN |
| Video | SumMe | Parameters (M) | 2.63 | DR-DSN |
| Video | SumMe | training time (s) | 19.8 | DR-DSN |
| Video | TvSum | F1-score (Augmented) | 59.8 | DR-DSN |
| Video | TvSum | F1-score (Canonical) | 58.1 | DR-DSN |
| Video | SumMe | F1-score (Augmented) | 43.9 | DR-DSN |
| Video | SumMe | F1-score (Canonical) | 42.1 | DR-DSN |
| Video Summarization | TvSum | F1-score | 57.6 | DR-DSN |
| Video Summarization | TvSum | Kendall's Tau | 0.02 | DR-DSN |
| Video Summarization | TvSum | Parameters (M) | 2.63 | DR-DSN |
| Video Summarization | TvSum | Spearman's Rho | 0.026 | DR-DSN |
| Video Summarization | TvSum | training time (s) | 58.8 | DR-DSN |
| Video Summarization | SumMe | F1-score | 41.4 | DR-DSN |
| Video Summarization | SumMe | Parameters (M) | 2.63 | DR-DSN |
| Video Summarization | SumMe | training time (s) | 19.8 | DR-DSN |
| Video Summarization | TvSum | F1-score (Augmented) | 59.8 | DR-DSN |
| Video Summarization | TvSum | F1-score (Canonical) | 58.1 | DR-DSN |
| Video Summarization | SumMe | F1-score (Augmented) | 43.9 | DR-DSN |
| Video Summarization | SumMe | F1-score (Canonical) | 42.1 | DR-DSN |

Related Papers

CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning (2025-07-18)
Graph-Structured Data Analysis of Component Failure in Autonomous Cargo Ships Based on Feature Fusion (2025-07-18)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback (2025-07-17)
VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks (2025-07-17)
QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation (2025-07-17)
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)