


Learning from Video and Text via Large-Scale Discriminative Clustering

Antoine Miech, Jean-Baptiste Alayrac, Piotr Bojanowski, Ivan Laptev, Josef Sivic

2017-07-27 | ICCV 2017 10
Tasks: Video Retrieval, Clustering, Video Alignment, Action Recognition, Temporal Action Localization

Abstract

Discriminative clustering has been successfully applied to a number of weakly supervised learning tasks. Such applications include person and action recognition, text-to-video alignment, object co-segmentation and colocalization in videos and images. One drawback of discriminative clustering, however, is its limited scalability. We address this issue and propose an online optimization algorithm based on the Block-Coordinate Frank-Wolfe algorithm. We apply the proposed method to the problem of weakly supervised learning of actions and actors from movies together with corresponding movie scripts. Scaling up the learning problem to 66 feature-length movies enables us to significantly improve weakly supervised action recognition.
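To give a feel for the Block-Coordinate Frank-Wolfe idea named in the abstract, here is a minimal sketch on a toy problem. It is not the paper's online algorithm and does not use its discriminative clustering objective. It assumes a simple quadratic objective 0.5 * ||Y - T||^2 in which each row of Y is one block constrained to the probability simplex. The names `bcfw`, `objective`, and `T_toy`, as well as the toy numbers, are hypothetical and chosen only for illustration.

```python
import random


def objective(Y, T):
    """Toy objective (an assumption): 0.5 * squared distance between Y and T."""
    return 0.5 * sum((y - t) ** 2 for yr, tr in zip(Y, T) for y, t in zip(yr, tr))


def bcfw(T, num_iters=200, seed=0):
    """Generic block-coordinate Frank-Wolfe with exact line search.

    Each row of Y is a block that must stay on the probability simplex.
    """
    rng = random.Random(seed)
    n, c = len(T), len(T[0])
    Y = [[1.0 / c] * c for _ in range(n)]  # start every block at the simplex centre
    for _ in range(num_iters):
        i = rng.randrange(n)  # pick one block at random
        grad = [y - t for y, t in zip(Y[i], T[i])]  # gradient w.r.t. block i
        # Linear minimization oracle over the simplex: the best vertex e_j.
        j = min(range(c), key=grad.__getitem__)
        direction = [(1.0 if l == j else 0.0) - y for l, y in enumerate(Y[i])]
        gap = -sum(g * d for g, d in zip(grad, direction))  # block duality gap
        norm_sq = sum(d * d for d in direction)
        if gap <= 0.0 or norm_sq == 0.0:
            continue  # block is already optimal for the linearized problem
        gamma = min(1.0, gap / norm_sq)  # exact line search for a quadratic
        Y[i] = [y + gamma * d for y, d in zip(Y[i], direction)]
    return Y


# Made-up targets, one row per block; each row lies on the simplex.
T_toy = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8], [0.3, 0.4, 0.3]]
Y_start = [[1.0 / 3] * 3 for _ in T_toy]
Y_toy = bcfw(T_toy)
print(objective(Y_start, T_toy), objective(Y_toy, T_toy))
```

Each update touches a single block and needs only a linear minimization over that block's constraint set, which is what makes this family of methods attractive when the full problem is too large to process at once.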

Results

Task             Dataset  Metric                     Value  Model
Video            LSMDC    text-to-video Median Rank  52     Large-Scale Discriminative Clustering
Video            LSMDC    text-to-video R@1          7.3    Large-Scale Discriminative Clustering
Video            LSMDC    text-to-video R@10         27.1   Large-Scale Discriminative Clustering
Video            LSMDC    text-to-video R@5          19.2   Large-Scale Discriminative Clustering
Video Retrieval  LSMDC    text-to-video Median Rank  52     Large-Scale Discriminative Clustering
Video Retrieval  LSMDC    text-to-video R@1          7.3    Large-Scale Discriminative Clustering
Video Retrieval  LSMDC    text-to-video R@10         27.1   Large-Scale Discriminative Clustering
Video Retrieval  LSMDC    text-to-video R@5          19.2   Large-Scale Discriminative Clustering
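
The retrieval metrics listed above (R@1, R@5, R@10, and Median Rank) are typically computed from a query-by-video similarity matrix. The sketch below shows one common way to do that. It assumes that the correct video for text query i is video i and that ties are broken in favour of the correct video. The function names and the toy matrix are hypothetical and are not taken from the paper or its code.

```python
from statistics import median


def retrieval_ranks(similarity):
    """Return the 1-based rank of the correct video for each text query.

    similarity[i][j] is the score between query i and video j; the
    correct video for query i is assumed to be video i.
    """
    ranks = []
    for i, row in enumerate(similarity):
        target = row[i]
        # Rank = 1 + number of videos scored strictly higher than the correct one.
        ranks.append(1 + sum(1 for score in row if score > target))
    return ranks


def recall_at_k(ranks, k):
    """Percentage of queries whose correct video is ranked in the top k."""
    return 100.0 * sum(1 for r in ranks if r <= k) / len(ranks)


def median_rank(ranks):
    """Median position of the correct video (lower is better)."""
    return median(ranks)


# Toy 4x4 similarity matrix with made-up scores.
toy_similarity = [
    [0.9, 0.1, 0.3, 0.2],
    [0.8, 0.5, 0.1, 0.2],
    [0.2, 0.7, 0.4, 0.6],
    [0.1, 0.2, 0.3, 0.95],
]
toy_ranks = retrieval_ranks(toy_similarity)
print(toy_ranks, recall_at_k(toy_ranks, 1), median_rank(toy_ranks))
```

Higher R@k is better, while a lower Median Rank is better, which is why the two kinds of metric move in opposite directions when a model improves.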
