CoVR

Composed Video Retrieval

Computer Vision | Introduced 2000 | 4 papers

Description

Composed video retrieval (CoVR) is a new task in which the goal is to find a video that matches both a query image and a query text. The query image represents a visual concept that the user is interested in, and the query text specifies how that concept should be modified or refined. For example, given an image of a fountain and the text "during show at night", the goal is to retrieve a video that shows the fountain during a show at night.
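The retrieval setup described above can be illustrated with a minimal sketch. It assumes that the query image, the query text, and each candidate video have already been mapped to embeddings in a shared vector space. The fusion rule (summing the two normalized query embeddings) and all names here (`compose_query`, `retrieve`, the toy 3-dimensional vectors) are hypothetical choices for illustration, not the method of any of the listed papers.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))


def compose_query(image_emb, text_emb):
    """Hypothetical fusion: sum of the unit-length image and text embeddings."""
    i, t = normalize(image_emb), normalize(text_emb)
    return normalize([x + y for x, y in zip(i, t)])


def retrieve(image_emb, text_emb, video_embs, top_k=1):
    """Rank candidate videos by similarity to the composed query."""
    query = compose_query(image_emb, text_emb)
    ranked = sorted(
        video_embs,
        key=lambda name: cosine(query, video_embs[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy embeddings (assumed): axes loosely stand for "fountain", "night", "show".
image_emb = [1.0, 0.0, 0.0]   # query image: a fountain
text_emb = [0.0, 1.0, 1.0]    # query text: "during show at night"
video_embs = {
    "fountain_daytime": [1.0, 0.0, 0.0],
    "fountain_night_show": [1.0, 1.0, 1.0],
    "city_night": [0.0, 1.0, 0.0],
}

print(retrieve(image_emb, text_emb, video_embs, top_k=1))
```

In this toy setting the video that combines the fountain with the night show ranks first, because it is the only candidate that aligns with both the image and the text components of the composed query.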

Papers Using This Method

From Play to Replay: Composed Video Retrieval for Temporally Fine-Grained Videos (2025-06-05)
VDebugger: Harnessing Execution Feedback for Debugging Visual Programs (2024-06-19)
Composed Video Retrieval via Enriched Context and Discriminative Embeddings (2024-03-25)
CoVR-2: Automatic Data Construction for Composed Video Retrieval (2023-08-28)