Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Tracking Anything in High Quality

Jiawen Zhu, Zhenyu Chen, Zeqi Hao, Shijie Chang, Lu Zhang, Dong Wang, Huchuan Lu, Bin Luo, Jun-Yan He, Jin-Peng Lan, Hanyuan Chen, Chenyang Li

2023-07-26
Tasks: Visual Object Tracking, Semi-Supervised Video Object Segmentation, Semantic Segmentation, Video Object Segmentation, Object Tracking, Video Semantic Segmentation

Abstract

Visual object tracking is a fundamental video task in computer vision. Recently, the notable increase in the power of perception algorithms has allowed the unification of single/multi-object and box/mask-based tracking. Among these algorithms, the Segment Anything Model (SAM) has attracted much attention. In this report, we propose HQTrack, a framework for High Quality Tracking anything in videos. HQTrack mainly consists of a video multi-object segmenter (VMOS) and a mask refiner (MR). Given the object to be tracked in the initial frame of a video, VMOS propagates the object masks to the current frame. The mask results at this stage are not accurate enough, since VMOS is trained on several close-set video object segmentation (VOS) datasets and therefore has limited ability to generalize to complex and corner-case scenes. To further improve the quality of the tracking masks, a pretrained MR model is employed to refine the tracking results. As a testament to the effectiveness of our paradigm, without employing any tricks such as test-time data augmentation or model ensembling, HQTrack ranks 2nd in the Visual Object Tracking and Segmentation (VOTS2023) challenge. Code and models are available at https://github.com/jiawen-zhu/HQTrack.
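The two-stage flow described in the abstract (propagate masks, then refine them) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the HQTrack code: `track_video`, `toy_propagate`, and `toy_refine` are hypothetical names, the toy functions are trivial stand-ins for VMOS and MR, and the choice to propagate from the unrefined mask while returning the refined one is an assumption that the abstract does not confirm.

```python
from typing import Callable, List, Sequence

import numpy as np

Frame = np.ndarray  # image array (toy: 2-D intensity map)
Mask = np.ndarray   # integer label map, 0 = background, k > 0 = object id


def track_video(
    frames: Sequence[Frame],
    init_mask: Mask,
    propagate: Callable[[Frame, Mask], Mask],
    refine: Callable[[Frame, Mask], Mask],
) -> List[Mask]:
    """Propagate the first-frame mask through the video, refining each result.

    `propagate` stands in for a segmenter such as VMOS and `refine` for a
    mask refiner such as MR; both are hypothetical callables here.
    """
    if len(frames) == 0:
        return []
    outputs = [init_mask.copy()]  # the first frame's mask is given
    coarse = init_mask
    for frame in frames[1:]:
        # Stage 1: carry the object masks forward to the current frame.
        coarse = propagate(frame, coarse)
        # Stage 2: refine the coarse mask before reporting it.
        # Assumption: only the coarse mask is fed back into propagation.
        outputs.append(refine(frame, coarse))
    return outputs


def toy_propagate(frame: Frame, prev_mask: Mask) -> Mask:
    # Stand-in: assume objects do not move between frames.
    return prev_mask.copy()


def toy_refine(frame: Frame, coarse_mask: Mask) -> Mask:
    # Stand-in: drop mask pixels where the frame is empty (zero intensity).
    return np.where(frame > 0, coarse_mask, 0)


frames = [np.ones((4, 4), dtype=np.uint8) for _ in range(3)]
frames[2][0, 0] = 0  # one pixel of the object region is empty in the last frame
init_mask = np.zeros((4, 4), dtype=np.int64)
init_mask[:2, :2] = 1  # a single object with id 1

masks = track_video(frames, init_mask, toy_propagate, toy_refine)
```

Passing the two stages as callables keeps the loop independent of any particular model, so a real segmenter and refiner could be swapped in without changing the control flow.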

Results

Task | Dataset | Metric | Value | Model
Video | YouTube-VOS 2019 | F-Measure (Seen) | 89.9 | DEVA
Video | YouTube-VOS 2019 | F-Measure (Unseen) | 89.1 | DEVA
Video | YouTube-VOS 2019 | Jaccard (Seen) | 85.4 | DEVA
Video | YouTube-VOS 2019 | Jaccard (Unseen) | 89.9 | DEVA
Video | YouTube-VOS 2019 | Overall | 86.2 | DEVA
Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Seen) | 89.9 | DEVA
Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Unseen) | 89.1 | DEVA
Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Seen) | 85.4 | DEVA
Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Unseen) | 89.9 | DEVA
Video Object Segmentation | YouTube-VOS 2019 | Overall | 86.2 | DEVA
Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Seen) | 89.9 | DEVA
Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Unseen) | 89.1 | DEVA
Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Seen) | 85.4 | DEVA
Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Unseen) | 89.9 | DEVA
Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | Overall | 86.2 | DEVA

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
MVA 2025 Small Multi-Object Tracking for Spotting Birds Challenge: Dataset, Methods, and Results (2025-07-17)
SAMST: A Transformer framework based on SAM pseudo label filtering for remote sensing semi-supervised semantic segmentation (2025-07-16)
YOLOv8-SMOT: An Efficient and Robust Framework for Real-Time Small Object Tracking via Slice-Assisted Training and Adaptive Association (2025-07-16)