Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

RPT: Learning Point Set Representation for Siamese Visual Tracking

Ziang Ma, Linyuan Wang, HaiTao Zhang, Wei Lu, Jun Yin

2020-08-08
Tasks: Semi-Supervised Video Object Segmentation, Visual Tracking

Abstract

While remarkable progress has been made in robust visual tracking, accurate target state estimation remains a highly challenging problem. In this paper, we argue that this issue is closely related to the prevalent bounding box representation, which provides only a coarse spatial extent of the object. Thus, an efficient visual tracking framework is proposed to accurately estimate the target state with a finer representation as a set of representative points. The point set is trained to indicate the semantically and geometrically significant positions of the target region, enabling more fine-grained localization and modeling of object appearance. We further propose a multi-level aggregation strategy to obtain detailed structure information by fusing hierarchical convolution layers. Extensive experiments on several challenging benchmarks including OTB2015, VOT2018, VOT2019 and GOT-10k demonstrate that our method achieves new state-of-the-art performance while running at over 20 FPS.
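
To make the abstract's contrast between a bounding box and a point set concrete, here is a minimal sketch in Python. It is not the authors' implementation: the min-max conversion from points to a box, the function name `points_to_box`, and the toy coordinates are all assumptions chosen for illustration. The sketch only shows that two differently shaped point sets can collapse to the same box, which is the sense in which a box gives a coarser description of the target.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def points_to_box(points: List[Point]) -> Box:
    """Return the tightest axis-aligned box around a point set.

    Assumption: a simple min-max rule, used here only for illustration.
    """
    if not points:
        raise ValueError("need at least one point")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


# Two hypothetical targets with different shapes.
diamond: List[Point] = [(2.0, 0.0), (4.0, 2.0), (2.0, 4.0), (0.0, 2.0)]
square: List[Point] = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]

# Both point sets reduce to the same box, so the box alone
# cannot distinguish the two shapes.
same_box = points_to_box(diamond) == points_to_box(square)
print(points_to_box(diamond), points_to_box(square), same_box)
```

The point sets differ while their boxes coincide, so any shape information has to come from the points themselves rather than from the enclosing box.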

Results

Task                                       Dataset  Metric           Value  Model
Video                                      VOT2020  EAO              0.53   RPT
Video                                      VOT2020  EAO (real-time)  0.29   RPT
Video Object Segmentation                  VOT2020  EAO              0.53   RPT
Video Object Segmentation                  VOT2020  EAO (real-time)  0.29   RPT
Semi-Supervised Video Object Segmentation  VOT2020  EAO              0.53   RPT
Semi-Supervised Video Object Segmentation  VOT2020  EAO (real-time)  0.29   RPT

Related Papers

What You Have is What You Track: Adaptive and Robust Multimodal Tracking (2025-07-08)
R1-Track: Direct Application of MLLMs to Visual Object Tracking via Reinforcement Learning (2025-06-27)
Exploiting Lightweight Hierarchical ViT and Dynamic Framework for Efficient Visual Tracking (2025-06-25)
Comparison of Two Methods for Stationary Incident Detection Based on Background Image (2025-06-17)
THU-Warwick Submission for EPIC-KITCHEN Challenge 2025: Semi-Supervised Video Object Segmentation (2025-06-07)
Towards Effective and Efficient Adversarial Defense with Diffusion Models for Robust Visual Tracking (2025-05-31)
TrackVLA: Embodied Visual Tracking in the Wild (2025-05-29)
CLDTracker: A Comprehensive Language Description for Visual Tracking (2025-05-29)