Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Implicit Motion Handling for Video Camouflaged Object Detection

Xuelian Cheng, Huan Xiong, Deng-Ping Fan, Yiran Zhong, Mehrtash Harandi, Tom Drummond, ZongYuan Ge

Published: 2022-03-14, CVPR 2022
Tasks: Motion Estimation, Camouflaged Object Segmentation, Segmentation, Semantic Segmentation, Object Detection

Abstract

We propose a new video camouflaged object detection (VCOD) framework that exploits both short-term dynamics and long-term temporal consistency to detect camouflaged objects in video frames. An essential property of camouflaged objects is that they usually exhibit patterns similar to the background, which makes them hard to identify in still images. Effectively handling temporal dynamics in videos is therefore key to the VCOD task, because camouflaged objects become noticeable when they move. However, current VCOD methods often rely on homography or optical flow to represent motion, so the detection error may accumulate from both the motion estimation error and the segmentation error. In contrast, our method unifies motion estimation and object segmentation within a single optimization framework. Specifically, we build a dense correlation volume to implicitly capture motion between neighbouring frames and use the final segmentation supervision to optimize the implicit motion estimation and the segmentation jointly. Furthermore, to enforce temporal consistency within a video sequence, we use a spatio-temporal transformer to refine the short-term predictions. Extensive experiments on VCOD benchmarks demonstrate the architectural effectiveness of our approach. We also provide a large-scale VCOD dataset named MoCA-Mask with pixel-level handcrafted ground-truth masks and construct a comprehensive VCOD benchmark with previous methods to facilitate research in this direction. Dataset link: https://xueliancheng.github.io/SLT-Net-project.
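The dense correlation volume mentioned in the abstract can be illustrated with a minimal NumPy sketch. It assumes an all-pairs dot-product similarity between the feature maps of two neighbouring frames, scaled by the square root of the channel count; the function name `correlation_volume`, the feature shapes, and the scaling are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def correlation_volume(feat_a, feat_b):
    """All-pairs dot-product similarity between two (C, H, W) feature maps.

    Hypothetical helper for illustration only. Returns an array of shape
    (H, W, H, W) where entry [i, j, k, l] compares location (i, j) in the
    first frame with location (k, l) in the second frame.
    """
    channels = feat_a.shape[0]
    # Contract over the channel axis; scale to keep magnitudes comparable.
    return np.einsum("cij,ckl->ijkl", feat_a, feat_b) / np.sqrt(channels)


# Random stand-ins for the features of frame t and frame t+1 (assumed shapes).
rng = np.random.default_rng(0)
feat_t = rng.standard_normal((8, 4, 5))
feat_t1 = rng.standard_normal((8, 4, 5))

corr = correlation_volume(feat_t, feat_t1)
print(corr.shape)  # (4, 5, 4, 5)
```

Because every location in one frame is compared with every location in the other, a volume like this encodes motion cues without an explicit optical-flow estimate, which is the sense in which the abstract calls the motion handling implicit.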

Results

The same results are listed under each of the following task labels: Object Detection, 3D, Camouflaged Object Segmentation, Object Segmentation, 2D Classification, 2D Object Detection, 16k.

| Dataset | Metric | Value | Model |
|---|---|---|---|
| MoCA-Mask | MAE | 0.027 | STL-Net-LT-PVTv2-B5 |
| MoCA-Mask | S-measure | 0.631 | STL-Net-LT-PVTv2-B5 |
| MoCA-Mask | mDice | 0.36 | STL-Net-LT-PVTv2-B5 |
| MoCA-Mask | mIoU | 0.272 | STL-Net-LT-PVTv2-B5 |
| MoCA-Mask | weighted F-measure | 0.311 | STL-Net-LT-PVTv2-B5 |
| Camouflaged Animal Dataset | MAE | 0.03 | STL-Net-LT-PVTv2-B5 |
| Camouflaged Animal Dataset | S-measure | 0.696 | STL-Net-LT-PVTv2-B5 |
| Camouflaged Animal Dataset | mDice | 0.493 | STL-Net-LT-PVTv2-B5 |
| Camouflaged Animal Dataset | mIoU | 0.402 | STL-Net-LT-PVTv2-B5 |
| Camouflaged Animal Dataset | weighted F-measure | 0.481 | STL-Net-LT-PVTv2-B5 |
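
Three of the reported metrics (MAE, Dice, IoU) have simple per-frame definitions, sketched below in NumPy for a single predicted mask. This is an illustration under stated assumptions: the helper names `mae` and `dice_iou`, the fixed 0.5 threshold, and the `eps` smoothing term are hypothetical choices, and the benchmark's own protocol for mDice and mIoU (thresholding and averaging over frames) may differ.

```python
import numpy as np


def mae(pred, gt):
    """Mean absolute error between a prediction map and a mask, both in [0, 1]."""
    return float(np.mean(np.abs(pred.astype(np.float64) - gt.astype(np.float64))))


def dice_iou(pred, gt, threshold=0.5, eps=1e-8):
    """Dice and IoU after binarising the prediction at an assumed threshold."""
    p = pred >= threshold
    g = gt >= 0.5
    inter = np.logical_and(p, g).sum()
    union = np.logical_or(p, g).sum()
    dice = (2.0 * inter + eps) / (p.sum() + g.sum() + eps)
    iou = (inter + eps) / (union + eps)
    return float(dice), float(iou)


# Toy 2x2 example: two pixels predicted as foreground, one of them correct.
pred = np.array([[1.0, 1.0], [0.0, 0.0]])
gt = np.array([[1.0, 0.0], [0.0, 0.0]])

print(mae(pred, gt))       # 0.25
print(dice_iou(pred, gt))  # approximately (0.667, 0.5)
```

Lower is better for MAE, while higher is better for Dice and IoU, which is worth keeping in mind when reading the results above.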
