Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

ObjectFormer for Image Manipulation Detection and Localization

Junke Wang, Zuxuan Wu, Jingjing Chen, Xintong Han, Abhinav Shrivastava, Ser-Nam Lim, Yu-Gang Jiang

2022-03-28 | CVPR 2022 | Image Manipulation Localization, Image Manipulation, Image Manipulation Detection

Abstract

Recent advances in image editing techniques have posed serious challenges to the trustworthiness of multimedia data, which drives the research of image tampering detection. In this paper, we propose ObjectFormer to detect and localize image manipulations. To capture subtle manipulation traces that are no longer visible in the RGB domain, we extract high-frequency features of the images and combine them with RGB features as multimodal patch embeddings. Additionally, we use a set of learnable object prototypes as mid-level representations to model the object-level consistencies among different regions, which are further used to refine patch embeddings to capture the patch-level consistencies. We conduct extensive experiments on various datasets and the results verify the effectiveness of the proposed method, outperforming state-of-the-art tampering detection and localization methods.
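
As a rough illustration of what "high-frequency features" can mean, the sketch below removes the low-frequency content of a grayscale image in the Fourier domain and keeps the residual. The abstract does not say which transform or cutoff ObjectFormer uses, so the FFT-based filter, the function name `high_frequency_component`, and the `cutoff` parameter are all assumptions made for this example and not the authors' implementation.

```python
import numpy as np


def high_frequency_component(gray, cutoff=0.1):
    """Return the high-frequency part of a 2-D grayscale image.

    `cutoff` is a hypothetical parameter (in cycles per pixel): every
    frequency whose radius is below it is zeroed before inverting.
    """
    gray = np.asarray(gray, dtype=np.float64)
    spectrum = np.fft.fft2(gray)

    # Frequency coordinates for each spectrum entry, in cycles per pixel.
    fy = np.fft.fftfreq(gray.shape[0])[:, None]
    fx = np.fft.fftfreq(gray.shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    # Zero out the low-frequency band (including the mean/DC term).
    spectrum[radius < cutoff] = 0.0

    # The mask is symmetric, so the inverse is real up to rounding error.
    return np.real(np.fft.ifft2(spectrum))


if __name__ == "__main__":
    flat = np.full((8, 8), 5.0)              # no detail at all
    rows, cols = np.indices((8, 8))
    board = ((rows + cols) % 2).astype(float)  # finest possible detail

    print(np.abs(high_frequency_component(flat)).max())
    print(np.allclose(high_frequency_component(board), board - 0.5))
```

A flat image carries no high-frequency content, so its residual is zero, while a checkerboard survives the filter with only its mean removed. In a model along the lines the abstract describes, such a residual would be embedded per patch and combined with the RGB patch features.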

Results

Task | Dataset | Metric | Value | Model
Image Manipulation Localization | Columbia (Protocol-CAT) | Pixel Binary F1 | 0.732 | ObjectFormer
Image Manipulation Localization | NIST16 (Protocol-CAT) | Pixel Binary F1 | 0.252 | ObjectFormer
Image Manipulation Localization | CASIAv1 (Protocol-CAT) | Pixel Binary F1 | 0.531 | ObjectFormer
Image Manipulation Localization | COVERAGE (Protocol-CAT) | Pixel Binary F1 | 0.257 | ObjectFormer
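
For readers unfamiliar with the metric, a pixel-level binary F1 score compares a predicted manipulation mask with the ground-truth mask pixel by pixel, using F1 = 2·TP / (2·TP + FP + FN). The sketch below is a minimal version of that formula. The page does not state the binarization threshold or how scores are aggregated across images under Protocol-CAT, so the 0.5 threshold, the zero score for empty masks, and the function name `pixel_binary_f1` are assumptions for illustration only.

```python
def pixel_binary_f1(pred_mask, true_mask, threshold=0.5):
    """Pixel-level binary F1 between a predicted and a ground-truth mask.

    Both masks are equally sized 2-D sequences (lists of rows).
    `threshold` (assumed 0.5 here) binarizes the predicted scores.
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for score, label in zip(pred_row, true_row):
            predicted = score >= threshold
            actual = bool(label)
            if predicted and actual:
                tp += 1          # manipulated pixel correctly flagged
            elif predicted:
                fp += 1          # authentic pixel wrongly flagged
            elif actual:
                fn += 1          # manipulated pixel missed
    denominator = 2 * tp + fp + fn
    # Assumed convention: score 0.0 when neither mask has positive pixels.
    return 2 * tp / denominator if denominator else 0.0


if __name__ == "__main__":
    predicted = [[0.9, 0.2],
                 [0.7, 0.1]]
    ground_truth = [[1, 0],
                    [0, 1]]
    # One true positive, one false positive, one false negative.
    print(pixel_binary_f1(predicted, ground_truth))  # 0.5
```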

Related Papers

Beyond Fully Supervised Pixel Annotations: Scribble-Driven Weakly-Supervised Framework for Image Manipulation Localization (2025-07-17)
Towards Reliable Identification of Diffusion-based Image Manipulations (2025-06-05)
UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation (2025-06-03)
Weakly-supervised Localization of Manipulated Image Regions Using Multi-resolution Learned Features (2025-05-29)
RBench-V: A Primary Assessment for Visual Reasoning Models with Multi-modal Outputs (2025-05-22)
My Face Is Mine, Not Yours: Facial Protection Against Diffusion Model Face Swapping (2025-05-21)
Visual Agentic Reinforcement Fine-Tuning (2025-05-20)
Emerging Properties in Unified Multimodal Pretraining (2025-05-20)