Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Leveraging Hallucinations to Reduce Manual Prompt Dependency in Promptable Segmentation

Jian Hu, Jiayi Lin, Junchi Yan, Shaogang Gong

Date: 2024-08-27
Tasks: Camouflaged Object Segmentation, Segmentation, Medical Image Segmentation, Camouflaged Object Segmentation with a Single Task-generic Prompt

Abstract

Promptable segmentation typically requires instance-specific manual prompts to guide the segmentation of each desired object. To minimize such a need, task-generic promptable segmentation has been introduced, which employs a single task-generic prompt to segment various images of different objects in the same task. Current methods use Multimodal Large Language Models (MLLMs) to reason detailed instance-specific prompts from a task-generic prompt for improving segmentation accuracy. The effectiveness of this segmentation heavily depends on the precision of these derived prompts. However, MLLMs often suffer from hallucinations during reasoning, resulting in inaccurate prompting. While existing methods focus on eliminating hallucinations to improve a model, we argue that MLLM hallucinations can reveal valuable contextual insights when leveraged correctly, as they represent pre-trained large-scale knowledge beyond individual images. In this paper, we utilize hallucinations to mine task-related information from images and verify its accuracy to enhance the precision of the generated prompts. Specifically, we introduce an iterative Prompt-Mask Cycle generation framework (ProMaC) with a prompt generator and a mask generator. The prompt generator uses multi-scale chain-of-thought prompting, initially exploring hallucinations to extract extended contextual knowledge on a test image. These hallucinations are then reduced to formulate precise instance-specific prompts, directing the mask generator to produce masks that are consistent with task semantics by mask semantic alignment. The generated masks iteratively induce the prompt generator to focus more on task-relevant image areas and reduce irrelevant hallucinations, resulting jointly in better prompts and masks. Experiments on 5 benchmarks demonstrate the effectiveness of ProMaC. Code is available at https://lwpyh.github.io/ProMaC/.
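The alternation between prompt generator and mask generator described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the MLLM and the segmentation model are replaced by toy stand-ins, and every name here (`prompt_mask_cycle`, `InstancePrompt`, `toy_prompt_generator`, `toy_mask_generator`, the default `iterations=3`) is a hypothetical choice made only to show the control flow, in which each mask narrows the region the next prompt is derived from.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Image = List[List[float]]  # toy grayscale image, values in [0, 1]
Mask = List[List[int]]     # binary mask with the same shape as the image


@dataclass
class InstancePrompt:
    """Hypothetical instance-specific prompt: a text hint plus a point."""
    text: str
    point: Tuple[int, int]


PromptGenerator = Callable[[Image, str, Optional[Mask]], InstancePrompt]
MaskGenerator = Callable[[Image, InstancePrompt], Mask]


def prompt_mask_cycle(
    image: Image,
    task_prompt: str,
    prompt_generator: PromptGenerator,
    mask_generator: MaskGenerator,
    iterations: int = 3,  # assumed value, not taken from the paper
) -> Tuple[InstancePrompt, Mask]:
    """Alternate prompt and mask generation; each mask guides the next prompt."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    mask: Optional[Mask] = None
    for _ in range(iterations):
        prompt = prompt_generator(image, task_prompt, mask)
        mask = mask_generator(image, prompt)
    return prompt, mask


def toy_prompt_generator(
    image: Image, task_prompt: str, mask: Optional[Mask]
) -> InstancePrompt:
    """Stand-in for the MLLM: pick the brightest pixel inside the current mask."""
    pixels = [
        (value, r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
    ]
    inside = [p for p in pixels if mask is None or mask[p[1]][p[2]]]
    _, r, c = max(inside or pixels)  # fall back to the whole image if mask is empty
    return InstancePrompt(f"{task_prompt} near ({r}, {c})", (r, c))


def toy_mask_generator(image: Image, prompt: InstancePrompt) -> Mask:
    """Stand-in for the segmenter: keep pixels at least half as bright as the point."""
    r, c = prompt.point
    threshold = 0.5 * image[r][c]
    return [[1 if value >= threshold else 0 for value in row] for row in image]


example_image = [
    [0.1, 0.2, 0.1],
    [0.2, 0.9, 0.8],
    [0.1, 0.7, 0.2],
]
final_prompt, final_mask = prompt_mask_cycle(
    example_image, "camouflaged animal", toy_prompt_generator, toy_mask_generator
)
print(final_prompt.text)
print(final_mask)
```

Passing the two generators in as callables keeps the loop independent of any particular model; a real system would presumably plug in an MLLM-backed prompt generator and a promptable segmenter in place of the toy functions.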

Results

Listed under tasks: Object Detection, 3D, Camouflaged Object Segmentation, Object Segmentation, 2D Classification, 2D Object Detection, 16k (identical values under each task)

| Model | Dataset | $E_{\phi}$ | $F_{\beta}$ | MAE | $S_{\alpha}$ |
|---|---|---|---|---|---|
| ProMaC | CAMO | 0.846 | 0.725 | 0.09 | 0.767 |
| ProMaC | COD10K | 0.876 | 0.716 | 0.042 | 0.805 |
| ProMaC | Chameleon | 0.899 | 0.79 | 0.044 | 0.833 |
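Of the four metrics reported, MAE is the simplest to reproduce: it is commonly computed as the mean absolute per-pixel difference between a predicted map and the ground-truth mask, both scaled to [0, 1], so lower is better. The sketch below assumes that standard definition; the function name `mean_absolute_error` and the example arrays are hypothetical, and the page does not state the exact evaluation code behind the numbers above.

```python
from typing import List, Sequence


def mean_absolute_error(
    prediction: Sequence[Sequence[float]],
    ground_truth: Sequence[Sequence[float]],
) -> float:
    """Mean absolute per-pixel difference between two maps with values in [0, 1]."""
    if len(prediction) != len(ground_truth) or any(
        len(p_row) != len(g_row) for p_row, g_row in zip(prediction, ground_truth)
    ):
        raise ValueError("prediction and ground truth must have the same shape")
    total = 0.0
    count = 0
    for p_row, g_row in zip(prediction, ground_truth):
        for p, g in zip(p_row, g_row):
            total += abs(p - g)
            count += 1
    if count == 0:
        raise ValueError("maps must contain at least one pixel")
    return total / count


# Hypothetical 2x2 example: per-pixel errors are 0.1, 0.1, 0.2, 0.2.
example_pred: List[List[float]] = [[0.9, 0.1], [0.2, 0.8]]
example_gt: List[List[float]] = [[1.0, 0.0], [0.0, 1.0]]
example_mae = mean_absolute_error(example_pred, example_gt)
print(round(example_mae, 3))
```

A dataset-level score would typically average this per-image value over all test images.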

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)