FCoT

Foreground Chain-of-Thought

Images, Texts. Introduced 2025-06-27

FCoT (Chain‑of‑Thought Segmentation) is designed to replicate the step-by-step reasoning process a human annotator follows when using SAM2 to generate masks. Each example pairs an image with:

  • A bounding box locating the target object,

  • A sequence of foreground/background point prompts for refining the mask,

  • Natural language explanations (chain‑of‑thought) generated by Gemini‑2.5‑Pro summarizing the annotation process.
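To make the three components concrete, here is a minimal sketch of how one example might be represented in Python. The class names, field names, file path, coordinates, and explanation text are all hypothetical and chosen for illustration; the dataset's actual schema and file format are not described here. The 1/0 encoding for foreground/background points follows the convention commonly used for SAM-style point prompts.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PointPrompt:
    """One click on the image (hypothetical structure)."""
    x: float
    y: float
    is_foreground: bool  # True = foreground click, False = background click


@dataclass
class FCoTExample:
    """One image paired with its box, point prompts, and explanation.

    Field names are assumptions, not the dataset's real schema.
    """
    image_path: str
    box_xyxy: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
    points: List[PointPrompt] = field(default_factory=list)
    explanation: str = ""  # chain-of-thought text summarizing the annotation

    def point_arrays(self) -> Tuple[List[List[float]], List[int]]:
        """Split the ordered prompts into coordinates and 1/0 labels."""
        coords = [[p.x, p.y] for p in self.points]
        labels = [1 if p.is_foreground else 0 for p in self.points]
        return coords, labels


# Placeholder values only; not taken from the dataset.
example = FCoTExample(
    image_path="images/example.jpg",
    box_xyxy=(40.0, 60.0, 220.0, 310.0),
    points=[
        PointPrompt(130.0, 180.0, True),   # click inside the object
        PointPrompt(50.0, 70.0, False),    # click on background to trim the mask
    ],
    explanation="(placeholder for the generated explanation)",
)

coords, labels = example.point_arrays()
print(coords, labels)
```

Keeping the points as an ordered list preserves the refinement sequence, so each click can be replayed in the order the annotator made it.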