Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Location-Sensitive Visual Recognition with Cross-IOU Loss

Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian

2021-04-11 | 2D Human Pose Estimation, Semantic Segmentation, Pose Estimation, Instance Segmentation, Object Detection

Abstract

Object detection, instance segmentation, and pose estimation are popular visual recognition tasks that require localizing the object by internal or boundary landmarks. This paper summarizes these tasks as location-sensitive visual recognition and proposes a unified solution named location-sensitive network (LSNet). Built on a deep neural network backbone, LSNet predicts an anchor point and a set of landmarks which together define the shape of the target object. The key to optimizing LSNet lies in the ability to fit various scales, for which we design a novel loss function named cross-IOU loss that computes the cross-IOU of each anchor point-landmark pair to approximate the global IOU between the prediction and ground-truth. The flexibly located and accurately predicted landmarks also enable LSNet to incorporate richer contextual information for visual recognition. Evaluated on the MS-COCO dataset, LSNet sets a new state-of-the-art accuracy for anchor-free object detection (53.5% box AP) and instance segmentation (40.2% mask AP), and shows promising performance in detecting multi-scale human poses. Code is available at https://github.com/Duankaiwen/LSNet
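
The abstract names the cross-IOU loss but does not give its formula, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes each anchor-to-landmark offset is split into four non-negative directional components, that the direction opposite to the offset receives a small fraction `alpha` of its magnitude, and that the per-pair score is the sum of component-wise minima divided by the sum of component-wise maxima. The names `decompose`, `cross_iou`, `cross_iou_loss` and the value `alpha = 0.2` are hypothetical choices made for this example; the official repository is the reference for the actual definition.

```python
from typing import Sequence, Tuple

Offset = Tuple[float, float]  # (dx, dy) from the anchor point to one landmark

ALPHA = 0.2  # assumed weight for the direction an offset does NOT point in


def decompose(vx: float, vy: float, alpha: float = ALPHA) -> Tuple[float, float, float, float]:
    """Split an offset into four non-negative parts: (+x, -x, +y, -y).

    The direction the offset points in keeps its full magnitude; the
    opposite direction gets alpha times that magnitude (an assumption).
    """
    ax, ay = abs(vx), abs(vy)
    pos_x, neg_x = (ax, alpha * ax) if vx >= 0 else (alpha * ax, ax)
    pos_y, neg_y = (ay, alpha * ay) if vy >= 0 else (alpha * ay, ay)
    return pos_x, neg_x, pos_y, neg_y


def cross_iou(pred: Offset, target: Offset, eps: float = 1e-9) -> float:
    """IoU-like score for one anchor-landmark pair (close to 1 when they match)."""
    p = decompose(*pred)
    t = decompose(*target)
    inter = sum(min(a, b) for a, b in zip(p, t))
    union = sum(max(a, b) for a, b in zip(p, t))
    return inter / (union + eps)  # eps avoids division by zero for zero offsets


def cross_iou_loss(preds: Sequence[Offset], targets: Sequence[Offset]) -> float:
    """Mean of (1 - cross_iou) over all anchor-landmark pairs."""
    pairs = list(zip(preds, targets))
    return sum(1.0 - cross_iou(p, t) for p, t in pairs) / len(pairs)


if __name__ == "__main__":
    # The score depends on the ratio of offsets, not their absolute size.
    print(round(cross_iou((5.0, 0.0), (10.0, 0.0)), 6))
    print(round(cross_iou((50.0, 0.0), (100.0, 0.0)), 6))
```

In this sketch a prediction that covers half the target offset scores about 0.5 whether the object is small or large, which is one way to read the abstract's emphasis on fitting various scales: a plain L1 penalty on the same two cases would differ by a factor of ten.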

Results

Task              Dataset        Metric   Value  Model
Object Detection  COCO test-dev  AP50     71.1   LSNet (Res2Net-101 + DCN, multi-scale)
Object Detection  COCO test-dev  AP75     59.2   LSNet (Res2Net-101 + DCN, multi-scale)
Object Detection  COCO test-dev  APL      65.8   LSNet (Res2Net-101 + DCN, multi-scale)
Object Detection  COCO test-dev  APM      56.4   LSNet (Res2Net-101 + DCN, multi-scale)
Object Detection  COCO test-dev  APS      35.2   LSNet (Res2Net-101 + DCN, multi-scale)
Object Detection  COCO test-dev  box mAP  53.5   LSNet (Res2Net-101 + DCN, multi-scale)

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)