Location-Sensitive Visual Recognition with Cross-IOU Loss

Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian

2021-04-11

2D Human Pose Estimation, Semantic Segmentation, Pose Estimation, Instance Segmentation, Object Detection


Abstract

Object detection, instance segmentation, and pose estimation are popular visual recognition tasks that require localizing the object by internal or boundary landmarks. This paper summarizes these tasks as location-sensitive visual recognition and proposes a unified solution named location-sensitive network (LSNet). Based on a deep neural network as the backbone, LSNet predicts an anchor point and a set of landmarks which together define the shape of the target object. The key to optimizing LSNet lies in the ability to fit various scales, for which we design a novel loss function named cross-IOU loss that computes the cross-IOU of each anchor point-landmark pair to approximate the global IOU between the prediction and ground-truth. The flexibly located and accurately predicted landmarks also enable LSNet to incorporate richer contextual information for visual recognition. Evaluated on the MS-COCO dataset, LSNet sets a new state of the art for anchor-free object detection (53.5% box AP) and instance segmentation (40.2% mask AP), and shows promising performance in detecting multi-scale human poses. Code is available at https://github.com/Duankaiwen/LSNet.
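
The abstract does not spell out the cross-IOU formula, so the following is only a minimal sketch of one plausible reading, not the official LSNet implementation. It assumes each anchor-to-landmark offset (dx, dy) is expanded into four non-negative half-axis components, that the component pointing away from the offset direction is a small fraction `alpha` of the main one, and that the per-pair cross-IOU is the ratio of summed element-wise minima to summed element-wise maxima. The function names, the `alpha` default, and the `eps` constant are all assumptions made for illustration.

```python
def to_half_axes(dx, dy, alpha=0.2):
    """Expand a 2D offset into four non-negative parts: (x+, x-, y+, y-).

    Assumption: the half-axis opposite to the offset direction receives
    alpha times the magnitude, so sign mismatches are still penalized.
    """
    ax, ay = abs(dx), abs(dy)
    x_pos, x_neg = (ax, alpha * ax) if dx >= 0 else (alpha * ax, ax)
    y_pos, y_neg = (ay, alpha * ay) if dy >= 0 else (alpha * ay, ay)
    return (x_pos, x_neg, y_pos, y_neg)


def cross_iou(pred, target, alpha=0.2, eps=1e-9):
    """Cross-IOU of one predicted and one ground-truth (dx, dy) offset."""
    p = to_half_axes(pred[0], pred[1], alpha)
    g = to_half_axes(target[0], target[1], alpha)
    inter = sum(min(a, b) for a, b in zip(p, g))
    union = sum(max(a, b) for a, b in zip(p, g))
    return inter / (union + eps)


def cross_iou_loss(pred_offsets, target_offsets, alpha=0.2):
    """One minus the mean cross-IOU over all anchor point-landmark pairs."""
    ious = [cross_iou(p, g, alpha) for p, g in zip(pred_offsets, target_offsets)]
    return 1.0 - sum(ious) / len(ious)


if __name__ == "__main__":
    # Hypothetical offsets from one anchor point to three landmarks.
    predicted = [(1.0, 2.0), (-3.0, 0.5), (4.0, -4.0)]
    ground_truth = [(2.0, 4.0), (-3.0, 0.5), (4.0, -2.0)]
    print(cross_iou_loss(predicted, ground_truth))
```

Because the ratio is unchanged when both offsets are multiplied by the same positive factor, a ratio-based loss of this kind weights small and large objects comparably, which matches the abstract's emphasis on fitting various scales.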

Results

Task	Dataset	Metric	Value	Model
Object Detection	COCO test-dev	AP50	71.1	LSNet (Res2Net-101+ DCN, multi-scale)
Object Detection	COCO test-dev	AP75	59.2	LSNet (Res2Net-101+ DCN, multi-scale)
Object Detection	COCO test-dev	APL	65.8	LSNet (Res2Net-101+ DCN, multi-scale)
Object Detection	COCO test-dev	APM	56.4	LSNet (Res2Net-101+ DCN, multi-scale)
Object Detection	COCO test-dev	APS	35.2	LSNet (Res2Net-101+ DCN, multi-scale)
Object Detection	COCO test-dev	box mAP	53.5	LSNet (Res2Net-101+ DCN, multi-scale)
