YuXuan Li, Xiang Li, Yimian Dai, Qibin Hou, Li Liu, Yongxiang Liu, Ming-Ming Cheng, Jian Yang
Remote sensing images pose distinct challenges for downstream tasks due to their inherent complexity. While a considerable amount of research has been dedicated to remote sensing classification, object detection and semantic segmentation, most of these studies have overlooked the valuable prior knowledge embedded within remote sensing scenarios. This prior knowledge matters because objects in remote sensing images can be misrecognized without sufficiently long-range context, and the range required varies across object types. This paper considers these priors and proposes a lightweight Large Selective Kernel Network (LSKNet) backbone. LSKNet can dynamically adjust its large spatial receptive field to better model the varying range of context required by different objects in remote sensing scenarios. To our knowledge, large and selective kernel mechanisms have not been previously explored in remote sensing images. Without bells and whistles, our lightweight LSKNet sets new state-of-the-art scores on standard remote sensing classification, object detection and semantic segmentation benchmarks. Our comprehensive analysis further validates the significance of the identified priors and the effectiveness of LSKNet. The code is available at https://github.com/zcablii/LSKNet.
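
To make the idea of a "large spatial receptive field" concrete, the sketch below computes how the receptive field grows when stride-1 convolutions are stacked, using the standard rule that each layer adds `(kernel_size - 1) * dilation` pixels. The helper name `receptive_field` and the kernel/dilation pairs are illustrative assumptions; the abstract does not state the kernel configuration, so this is not a description of the released LSKNet code.

```python
def receptive_field(layers):
    """Receptive field (in pixels) of stacked stride-1 convolutions.

    layers: iterable of (kernel_size, dilation) pairs, applied in order.
    """
    rf = 1  # a single pixel sees only itself before any convolution
    for kernel_size, dilation in layers:
        # Each stride-1 layer widens the field by (k - 1) * d pixels.
        rf += (kernel_size - 1) * dilation
    return rf


# Hypothetical sequence of (kernel_size, dilation) pairs, chosen for illustration.
layers = [(5, 1), (7, 3)]

# Receptive field after each successive layer; a selective mechanism could
# weight features taken at these different context ranges per location.
rf_per_stage = [receptive_field(layers[: i + 1]) for i in range(len(layers))]
print(rf_per_stage)
```
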
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | ISPRS Vaihingen | Average F1 | 91.8 | LSKNet-S |
| Semantic Segmentation | ISPRS Vaihingen | Category mIoU | 85.1 | LSKNet-S |
| Semantic Segmentation | ISPRS Vaihingen | Overall Accuracy | 93.6 | LSKNet-S |
| Semantic Segmentation | ISPRS Vaihingen | Average F1 | 91.7 | LSKNet-T |
| Semantic Segmentation | ISPRS Vaihingen | Category mIoU | 84.9 | LSKNet-T |
| Semantic Segmentation | ISPRS Vaihingen | Overall Accuracy | 93.6 | LSKNet-T |
| Semantic Segmentation | ISPRS Potsdam | Mean F1 | 93.1 | LSKNet-S |
| Semantic Segmentation | ISPRS Potsdam | Mean IoU | 87.2 | LSKNet-S |
| Semantic Segmentation | ISPRS Potsdam | Overall Accuracy | 92 | LSKNet-S |
| Semantic Segmentation | UAVid | Mean IoU | 70 | LSKNet-S |
| Semantic Segmentation | UAVid | Mean IoU | 69.3 | LSKNet-T |
| Change Detection | LEVIR-CD | F1 | 92.27 | LSKNet |
| Change Detection | LEVIR-CD | IoU | 85.65 | LSKNet |
| Change Detection | LEVIR-CD | Precision | 93.34 | LSKNet |
| Change Detection | LEVIR-CD | Recall | 91.23 | LSKNet |
| Change Detection | S2Looking | F1-Score | 67.52 | LSKNet-S |
| Change Detection | S2Looking | IoU | 50.96 | LSKNet-S |
| Change Detection | S2Looking | Precision | 71.9 | LSKNet-S |
| Change Detection | S2Looking | Recall | 63.64 | LSKNet-S |
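
The change detection rows can be cross-checked against each other, since F1 is the harmonic mean of precision and recall and, for a single positive class, IoU = F1 / (2 - F1). The sketch below assumes the reported values are percentages computed for the change class from pooled pixel counts; the function names are illustrative and not taken from the LSKNet repository.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (inputs and output in percent)."""
    return 2 * precision * recall / (precision + recall)


def iou_from_f1(f1):
    """Single-class IoU implied by F1 (input and output in percent)."""
    f = f1 / 100.0
    return 100.0 * f / (2.0 - f)


# Precision and recall (percent) as listed in the table above.
reported = {
    "LEVIR-CD": (93.34, 91.23),
    "S2Looking": (71.9, 63.64),
}

for dataset, (precision, recall) in reported.items():
    f1 = f1_from_pr(precision, recall)
    iou = iou_from_f1(f1)
    print(f"{dataset}: F1={f1:.2f} IoU={iou:.2f}")
```

Under these assumptions the derived values agree with the tabulated F1 and IoU figures to two decimal places.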