Revisiting the Encoding of Satellite Image Time Series

Xin Cai, Yaxin Bi, Peter Nicholl, Roy Sterritt

2023-05-03

Panoptic Segmentation, Representation Learning, Segmentation, Semantic Segmentation, Time Series, Object Detection, Image Segmentation


Abstract

Satellite Image Time Series (SITS) representation learning is complex due to high spatiotemporal resolutions, irregular acquisition times, and intricate spatiotemporal interactions. These challenges have led to specialized neural network architectures tailored for SITS analysis. Pioneering work in the field has produced promising results, but transferring the latest advances or established paradigms from Computer Vision (CV) to SITS remains highly challenging because the existing representation learning framework is suboptimal. In this paper, we develop a novel perspective of SITS processing as a direct set prediction problem, inspired by the recent trend of adopting query-based transformer decoders to streamline the object detection or image segmentation pipeline. We further propose to decompose the representation learning process of SITS into three explicit steps: collect-update-distribute, which is computationally efficient and well suited to irregularly sampled and asynchronous temporal satellite observations. Facilitated by this reformulation, our proposed temporal learning backbone for SITS, initially pre-trained on the resource-efficient pixel-set format and then fine-tuned on the downstream dense prediction tasks, attains new state-of-the-art (SOTA) results on the PASTIS benchmark dataset. Specifically, the clear separation between temporal and spatial components in the semantic/panoptic segmentation pipeline of SITS lets us leverage the latest advances in CV, such as the universal image segmentation architecture, resulting in a 2.5-point increase in mIoU and an 8.8-point increase in PQ, respectively, compared to the best scores reported so far.
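
The abstract names the three steps but does not spell them out, so the following is only one plausible reading, not the authors' implementation: a small set of latent query vectors collects information from the observations of one pixel via cross-attention, is updated on its own, and is then read back by every observation. Everything here is an assumption made for illustration: the function names (`attend`, `collect_update_distribute`), the single-head attention, the `tanh` update, the residual connections, and the boolean `valid` mask used to stand in for irregular or missing acquisitions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(queries, keys, values, key_mask=None):
    """Single-head scaled dot-product attention.

    key_mask is a boolean vector, True where a key is a real observation.
    """
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    if key_mask is not None:
        # Masked keys get a very low score, so their softmax weight is zero.
        scores = np.where(key_mask[None, :], scores, -1e9)
    return softmax(scores, axis=-1) @ values

def collect_update_distribute(tokens, valid, latents, w_update):
    """Hypothetical collect-update-distribute step for one pixel.

    tokens:   (T, D) embeddings of T acquisitions (may include padding)
    valid:    (T,)   True for real acquisitions, False for padding
    latents:  (K, D) latent queries, with K much smaller than T
    w_update: (D, D) weights of the illustrative update layer
    """
    # Collect: latents query the valid observations only.
    collected = latents + attend(latents, tokens, tokens, key_mask=valid)
    # Update: latents are refined without touching the time series.
    updated = collected + np.tanh(collected @ w_update)
    # Distribute: every observation reads from the updated latents.
    out = tokens + attend(tokens, updated, updated)
    return out, updated

rng = np.random.default_rng(0)
T, K, D = 7, 4, 16
tokens = rng.normal(size=(T, D))
valid = np.array([True] * 5 + [False] * 2)   # last two steps are padding
latents = rng.normal(size=(K, D))
w_update = rng.normal(size=(D, D)) * 0.1

out, updated = collect_update_distribute(tokens, valid, latents, w_update)
print(out.shape, updated.shape)
```

In this sketch both attention calls compare T observations against K latents, so the cost grows with T·K rather than T², and padded time steps never influence the latents because they are masked out in the collect step.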

Results

Task	Dataset	Metric	Value	Model
Semantic Segmentation	PASTIS	Mean IoU (test)	67.9	Exchanger+Mask2Former
Semantic Segmentation	PASTIS	Mean IoU (test)	66.8	Exchanger+Unet
Panoptic Segmentation	PASTIS	PQ	52.6	Exchanger+Mask2Former
Panoptic Segmentation	PASTIS	RQ	61.6	Exchanger+Mask2Former
Panoptic Segmentation	PASTIS	SQ	84.6	Exchanger+Mask2Former
Panoptic Segmentation	PASTIS	PQ	47.8	Exchanger+Unet+PaPs
Panoptic Segmentation	PASTIS	RQ	58.9	Exchanger+Unet+PaPs
Panoptic Segmentation	PASTIS	SQ	80.3	Exchanger+Unet+PaPs
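
For readers unfamiliar with the metrics in the table, the sketch below shows the standard definitions for a single class: IoU from a confusion matrix, and PQ as the product of segmentation quality (SQ) and recognition quality (RQ). The function names and the toy inputs are made up for illustration; they are not taken from the paper or the PASTIS evaluation code.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix (rows: truth, cols: prediction)."""
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # truth c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, truth other
        union = tp + fp + fn
        if union > 0:                                     # skip classes that never occur
            ious.append(tp / union)
    return sum(ious) / len(ious)

def panoptic_quality(matched_ious, num_fp, num_fn):
    """PQ, SQ and RQ for one class.

    matched_ious: IoU of each predicted/true segment pair counted as a match
                  (conventionally those with IoU > 0.5).
    num_fp:       predicted segments with no match.
    num_fn:       true segments with no match.
    """
    tp = len(matched_ious)
    denom = tp + 0.5 * num_fp + 0.5 * num_fn
    if denom == 0:
        return 0.0, 0.0, 0.0
    sq = sum(matched_ious) / tp if tp else 0.0  # mean IoU of matched segments
    rq = tp / denom                             # F1-style detection score
    return sq * rq, sq, rq

# Toy inputs, not data from the paper.
miou = mean_iou([[5, 1], [2, 4]])
pq, sq, rq = panoptic_quality([0.9, 0.8, 0.7], num_fp=1, num_fn=1)
print(round(miou, 3), round(pq, 3), round(sq, 3), round(rq, 3))
```

Within one class PQ equals SQ × RQ exactly. The table values do not satisfy this identity (for Exchanger+Mask2Former, 84.6 × 61.6 / 100 ≈ 52.1 rather than 52.6), which would be consistent with each metric being averaged over classes separately, although this page does not state how the averages were taken.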
