
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


FreDSNet: Joint Monocular Depth and Semantic Segmentation with Fast Fourier Convolutions

Bruno Berenguel-Baeta, Jesus Bermudez-Cameo, Jose J. Guerrero

2022-10-04
Scene Understanding, Segmentation, Semantic Segmentation, Depth Estimation, Monocular Depth Estimation

Abstract

In this work we present FreDSNet, a deep learning solution that obtains semantic 3D understanding of indoor environments from single panoramas. Omnidirectional images offer task-specific advantages for scene understanding because they provide 360-degree contextual information about the entire environment. However, the inherent characteristics of omnidirectional images make accurate object detection and segmentation, as well as good depth estimation, more difficult. To overcome these problems, we exploit convolutions in the frequency domain, obtaining a wider receptive field in each convolutional layer. These convolutions make it possible to leverage the whole context information from omnidirectional images. FreDSNet is the first network that jointly provides monocular depth estimation and semantic segmentation from a single panoramic image by exploiting fast Fourier convolutions. Our experiments show that FreDSNet performs similarly to task-specific state-of-the-art methods for semantic segmentation and depth estimation. The FreDSNet code is publicly available at https://github.com/Sbrunoberenguel/FreDSNet
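To illustrate why a convolution in the frequency domain widens the receptive field, here is a minimal NumPy sketch of a spectral branch in the spirit of fast Fourier convolutions: take a real 2D FFT, mix the stacked real and imaginary parts with a pointwise (1x1) layer and a ReLU, then transform back. This is not the FreDSNet implementation. The function name `spectral_transform`, the random weights, and the tensor sizes are assumptions made for the example, and a full fast Fourier convolution would typically also include a local convolutional branch and normalization.

```python
import numpy as np


def spectral_transform(x, weight):
    """Pointwise mixing in the frequency domain (illustrative sketch only).

    x:      real feature map of shape (C_in, H, W)
    weight: real matrix of shape (2 * C_out, 2 * C_in), acting as a 1x1 conv
            over the stacked real and imaginary parts of the spectrum
    """
    _, h, w = x.shape
    # Real 2D FFT over the spatial axes: shape (C_in, H, W // 2 + 1), complex.
    spec = np.fft.rfft2(x, axes=(-2, -1))
    # Stack real and imaginary parts along the channel axis.
    feats = np.concatenate([spec.real, spec.imag], axis=0)
    # 1x1 convolution (a per-frequency linear map) followed by ReLU.
    mixed = np.maximum(np.einsum("oc,chw->ohw", weight, feats), 0.0)
    # Split back into real and imaginary halves and return to the image domain.
    c_out = weight.shape[0] // 2
    out_spec = mixed[:c_out] + 1j * mixed[c_out:]
    return np.fft.irfft2(out_spec, s=(h, w), axes=(-2, -1))


rng = np.random.default_rng(0)
x = rng.standard_normal((3, 16, 32))          # hypothetical 3-channel feature map
weight = rng.standard_normal((2 * 4, 2 * 3))  # hypothetical weights, 4 output channels
y = spectral_transform(x, weight)

# Perturb a single input pixel and measure how much of the output changes.
x_perturbed = x.copy()
x_perturbed[0, 2, 5] += 1.0
y_perturbed = spectral_transform(x_perturbed, weight)
changed = np.abs(y_perturbed - y) > 1e-9
print(y.shape, changed.mean())
```

Because every frequency coefficient depends on every input pixel, a change at one location spreads across the whole output after a single layer, whereas a 3x3 spatial convolution would only affect a 3x3 neighbourhood. That global reach is the property the abstract refers to as a wider receptive field.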

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Depth Estimation | Stanford2D3D Panoramic | RMSE | 0.2727 | FreDSNet |
| Depth Estimation | Stanford2D3D Panoramic | Absolute relative error | 0.0952 | FreDSNet |
| Semantic Segmentation | Stanford2D3D Panoramic | mAcc | 63.1 | FreDSNet |
| 3D | Stanford2D3D Panoramic | RMSE | 0.2727 | FreDSNet |
| 3D | Stanford2D3D Panoramic | Absolute relative error | 0.0952 | FreDSNet |
| 10-shot image generation | Stanford2D3D Panoramic | mAcc | 63.1 | FreDSNet |
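For readers unfamiliar with the metrics in the table, the sketch below computes RMSE, absolute relative error, and mean class accuracy (mAcc) using their commonly used definitions. The exact evaluation protocol behind the reported numbers (valid-pixel masking, depth range, class set) is not given on this page, so the masking rule `gt > 0` and the function names are assumptions.

```python
import numpy as np


def depth_metrics(pred, gt):
    """Return (RMSE, absolute relative error) over pixels with valid ground truth.

    Assumption: pixels with gt <= 0 are treated as missing and ignored.
    """
    valid = gt > 0
    p, g = pred[valid], gt[valid]
    rmse = float(np.sqrt(np.mean((p - g) ** 2)))
    abs_rel = float(np.mean(np.abs(p - g) / g))
    return rmse, abs_rel


def mean_class_accuracy(pred_labels, gt_labels, num_classes):
    """Average of per-class pixel accuracies, skipping classes absent from gt."""
    accuracies = []
    for c in range(num_classes):
        mask = gt_labels == c
        if mask.any():
            accuracies.append(float(np.mean(pred_labels[mask] == c)))
    return float(np.mean(accuracies))


# Toy data, not taken from the paper.
gt_depth = np.array([[2.0, 4.0], [0.0, 1.0]])    # 0.0 marks a missing measurement
pred_depth = np.array([[2.5, 3.0], [9.0, 1.0]])
rmse, abs_rel = depth_metrics(pred_depth, gt_depth)

gt_seg = np.array([0, 0, 1, 1])
pred_seg = np.array([0, 1, 1, 1])
macc = mean_class_accuracy(pred_seg, gt_seg, num_classes=3)
print(rmse, abs_rel, macc)
```

Lower is better for RMSE and absolute relative error, while higher is better for mAcc. In this sketch mAcc is a fraction between 0 and 1, whereas the value in the table appears to be reported as a percentage.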

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Advancing Complex Wide-Area Scene Understanding with Hierarchical Coresets Selection (2025-07-17)
Argus: Leveraging Multiview Images for Improved 3-D Scene Understanding With Large Language Models (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)