


Unsupervised domain adaptation for clinician pose estimation and instance segmentation in the operating room

Vinkle Srivastav, Afshin Gangi, Nicolas Padoy

2021-08-26
2D Human Pose Estimation, Instance Segmentation, Privacy Preserving, Unsupervised Domain Adaptation, Semi-Supervised Human Pose Estimation

Abstract

The fine-grained localization of clinicians in the operating room (OR) is a key component in designing the new generation of OR support systems. Computer vision models for person pixel-based segmentation and body-keypoint detection are needed to better understand the clinical activities and the spatial layout of the OR. This is challenging, not only because OR images are very different from traditional vision datasets, but also because data and annotations are hard to collect and generate in the OR due to privacy concerns. To address these concerns, we first study how joint person pose estimation and instance segmentation can be performed on low-resolution images with downsampling factors from 1x to 12x. Second, to address the domain shift and the lack of annotations, we propose a novel unsupervised domain adaptation method, called AdaptOR, to adapt a model from an in-the-wild labeled source domain to a statistically different unlabeled target domain. We propose to exploit explicit geometric constraints on the different augmentations of the unlabeled target domain image to generate accurate pseudo labels, and to use these pseudo labels to train the model on high- and low-resolution OR images in a self-training framework. Furthermore, we propose disentangled feature normalization to handle the statistically different source and target domain data. Extensive experimental results with detailed ablation studies on two OR datasets, MVOR+ and TUM-OR-test, show the effectiveness of our approach against strongly constructed baselines, especially on the low-resolution privacy-preserving OR images. Finally, we show the generality of our method as a semi-supervised learning (SSL) method on the large-scale COCO dataset, where we achieve comparable results with as few as 1% of labeled supervision against a model trained with 100% labeled supervision.
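The abstract mentions using geometric constraints across augmentations of an unlabeled image to build pseudo labels. The sketch below illustrates that general idea for a single augmentation, a horizontal flip, and is not the AdaptOR implementation. The function names, the `(K, 3)` keypoint layout of `x, y, score`, the `W - 1 - x` pixel convention, and the `score_thresh` value are all assumptions made for illustration; the left/right index pairs follow the standard 17-keypoint COCO ordering.

```python
import numpy as np

# Left/right keypoint index pairs in the 17-keypoint COCO ordering
# (index 0, the nose, has no counterpart).
COCO_FLIP_PAIRS = [(1, 2), (3, 4), (5, 6), (7, 8),
                   (9, 10), (11, 12), (13, 14), (15, 16)]


def unflip_keypoints(kpts, image_width, flip_pairs=COCO_FLIP_PAIRS):
    """Map keypoints predicted on a horizontally flipped image back to
    the original image frame (hypothetical helper).

    kpts: array of shape (K, 3) holding x, y, score per keypoint.
    """
    out = np.asarray(kpts, dtype=float).copy()
    # Mirror the x coordinate (assumes integer pixel-index convention).
    out[:, 0] = (image_width - 1) - out[:, 0]
    # A flip turns left joints into right joints, so swap each pair.
    for left, right in flip_pairs:
        out[[left, right]] = out[[right, left]]
    return out


def fuse_pseudo_keypoints(kpts_orig, kpts_flipped, image_width,
                          score_thresh=0.5):
    """Average the original-view prediction with the un-flipped
    flipped-view prediction and mark confident keypoints.

    Returns (fused, keep), where keep is a boolean mask of keypoints
    whose averaged score reaches score_thresh (an assumed value).
    """
    back = unflip_keypoints(kpts_flipped, image_width)
    fused = (np.asarray(kpts_orig, dtype=float) + back) / 2.0
    keep = fused[:, 2] >= score_thresh
    return fused, keep
```

In a self-training loop, the keypoints selected by `keep` would serve as training targets for the unlabeled image; the method described in the abstract may combine augmentations and filter predictions differently.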

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Pose Estimation | COCO 1% labeled data | Person Keypoint AP | 38.22 | AdaptOR-SSL |
| 3D | COCO 1% labeled data | Person Keypoint AP | 38.22 | AdaptOR-SSL |
| Instance Segmentation | COCO 1% labeled data | Person Mask AP | 36.06 | AdaptOR-SSL |
| 2D Human Pose Estimation | COCO 1% labeled data | Person Keypoint AP | 38.22 | AdaptOR-SSL |
| Multi-Person Pose Estimation | COCO 1% labeled data | Person Keypoint AP | 38.22 | AdaptOR-SSL |
| 1 Image, 2*2 Stitchi | COCO 1% labeled data | Person Keypoint AP | 38.22 | AdaptOR-SSL |

Related Papers

SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
Federated Learning for Commercial Image Sources (2025-07-17)
Transformer-Based Person Identification via Wi-Fi CSI Amplitude and Phase Perturbations (2025-07-17)
Privacy-Preserving Fusion for Multi-Sensor Systems Under Multiple Packet Dropouts (2025-07-17)
Federated Learning in Open- and Closed-Loop EMG Decoding: A Privacy and Performance Perspective (2025-07-16)
Safeguarding Federated Learning-based Road Condition Classification (2025-07-16)
A Privacy-Preserving Framework for Advertising Personalization Incorporating Federated Learning and Differential Privacy (2025-07-16)