Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Making the Invisible Visible: Action Recognition Through Walls and Occlusions

Tianhong Li, Lijie Fan, Mingmin Zhao, Yingcheng Liu, Dina Katabi

2019-09-20 | ICCV 2019

Tasks: 3D Human Pose Estimation, Skeleton Based Action Recognition, Action Recognition

Abstract

Understanding people's actions and interactions typically depends on seeing them. Automating the process of action recognition from visual data has been the topic of much research in the computer vision community. But what if it is too dark, or if the person is occluded or behind a wall? In this paper, we introduce a neural network model that can detect human actions through walls and occlusions, and in poor lighting conditions. Our model takes radio frequency (RF) signals as input, generates 3D human skeletons as an intermediate representation, and recognizes actions and interactions of multiple people over time. By translating the input to an intermediate skeleton-based representation, our model can learn from both vision-based and RF-based datasets, and allow the two tasks to help each other. We show that our model achieves comparable accuracy to vision-based action recognition systems in visible scenarios, yet continues to work accurately when people are not visible, hence addressing scenarios that are beyond the limit of today's vision-based action recognition.
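The abstract describes a two-stage pipeline: an RF encoder maps radio frames to 3D skeletons, and an action head classifies the skeleton sequence, which lets vision-based skeleton data train the same head. A minimal sketch of that data flow, where all names, layer choices, and tensor sizes (`N_JOINTS`, `N_ACTIONS`, the toy linear projections) are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

# Toy sketch of the RF-Action two-stage design (shapes and layers are
# hypothetical): RF frames -> 3D skeletons -> action logits. Because the
# intermediate representation is skeletons, a classifier with this interface
# could also consume skeletons extracted from vision datasets.

rng = np.random.default_rng(0)

N_JOINTS, N_ACTIONS = 18, 10  # assumed sizes for illustration

def rf_to_skeletons(rf_frames):
    """Stand-in for the RF encoder: (T, H, W) RF heatmaps -> (T, N_JOINTS, 3)."""
    T = rf_frames.shape[0]
    W = rng.standard_normal((rf_frames.shape[1] * rf_frames.shape[2],
                             N_JOINTS * 3))
    flat = rf_frames.reshape(T, -1) @ W          # toy linear projection
    return flat.reshape(T, N_JOINTS, 3)

def skeletons_to_action(skeletons):
    """Stand-in for the action head: pool over time, then classify."""
    feat = skeletons.reshape(skeletons.shape[0], -1).mean(axis=0)  # temporal pool
    W = rng.standard_normal((feat.size, N_ACTIONS))
    return feat @ W                               # action logits

rf_frames = rng.standard_normal((30, 16, 16))     # 30 toy RF "frames"
skeletons = rf_to_skeletons(rf_frames)            # intermediate representation
logits = skeletons_to_action(skeletons)
print(skeletons.shape, logits.shape)              # (30, 18, 3) (10,)
```

The design choice the abstract emphasizes is exactly this interface boundary: anything that produces `(T, N_JOINTS, 3)` skeletons, whether from RF or from video, can feed the same action head.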

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Video | PKU-MMD | mAP@0.50 (CS) | 92.9 | RF-Action |
| Video | PKU-MMD | mAP@0.50 (CV) | 94.4 | RF-Action |
| Video | NTU RGB+D | Accuracy (CS) | 86.8 | RF-Action |
| Video | NTU RGB+D | Accuracy (CV) | 91.6 | RF-Action |
| Temporal Action Localization | PKU-MMD | mAP@0.50 (CS) | 92.9 | RF-Action |
| Temporal Action Localization | PKU-MMD | mAP@0.50 (CV) | 94.4 | RF-Action |
| Temporal Action Localization | NTU RGB+D | Accuracy (CS) | 86.8 | RF-Action |
| Temporal Action Localization | NTU RGB+D | Accuracy (CV) | 91.6 | RF-Action |
| Zero-Shot Learning | PKU-MMD | mAP@0.50 (CS) | 92.9 | RF-Action |
| Zero-Shot Learning | PKU-MMD | mAP@0.50 (CV) | 94.4 | RF-Action |
| Zero-Shot Learning | NTU RGB+D | Accuracy (CS) | 86.8 | RF-Action |
| Zero-Shot Learning | NTU RGB+D | Accuracy (CV) | 91.6 | RF-Action |
| Activity Recognition | PKU-MMD | mAP@0.50 (CS) | 92.9 | RF-Action |
| Activity Recognition | PKU-MMD | mAP@0.50 (CV) | 94.4 | RF-Action |
| Activity Recognition | NTU RGB+D | Accuracy (CS) | 86.8 | RF-Action |
| Activity Recognition | NTU RGB+D | Accuracy (CV) | 91.6 | RF-Action |
| Action Localization | PKU-MMD | mAP@0.50 (CS) | 92.9 | RF-Action |
| Action Localization | PKU-MMD | mAP@0.50 (CV) | 94.4 | RF-Action |
| Action Localization | NTU RGB+D | Accuracy (CS) | 86.8 | RF-Action |
| Action Localization | NTU RGB+D | Accuracy (CV) | 91.6 | RF-Action |
| Pose Estimation | RF-MMD | mAP (@0.1, Through-wall) | 86.5 | RF-Action |
| Pose Estimation | RF-MMD | mAP (@0.1, Visible) | 90.1 | RF-Action |
| Action Detection | PKU-MMD | mAP@0.50 (CS) | 92.9 | RF-Action |
| Action Detection | PKU-MMD | mAP@0.50 (CV) | 94.4 | RF-Action |
| Action Detection | NTU RGB+D | Accuracy (CS) | 86.8 | RF-Action |
| Action Detection | NTU RGB+D | Accuracy (CV) | 91.6 | RF-Action |
| 3D Action Recognition | PKU-MMD | mAP@0.50 (CS) | 92.9 | RF-Action |
| 3D Action Recognition | PKU-MMD | mAP@0.50 (CV) | 94.4 | RF-Action |
| 3D Action Recognition | NTU RGB+D | Accuracy (CS) | 86.8 | RF-Action |
| 3D Action Recognition | NTU RGB+D | Accuracy (CV) | 91.6 | RF-Action |
| Action Recognition | PKU-MMD | mAP@0.50 (CS) | 92.9 | RF-Action |
| Action Recognition | PKU-MMD | mAP@0.50 (CV) | 94.4 | RF-Action |
| Action Recognition | NTU RGB+D | Accuracy (CS) | 86.8 | RF-Action |
| Action Recognition | NTU RGB+D | Accuracy (CV) | 91.6 | RF-Action |
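The PKU-MMD rows above report mAP at a temporal IoU threshold of 0.5 on the cross-subject (CS) and cross-view (CV) splits. As a reference point for reading those numbers, here is a minimal sketch of average precision at a temporal IoU threshold for one action class, using the standard greedy score-ranked matching; it is the conventional detection metric, not the authors' exact evaluation code, and the segment values below are made up:

```python
import numpy as np

def temporal_iou(a, b):
    """IoU of two 1-D (start, end) segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def average_precision(preds, gts, thresh=0.5):
    """AP@thresh for one class.

    preds: list of (score, (start, end)); gts: list of (start, end).
    mAP is this quantity averaged over action classes.
    """
    preds = sorted(preds, key=lambda p: -p[0])   # rank by confidence
    matched, tps = set(), []
    for _, seg in preds:
        best, best_i = 0.0, -1
        for i, gt in enumerate(gts):
            iou = temporal_iou(seg, gt)
            if i not in matched and iou > best:
                best, best_i = iou, i
        if best >= thresh:                       # greedy one-to-one match
            matched.add(best_i)
            tps.append(1)
        else:
            tps.append(0)
    tps = np.array(tps, dtype=float)
    if len(gts) == 0 or len(tps) == 0:
        return 0.0
    prec = np.cumsum(tps) / (np.arange(len(tps)) + 1)
    # AP as precision averaged at each true-positive hit
    return float((prec * tps).sum() / len(gts))

preds = [(0.9, (0.0, 1.0)), (0.8, (5.0, 6.0)), (0.3, (2.0, 2.5))]
gts = [(0.1, 1.1), (5.0, 6.2)]
print(average_precision(preds, gts))             # -> 1.0 (both GTs matched first)
```

Raising `thresh` from 0.1 (as in the RF-MMD pose rows) to 0.5 demands much tighter localization, which is why the two thresholds are not comparable across rows.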

Related Papers

- A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
- Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
- EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
- Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
- CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
- Systematic Comparison of Projection Methods for Monocular 3D Human Pose Estimation on Fisheye Images (2025-06-24)
- Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
- Adapting Vision-Language Models for Evaluating World Models (2025-06-22)