Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Robotic Vision and Multi-View Synergy: Action and activity recognition in assisted living scenarios

Mohammad Hossein Bamorovat Abadi, Mohamad Reza Shahabian Alashti, Patrick Holthaus, Catherine Menon, Farshid Amirabdollahian

2024-09-01 · 10th IEEE RAS/EMBS International Conference for Biomedical Robotics and Biomechatronics (BioRob) 2024
Tasks: Human Activity Recognition, Activity Recognition
Paper · PDF · Code (official)

Abstract

The significance of Human-Robot Interaction (HRI) is increasingly evident when integrating robotics within human-centric settings. A crucial component of effective HRI is Human Activity Recognition (HAR), which is instrumental in enabling robots to respond aptly in human presence, especially within Ambient Assisted Living (AAL) environments. Since robots are generally mobile and their visual perception is often compromised by motion and noise, this paper evaluates methods by merging the robot's mobile perspective with a static viewpoint utilising multi-view deep learning models. We introduce a dual-stream Convolutional 3D (C3D) model to improve vision-based HAR accuracy for robotic applications. Utilising the Robot House Multiview (RHM) dataset, which encompasses a robotic perspective along with three static views (Front, Back, Top), we examine the efficacy of our model and conduct comparisons with the dual-stream ConvNet and Slow-Fast models. The primary objective of this study is to enhance the accuracy of robot viewpoints by integrating them with static views using dual-stream models. The metrics for evaluation include Top-1 and Top-5 accuracy. Our findings reveal that the integration of static views with robotic perspectives significantly boosts HAR accuracy in both Top-1 and Top-5 metrics across all models tested. Moreover, the proposed dual-stream C3D model demonstrates superior performance compared to the other contemporary models in our evaluations.
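The abstract describes score-level fusion of a robot viewpoint with a static viewpoint, evaluated by Top-1 and Top-5 accuracy. As a minimal sketch of those two ideas (not the authors' released code; the weighting parameter `alpha` and all function names here are illustrative assumptions, and the C3D backbones that would produce the per-view class scores are omitted):

```python
import numpy as np

def topk_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    topk = np.argsort(scores, axis=1)[:, -k:]  # indices of the k largest scores per sample
    return float(np.mean([labels[i] in topk[i] for i in range(len(labels))]))

def fuse_views(robot_scores, static_scores, alpha=0.5):
    """Late (score-level) fusion of robot-view and static-view class scores."""
    return alpha * robot_scores + (1.0 - alpha) * static_scores

# Toy example: 4 clips, 10 activity classes, random per-view scores.
rng = np.random.default_rng(0)
robot = rng.random((4, 10))    # stand-in for robot-view model outputs
static = rng.random((4, 10))   # stand-in for static-view (Front/Back/Top) outputs
labels = np.array([3, 1, 7, 0])

fused = fuse_views(robot, static)
print("Top-1:", topk_accuracy(fused, labels, k=1))
print("Top-5:", topk_accuracy(fused, labels, k=5))
```

In a full dual-stream model the fusion would typically operate on learned features or softmax outputs of the two 3D-CNN streams rather than raw scores, but the evaluation metrics are computed exactly as above.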

Related Papers

ZKP-FedEval: Verifiable and Privacy-Preserving Federated Evaluation using Zero-Knowledge Proofs (2025-07-15)
SEZ-HARN: Self-Explainable Zero-shot Human Activity Recognition Network (2025-06-25)
Efficient Retail Video Annotation: A Robust Key Frame Generation Approach for Product and Customer Interaction Analysis (2025-06-17)
DeSPITE: Exploring Contrastive Deep Skeleton-Pointcloud-IMU-Text Embeddings for Advanced Point Cloud Human Activity Understanding (2025-06-16)
MORIC: CSI Delay-Doppler Decomposition for Robust Wi-Fi-based Human Activity Recognition (2025-06-15)
AgentSense: Virtual Sensor Data Generation Using LLM Agents in Simulated Home Environments (2025-06-13)
ScalableHD: Scalable and High-Throughput Hyperdimensional Computing Inference on Multi-Core CPUs (2025-06-10)
Scaling Human Activity Recognition: A Comparative Evaluation of Synthetic Data Generation and Augmentation Techniques (2025-06-09)