Papers

575,626 papers

DeSPITE: Exploring Contrastive Deep Skeleton-Pointcloud-IMU-Text Embeddings for Advanced Point Cloud Human Activity Understanding

Thomas Kreutz, Max Mühlhäuser, Alejandro Sanchez Guinea

2025-06-16 | Human Activity Recognition, Person Re-Identification, Moment Retrieval, +3

Fake it till You Make it: Reward Modeling as Discriminative Prediction

Runtao Liu, Jiahao Zhan, Yingqing He, Chen Wei, Alan Yuille et al.

2025-06-16

ZipVoice: Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching

Han Zhu, Wei Kang, Zengwei Yao, Liyong Guo, Fangjun Kuang et al.

2025-06-16 | Text to Speech, Speech Synthesis, Text-To-Speech Synthesis, +1

Paper Code

Do Music Preferences Reflect Cultural Values? A Cross-National Analysis Using Music Embedding and World Values Survey

Yongjae Kim, Seongchan Park

2025-06-16

Boundary-Informed Sound Field Reconstruction

David Sundström, Filip Elvander, Andreas Jakobsson

2025-06-16

Instance-Specific Test-Time Training for Speech Editing in the Wild

Taewoo Kim, Uijong Lee, Hayoung Park, Choongsang Cho, Nam In Park et al.

2025-06-16

Stereo sound event localization and detection based on PSELDnet pretraining and BiMamba sequence modeling

Wenmiao Gao, Yang Xiao

2025-06-16 | Sound Event Localization and Detection

Qwen vs. Gemma Integration with Whisper: A Comparative Study in Multilingual SpeechLLM Systems

Tuan Nguyen, Long-Vu Hoang, Huy-Dat Tran

2025-06-16 | Speech Recognition, speech-recognition, Language Modelling

Stream-Omni: Simultaneous Multimodal Interactions with Large Language-Vision-Speech Model

Shaolei Zhang, Shoutao Guo, Qingkai Fang, Yan Zhou, Yang Feng et al.

2025-06-16 | Large Language Model

Paper Code

SpeechRefiner: Towards Perceptual Quality Refinement for Front-End Algorithms

Sirui Li, Shuai Wang, Zhijun Liu, Zhongjie Jiang, Yannan Wang et al.

2025-06-16 | Denoising

Seewo's Submission to MLC-SLM: Lessons learned from Speech Reasoning Language Models

Bo Li, Chengben Xu, Wufeng Zhang

2025-06-16 | Speech Recognition, Automatic Speech Recognition, Automatic Speech Recognition (ASR), +4

Polyra Swarms: A Shape-Based Approach to Machine Learning

Simon Klüttermann, Emmanuel Müller

2025-06-16 | Anomaly Detection

A Survey on World Models Grounded in Acoustic Physical Information

Xiaoliang Chen, Le Chang, Xin Yu, Yunhe Huang, Xianling Tu et al.

2025-06-16 | Autonomous Driving

Paper Code

A Novel ViDAR Device With Visual Inertial Encoder Odometry and Reinforcement Learning-Based Active SLAM Method

Zhanhua Xin, Zhihao Wang, Shenghao Zhang, Wanchao Chi, Yan Meng et al.

2025-06-16 | Sensor Fusion, Simultaneous Localization and Mapping

Open-Set LiDAR Panoptic Segmentation Guided by Uncertainty-Aware Learning

Rohit Mohan, Julia Hindel, Florian Drews, Claudius Gläser, Daniele Cattaneo et al.

2025-06-16 | Autonomous Vehicles, Panoptic Segmentation, Navigate, +2

JENGA: Object selection and pose estimation for robotic grasping from a stack

Sai Srinivas Jeevanandam, Sandeep Inuganti, Shreedhar Govil, Didier Stricker, Jason Rambach et al.

2025-06-16 | Benchmarking, Robotic Grasping, Pose Estimation

Towards a Formal Specification for Self-organized Shape Formation in Swarm Robotics

YR Darr, MA Niazi

2025-06-16

Block-wise Adaptive Caching for Accelerating Diffusion Policy

Kangye Ji, Yuan Meng, Hanyun Cui, Ye Li, Shengjia Hua et al.

2025-06-16 | Denoising, Action Generation, Vision-Language-Action

What Matters in Learning from Large-Scale Datasets for Robot Manipulation

Vaibhav Saxena, Matthew Bronars, Nadun Ranawaka Arachchige, Kuancheng Wang, Woo Chul Shin et al.

2025-06-16 | Imitation Learning, Robot Manipulation, Retrieval

Parallel Branch Model Predictive Control on GPUs

Luyao Zhang, Chenghuai Lin, Sergio Grammatico

2025-06-16
