Papers

575,626 papers

PARROT: Synergizing Mamba and Attention-based SSL Pre-Trained Models via Parallel Branch Hadamard Optimal Transport for Speech Emotion Recognition

Orchid Chetia Phukan, Mohd Mujtaba Akhtar, Girish, Swarup Ranjan Behera, Jaya Sai Kiran Patibandla et al.

2025-06-01 | Self-Supervised Learning, Speech Emotion Recognition, Emotion Recognition

Towards Fusion of Neural Audio Codec-based Representations with Spectral for Heart Murmur Classification via Bandit-based Cross-Attention Mechanism

Orchid Chetia Phukan, Girish, Mohd Mujtaba Akhtar, Swarup Ranjan Behera, Priyabrata Mallick et al.

2025-06-01 | Rhythm

Source Tracing of Synthetic Speech Systems Through Paralinguistic Pre-Trained Representations

Girish, Mohd Mujtaba Akhtar, Orchid Chetia Phukan, Drishti Singh, Swarup Ranjan Behera et al.

2025-06-01 | Speaker Recognition, Synthetic Speech Detection, Rhythm, +2

GigaAM: Efficient Self-Supervised Learner for Speech Recognition

Aleksandr Kutsakov, Alexandr Maximenko, Georgii Gospodinov, Pavel Bogomolov, Fyodor Minkin et al.

2025-06-01 | Speech Recognition, Automatic Speech Recognition, speech-recognition, +3

NTPP: Generative Speech Language Modeling for Dual-Channel Spoken Dialogue via Next-Token-Pair Prediction

Qichao Wang, Ziqiao Meng, Wenqian Cui, Yifei Zhang, Pengcheng Wu et al.

2025-06-01 | Language Modelling

Choices and their Provenance: Explaining Stable Solutions of Abstract Argumentation Frameworks

Bertram Ludäscher, Yilin Xia, Shawn Bowers

2025-06-01

DriveMind: A Dual-VLM based Reinforcement Learning Framework for Autonomous Driving

Dawood Wasif, Terrence J Moore, Chandan K Reddy, Jin-Hee Cho

2025-06-01 | Reinforcement Learning, Autonomous Driving

Towards Predicting Any Human Trajectory In Context

Ryo Fujii, Hideo Saito, Ryo Hachiuma

2025-06-01 | Pedestrian Trajectory Prediction, Trajectory Prediction

Enhancing Speech Instruction Understanding and Disambiguation in Robotics via Speech Prosody

David Sasu, Kweku Andoh Yamoah, Benedict Quartey, Natalie Schluter

2025-06-01 | Speech Recognition, speech-recognition

Accelerated Learning with Linear Temporal Logic using Differentiable Simulation

Alper Kamil Bozkurt, Calin Belta, Ming C. Lin

2025-06-01

Humanoid World Models: Open World Foundation Models for Humanoid Robotics

Muhammad Qasim Ali, Aditya Sridhar, Shahbuland Matiana, Alex Wong, Mohammad Al-Sharman et al.

2025-06-01

OG-VLA: 3D-Aware Vision Language Action Model via Orthographic Image Generation

Ishika Singh, Ankit Goyal, Stan Birchfield, Dieter Fox, Animesh Garg et al.

2025-06-01 | Large Language Model, Robot Manipulation, Image Generation, +1

Test Automation for Interactive Scenarios via Promptable Traffic Simulation

Augusto Mondelli, Yueshan Li, Alessandro Zanardi, Emilio Frazzoli

2025-06-01 | Bayesian Optimization

How Programming Concepts and Neurons Are Shared in Code Language Models

Amir Hossein Kargaran, Yihong Liu, François Yvon, Hinrich Schütze

2025-06-01 | Translation

Deformable registration and generative modelling of aortic anatomies by auto-decoders and neural ODEs

Riccardo Tenderini, Luca Pegolotti, Fanwei Kong, Stefano Pagani, Francesco Regazzoni et al.

2025-06-01 | Anatomy

EEG2TEXT-CN: An Exploratory Study of Open-Vocabulary Chinese Text-EEG Alignment via Large Language Model and Contrastive Learning on ChineseEEG

Jacky Tai-Yu Lu, Jung Chiang, Chi-Sheng Chen, Anna Nai-Yun Tung, Hsiang Wei Hu et al.

2025-06-01 | Text Generation, Large Language Model, Contrastive Learning, +2

Multiverse Through Deepfakes: The MultiFakeVerse Dataset of Person-Centric Visual and Conceptual Manipulations

Parul Gupta, Shreya Ghosh, Tom Gedeon, Thanh-Toan Do, Abhinav Dhall et al.

2025-06-01 | Human-Object Interaction Detection, DeepFake Detection, Face Swapping

Camera Trajectory Generation: A Comprehensive Survey of Methods, Metrics, and Future Directions

Zahra Dehghanian, Pouya Ardekhani, Amir Vahedi, Hamid Beigy, Hamid R. Rabiee et al.

2025-06-01 | Visual Storytelling

CountingFruit: Real-Time 3D Fruit Counting with Language-Guided Semantic Gaussian Splatting

Fengze Li, Yangle Liu, Jieming Ma, Hai-Ning Liang, Yaochun Shen et al.

2025-06-01 | Neural Rendering, 3D Reconstruction

Language-Guided Multi-Agent Learning in Simulations: A Unified Framework and Evaluation

Zhengyang Li

2025-06-01 | Zero-shot Generalization, Starcraft II, Multi-agent Reinforcement Learning, +2
