Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers

575,626 papers

Bridging Continuous-time LQR and Reinforcement Learning via Gradient Flow of the Bellman Error

Armin Gießler, Albertus Johannes Malan, Sören Hohmann

2025-06-11 | Reinforcement Learning
Paper
Dataset of News Articles with Provenance Metadata for Media Relevance Assessment

Tomas Peterka, Matyas Bohacek

2025-06-11 | Misinformation
Paper
IntPhys 2: Benchmarking Intuitive Physics Understanding In Complex Synthetic Environments

Florian Bordes, Quentin Garrido, Justine T Kao, Adina Williams, Michael Rabbat et al.

2025-06-11 | Benchmarking
Paper | Code
The Less You Depend, The More You Learn: Synthesizing Novel Views from Sparse, Unposed Images without Any 3D Knowledge

Haoru Wang, Kai Ye, Yangyan Li, Wenzheng Chen, Baoquan Chen et al.

2025-06-11 | Novel View Synthesis, Generalizable Novel View Synthesis, 3DGS
Paper
EquiCaps: Predictor-Free Pose-Aware Pre-Trained Capsule Networks

Athinoulla Konstantinou, Georgios Leontidis, Mamatha Thota, Aiden Durrant

2025-06-11 | Pose Estimation
Paper | Code
CEM-FBGTinyDet: Context-Enhanced Foreground Balance with Gradient Tuning for tiny Objects

Tao Liu, Zhenchao Cui

2025-06-11 | Robust classification, object-detection, Object Detection
Paper
From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models

Irving Fang, Juexiao Zhang, Shengbang Tong, Chen Feng

2025-06-11 | Imitation Learning, Vision-Language-Action
Paper
LEO-VL: Towards 3D Vision-Language Generalists via Data Scaling with Efficient Representation

Jiangyong Huang, Xiaojian Ma, Xiongkun Linghu, Yue Fan, Junchao He et al.

2025-06-11
Paper
CausalVQA: A Physically Grounded Causal Reasoning Benchmark for Video Models

Aaron Foss, Chloe Evans, Sasha Mitts, Koustuv Sinha, Ammar Rizvi et al.

2025-06-11 | Question Answering, Descriptive, Video Question Answering, +1 more
Paper | Code
Kvasir-VQA-x1: A Multimodal Dataset for Medical Reasoning and Robust MedVQA in Gastrointestinal Endoscopy

Sushant Gautam, Michael A. Riegler, Pål Halvorsen

2025-06-11 | Question Answering, Visual Question Answering (VQA), Medical Visual Question Answering, +1 more
Paper
Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing

Junfei Wu, Jian Guan, Kaituo Feng, Qiang Liu, Shu Wu et al.

2025-06-11 | Spatial Reasoning, Multimodal Reasoning
Paper | Code
Efficient Part-level 3D Object Generation via Dual Volume Packing

Jiaxiang Tang, Ruijie Lu, Zhaoshuo Li, Zekun Hao, Xuan Li et al.

2025-06-11
Paper | Code
AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation

Zijie Wu, Chaohui Yu, Fan Wang, Xiang Bai

2025-06-11
Paper
V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning

Mido Assran, Adrien Bardes, David Fan, Quentin Garrido, Russell Howes et al.

2025-06-11 | Question Answering, Self-Supervised Learning, Action Anticipation, +2 more
Paper | Code
InterActHuman: Multi-Concept Human Animation with Layout-Aligned Audio Conditions

Zhenzhi Wang, Jiaqi Yang, Jianwen Jiang, Chao Liang, Gaojie Lin et al.

2025-06-11 | Human-Object Interaction Detection
Paper
Hearing Hands: Generating Sounds from Physical Interactions in 3D Scenes

Yiming Dou, Wonseok Oh, Yuqing Luo, Antonio Loquercio, Andrew Owens et al.

2025-06-11 | CVPR 2025 1
Paper | Code
eFlesh: Highly customizable Magnetic Touch Sensing using Cut-Cell Microstructures

Venkatesh Pattabiraman, Zizhou Huang, Daniele Panozzo, Denis Zorin, Lerrel Pinto et al.

2025-06-11
Paper
DGS-LRM: Real-Time Deformable 3D Gaussian Reconstruction From Monocular Videos

Chieh Hubert Lin, Zhaoyang Lv, Songyin Wu, Zhen Xu, Thu Nguyen-Phuoc et al.

2025-06-11 | Dynamic Reconstruction
Paper
ELBO-T2IAlign: A Generic ELBO-Based Method for Calibrating Pixel-level Text-Image Alignment in Diffusion Models

Qin Zhou, Zhiyang Zhang, Jinglong Wang, XiaoBin Li, Jing Zhang et al.

2025-06-11 | Segmentation, Semantic Segmentation, Image Generation, +1 more
Paper
Accurate and efficient zero-shot 6D pose estimation with frozen foundation models

Andrea Caraffa, Davide Boscaini, Fabio Poiesi

2025-06-11 | Segmentation, Semantic Segmentation, Pose Estimation, +2 more
Paper