Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers

575,626 papers

Diverse Prototypical Ensembles Improve Robustness to Subpopulation Shift

Minh Nguyen Nhat To, Paul F R Wilson, Viet Nguyen, Mohamed Harmanani, Michael Cooper et al.

2025-05-29
Paper | Code
Number of Clusters in a Dataset: A Regularized K-means Approach

Behzad Kamgar-Parsi, Behrooz Kamgar-Parsi

2025-05-29
Paper
Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought

Yunze Man, De-An Huang, Guilin Liu, Shiwei Sheng, Shilong Liu et al.

2025-05-29 | CVPR 2025 1 | Multimodal Reasoning
Paper
MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence

Sihan Yang, Runsen Xu, Yiman Xie, Sizhe Yang, Mo Li et al.

2025-05-29 | Spatial Reasoning, Visual Question Answering (VQA), Multiple-choice
Paper
Sketch Down the FLOPs: Towards Efficient Networks for Human Sketch

Aneeshan Sain, Subhajit Maity, Pinaki Nath Chowdhury, Subhadeep Koley, Ayan Kumar Bhunia et al.

2025-05-29 | CVPR 2025 1 | Sketch-Based Image Retrieval, Knowledge Distillation, Image Retrieval
Paper
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers

Yusuf Dalva, Hidir Yesiltepe, Pinar Yanardag

2025-05-29 | Denoising, Image Generation, Visual Storytelling
Paper
Impromptu VLA: Open Weights and Open Data for Driving Vision-Language-Action Models

Haohan Chi, Huan-ang Gao, Ziming Liu, Jianing Liu, Chenyu Liu et al.

2025-05-29 | Question Answering, Autonomous Driving, Diagnostic (+2 more)
Paper | Code
Rooms from Motion: Un-posed Indoor 3D Object Detection as Localization and Mapping

Justin Lazarow, Kai Kang, Afshin Dehghan

2025-05-29 | object-detection, 3D Object Detection, Object Detection
Paper
ThinkGeo: Evaluating Tool-Augmented Agents for Remote Sensing Tasks

Akashah Shabbir, Muhammad Akhtar Munir, Akshay Dudhane, Muhammad Umer Sheikh, Muhammad Haris Khan et al.

2025-05-29 | Spatial Reasoning
Paper | Code
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence

Diankun Wu, Fangfu Liu, Yi-Hsin Hung, Yueqi Duan

2025-05-29 | Spatial Reasoning
Paper
To Trust Or Not To Trust Your Vision-Language Model's Prediction

Hao Dong, Moru Liu, Jian Liang, Eleni Chatzi, Olga Fink et al.

2025-05-29 | Transfer Learning
Paper | Code
Boosting Domain Incremental Learning: Selecting the Optimal Parameters is All You Need

Qiang Wang, Xiang Song, Yuhang He, Jizhou Han, Chenhao Ding et al.

2025-05-29 | CVPR 2025 1 | Image Classification, parameter-efficient fine-tuning, All (+4 more)
Paper
DarkDiff: Advancing Low-Light Raw Enhancement by Retasking Diffusion Models for Camera ISP

Amber Yijia Zheng, Yu Zhang, Jun Hu, Raymond A. Yeh, Chen Chen et al.

2025-05-29
Paper
MAGREF: Masked Guidance for Any-Reference Video Generation

Yufan Deng, Xun Guo, Yuanyang Yin, Jacob Zhiyuan Fang, Yiding Yang et al.

2025-05-29 | Single-Domain Subject-to-Video, Open-Domain Subject-to-Video, Human-Domain Subject-to-Video (+1 more)
Paper | Code
FMG-Det: Foundation Model Guided Robust Object Detection

Darryl Hannan, Timothy Doster, Henry Kvinge, Adam Attarian, Yijing Watkins et al.

2025-05-29 | Multiple Instance Learning, Robust Object Detection, object-detection (+1 more)
Paper
AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views

Lihan Jiang, Yucheng Mao, Linning Xu, Tao Lu, Kerui Ren et al.

2025-05-29 | Neural Rendering, Novel View Synthesis
Paper
Skin Lesion Phenotyping via Nested Multi-modal Contrastive Learning

Dionysis Christopoulos, Sotiris Spanos, Eirini Baltzi, Valsamis Ntouskos, Konstantinos Karantzalos et al.

2025-05-29 | Lesion Classification, Contrastive Learning, Skin Lesion Classification
Paper
CLDTracker: A Comprehensive Language Description for Visual Tracking

Mohamad Alansari, Sajid Javed, Iyyakutti Iyappan Ganapathi, Sara Alansari, Muzammal Naseer et al.

2025-05-29 | Visual Tracking, Image Captioning
Paper | Code
DA-VPT: Semantic-Guided Visual Prompt Tuning for Vision Transformers

Li Ren, Chen Chen, Liqiang Wang, Kien Hua

2025-05-29 | CVPR 2025 1 | Metric Learning, parameter-efficient fine-tuning, Visual Prompt Tuning
Paper | Code
VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos

Tingyu Song, Tongyan Hu, Guo Gan, Yilun Zhao

2025-05-29 | Question Answering, Video Question Answering, Video Generation
Paper | Code