Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers

575,626 papers

DeepVideo-R1: Video Reinforcement Fine-Tuning via Difficulty-aware Regressive GRPO

Jinyoung Park, Jeehye Na, Jinyoung Kim, Hyunwoo J. Kim

2025-06-09 | Reinforcement Learning, Data Augmentation, Large Language Model
Paper
Explicit Preference Optimization: No Need for an Implicit Reward Model

Xiangkun Hu, Lemin Kong, Tong He, David Wipf

2025-06-09
Paper | Code
Reinforcement Learning via Implicit Imitation Guidance

Perry Dong, Alec M. Lessing, Annie S. Chen, Chelsea Finn

2025-06-09 | Reinforcement Learning, Imitation Learning, Continuous Control, +1
Paper
Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions

Lu Ma, Hao Liang, Meiyi Qiang, Lexiang Tang, Xiaochen Ma et al.

2025-06-09 | Reinforcement Learning, Large Language Model
Paper | Code
Coordinating Search-Informed Reasoning and Reasoning-Guided Search in Claim Verification

Qisheng Hu, Quanyu Long, Wenya Wang

2025-06-09 | Claim Verification
Paper
LLM-driven Indoor Scene Layout Generation via Scaled Human-aligned Data Synthesis and Multi-Stage Preference Optimization

Yixuan Yang, Zhen Luo, Tongsheng Ding, Junru Lu, Mingqi Gao et al.

2025-06-09 | Robot Navigation
Paper
Through the Valley: Path to Effective Long CoT Training for Small Language Models

Renjie Luo, Jiaxi Li, Chen Huang, Wei Lu

2025-06-09 | Reinforcement Learning
Paper
AbstRaL: Augmenting LLMs' Reasoning by Reinforcing Abstract Thinking

Silin Gao, Antoine Bosselut, Samy Bengio, Emmanuel Abbe

2025-06-09 | Reinforcement Learning
Paper
Accelerating Diffusion Models in Offline RL via Reward-Aware Consistency Trajectory Distillation

Xintong Duan, Yutong He, Fahim Tajwar, Ruslan Salakhutdinov, J. Zico Kolter et al.

2025-06-09 | MuJoCo, Offline RL, Decision Making
Paper
Decentralizing Multi-Agent Reinforcement Learning with Temporal Causal Information

Jan Corazza, Hadi Partovi Aria, Hyohun Kim, Daniel Neider, Zhe Xu et al.

2025-06-09 | Reinforcement Learning, Multi-agent Reinforcement Learning, reinforcement-learning
Paper
MiniCPM4: Ultra-Efficient LLMs on End Devices

MiniCPM Team, Chaojun Xiao, YuXuan Li, Xu Han, Yuzhuo Bai et al.

2025-06-09 | Large Language Model
Paper | Code
CyberV: Cybernetics for Test-time Scaling in Video Understanding

Jiahao Meng, Shuyang Sun, Yue Tan, Lu Qi, Yunhai Tong et al.

2025-06-09 | Video Understanding
Paper | Code
Novel software for continuous wavelet analysis enable EEG real-time analysis on portable computers

Shoichiro Nakanishi

2025-06-09 | EEG
Paper
Image Reconstruction as a Tool for Feature Analysis

Eduard Allakhverdov, Dmitrii Tarasov, Elizaveta Goncharova, Andrey Kuznetsov

2025-06-09 | Image Reconstruction, Informativeness, Contrastive Learning
Paper
A Generative Physics-Informed Reinforcement Learning-Based Approach for Construction of Representative Drive Cycle

Amirreza Yasami, Mohammadali Tofigh, Mahdi Shahbakhti, Charles Robert Koch

2025-06-09
Paper
Lightweight Sequential Transformers for Blood Glucose Level Prediction in Type-1 Diabetes

Mirko Paolo Barbato, Giorgia Rigamonti, Davide Marelli, Paolo Napoletano

2025-06-09 | Management
Paper
Towards a Base-Station-on-Chip: RISC-V Hardware Acceleration for wireless communication

Javier Acevedo, Frank H. P. Fitzek

2025-06-09
Paper
Spatio-Temporal State Space Model For Efficient Event-Based Optical Flow

Muhammad Ahmed Humais, Xiaoqian Huang, Hussain Sajwani, Sajid Javed, Yahya Zweiri et al.

2025-06-09 | Optical Flow Estimation, Motion Estimation, Event-based Optical Flow
Paper | Code
Uncovering the Functional Roles of Nonlinearity in Memory

Manuel Brenner, Georgia Koppe

2025-06-09 | Speech Recognition, speech-recognition, Time Series Forecasting
Paper
Cost-Optimal Active AI Model Evaluation

Anastasios N. Angelopoulos, Jacob Eisenstein, Jonathan Berant, Alekh Agarwal, Adam Fisch et al.

2025-06-09
Paper