Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers

575,626 papers

Are Reasoning Models More Prone to Hallucination?

Zijun Yao, Yantao Liu, Yanxu Chen, Jianhui Chen, Junfeng Fang et al.

2025-05-29 | Hallucination
Paper
Instance-Optimality for Private KL Distribution Estimation

Jiayuan Ye, Vitaly Feldman, Kunal Talwar

2025-05-29
Paper
CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring

Benjamin Arnav, Pablo Bernabeu-Pérez, Nathan Helm-Burger, Tim Kostolansky, Hannes Whittingham et al.

2025-05-29 | Red Teaming
Paper
Learning Parametric Distributions from Samples and Preferences

Marc Jourdan, Gizem Yüce, Nicolas Flammarion

2025-05-29 | Language Modelling
Paper | Code
Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking

Liangliang Zhang, Zhuorui Jiang, Hongliang Chi, Haoyang Chen, Mohammed Elkoumy et al.

2025-05-29 | Question Answering, Benchmarking, Graph Question Answering, +1
Paper
Emergent Risk Awareness in Rational Agents under Resource Constraints

Daniel Jarne Ornia, Nicholas Bishop, Joel Dyer, Wei-Chen Lee, Ani Calinescu et al.

2025-05-29 | Sequential Decision Making
Paper
Enhancing Marker Scoring Accuracy through Ordinal Confidence Modelling in Educational Assessments

Abhirup Chakravarty, Mark Brenchley, Trevor Breakspear, Ian Lewin, Yan Huang et al.

2025-05-29 | Automated Essay Scoring
Paper
How Does Response Length Affect Long-Form Factuality

James Xu Zhao, Jimmy Z. J. Liu, Bryan Hooi, See-Kiong Ng

2025-05-29 | Text Generation, Form
Paper | Code
Stable Thompson Sampling: Valid Inference via Variance Inflation

Budhaditya Halder, Shubhayan Pan, Koulik Khamaru

2025-05-29 | Decision Making
Paper
JAPAN: Joint Adaptive Prediction Areas with Normalising-Flows

Eshant English, Christoph Lippert

2025-05-29 | Uncertainty Quantification, Prediction
Paper
Case-Based Reasoning Enhances the Predictive Power of LLMs in Drug-Drug Interaction

Guangyi Liu, Yongqi Zhang, Xunyuan Liu, Quanming Yao

2025-05-29
Paper
Theoretical Foundations of the Deep Copula Classifier: A Generative Approach to Modeling Dependent Features

Agnideep Aich, Ashit Baran Aich, Bruce Wade

2025-05-29
Paper
Exploring Scaling Laws for EHR Foundation Models

Sheng Zhang, Qin Liu, Naoto Usuyama, Cliff Wong, Tristan Naumann et al.

2025-05-29
Paper
ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind

Peixuan Han, Zijia Liu, Jiaxuan You

2025-05-29
Paper | Code
Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness

Yongjin Yang, Euiin Yi, Jongwoo Ko, Kimin Lee, Zhijing Jin et al.

2025-05-29 | Mathematical Reasoning, Large Language Model
Paper
Differential Information: An Information-Theoretic Perspective on Preference Optimization

Yunjae Won, Hyunji Lee, Hyeonbin Hwang, Minjoon Seo

2025-05-29 | Question Answering, Instruction Following
Paper
Model Immunization from a Condition Number Perspective

Amber Yijia Zheng, Cedar Site Bai, Brian Bullins, Raymond A. Yeh

2025-05-29
Paper | Code
EmotionRankCLAP: Bridging Natural Language Speaking Styles and Ordinal Speech Emotion via Rank-N-Contrast

Shreeram Suresh Chandra, Lucas Goncalves, Junchen Lu, Carlos Busso, Berrak Sisman et al.

2025-05-29 | Cross-Modal Retrieval, cross-modal alignment, Contrastive Learning
Paper
MuLoCo: Muon is a practical inner optimizer for DiLoCo

Benjamin Thérien, Xiaolong Huang, Irina Rish, Eugene Belilovsky

2025-05-29 | Quantization
Paper
SC-LoRA: Balancing Efficient Fine-tuning and Knowledge Preservation via Subspace-Constrained LoRA

Minrui Luo, Fuhang Kuang, Yu Wang, Zirui Liu, Tianxing He et al.

2025-05-29 | Navigate, parameter-efficient fine-tuning, World Knowledge
Paper