Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers

575,626 papers

Knowledge-guided Contextual Gene Set Analysis Using Large Language Models

Zhizheng Wang, Chi-Ping Day, Chih-Hsuan Wei, Qiao Jin, Robert Leaman et al.

2025-06-04 | Benchmarking
AUTOCT: Automating Interpretable Clinical Trial Prediction with LLM Agents

Fengze Liu, Haoyu Wang, Joonhyuk Cho, Dan Roth, Andrew W. Lo et al.

2025-06-04 | Drug Discovery, Prediction
Schema Generation for Large Knowledge Graphs Using Large Language Models

Bohui Zhang, Yuan He, Lydia Pintscher, Albert Meroño Peñuela, Elena Simperl et al.

2025-06-04 | Knowledge Graphs
CogMath: Assessing LLMs' Authentic Mathematical Ability from a Human Cognitive Perspective

Jiayu Liu, Zhenya Huang, Wei Dai, Cheng Cheng, Jinze Wu et al.

2025-06-04
An AI-Based Public Health Data Monitoring System

Ananya Joshi, Nolan Gormley, Richa Gadgil, Tina Townes, Roni Rosenfeld et al.

2025-06-04 | Anomaly Detection, Decision Making
Plugging Schema Graph into Multi-Table QA: A Human-Guided Framework for Reducing LLM Reliance

Xixi Wang, Miguel Costa, Jordanka Kovaceva, Shuai Wang, Francisco C. Pereira et al.

2025-06-04 | Question Answering, Semantic Similarity, Semantic Textual Similarity
A Statistical Physics of Language Model Reasoning

Jack David Carson, Amir Reisizadeh

2025-06-04 | Language Modelling
Automated Skill Discovery for Language Agents through Exploration and Iterative Feedback

Yongjin Yang, Sinjae Kang, Juyong Lee, Dongjun Lee, Se-Young Yun et al.

2025-06-04 | Large Language Model
Rectified Sparse Attention

Yutao Sun, Tianzhu Ye, Li Dong, Yuqing Xia, Jian Chen et al.

2025-06-04 | Math, Language Modelling
Knockout LLM Assessment: Using Large Language Models for Evaluations through Iterative Pairwise Comparisons

Isik Baran Sandan, Tu Anh Dinh, Jan Niehues

2025-06-04 | Machine Translation
Beyond Memorization: A Rigorous Evaluation Framework for Medical Knowledge Editing

Shigeng Chen, Linhao Luo, Zhangchi Qiu, Yanan Cao, Carl Yang et al.

2025-06-04 | knowledge editing, Memorization | Code available
Towards Efficient Speech-Text Jointly Decoding within One Speech Language Model

Haibin Wu, Yuxuan Hu, Ruchao Fan, Xiaofei Wang, Kenichi Kumatani et al.

2025-06-04 | Question Answering, Spoken Dialogue Systems, Language Modelling
Behavioural vs. Representational Systematicity in End-to-End Models: An Opinionated Survey

Ivan Vegner, Sydelle de Souza, Valentin Forch, Martha Lewis, Leonidas A. A. Doumas et al.

2025-06-04 | Systematic Generalization
DRE: An Effective Dual-Refined Method for Integrating Small and Large Language Models in Open-Domain Dialogue Evaluation

Kun Zhao, Bohao Yang, Chen Tang, Siyuan Dai, Haoteng Tang et al.

2025-06-04 | Dialogue Evaluation
SQLens: An End-to-End Framework for Error Detection and Correction in Text-to-SQL

Yue Gong, Chuan Lei, Xiao Qin, Kapil Vaidya, Balakrishnan Narayanaswamy et al.

2025-06-04 | Text-To-SQL
Aligning Large Language Models with Implicit Preferences from User-Generated Content

Zhaoxuan Tan, Zheng Li, Tianyi Liu, Haodong Wang, Hyokun Yun et al.

2025-06-04
Zero-Shot Open-Schema Entity Structure Discovery

Xueqiang Xu, Jinfeng Xiao, James Barry, Mohab Elkaref, Jiaru Zou et al.

2025-06-04 | Attribute, graph construction
Empaths at SemEval-2025 Task 11: Retrieval-Augmented Approach to Perceived Emotions Prediction

Lev Morozov, Aleksandr Mogilevskii, Alexander Shirnin

2025-06-04
Unpacking Let Alone: Human-Scale Models Generalize to a Rare Construction in Form but not Meaning

Wesley Scivetti, Tatsuya Aoyama, Ethan Wilcox, Nathan Schneider

2025-06-04 | Form
MedAgentGym: Training LLM Agents for Code-Based Medical Reasoning at Scale

Ran Xu, Yuchen Zhuang, Yishan Zhong, Yue Yu, Xiangru Tang et al.

2025-06-04 | Benchmarking, Large Language Model, Privacy Preserving, +1