RLAIF

Reinforcement Learning from AI Feedback

Category: Reinforcement Learning · Introduced: 2000 · Papers: 19
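RLAIF replaces the human annotators of RLHF with an AI judge: a model compares candidate responses, the resulting preference pairs train a reward model, and the policy is then optimized with RL (typically PPO). As a rough illustration of the preference-labeling step only, here is a toy sketch; the `ai_judge` heuristic and `build_preference_dataset` helper are hypothetical stand-ins (a real pipeline queries an LLM judge and feeds the triples to reward-model training), not any paper's actual implementation.

```python
def ai_judge(prompt, response_a, response_b):
    """Toy stand-in for an LLM judge (hypothetical heuristic: prefer the
    longer, more on-topic response). Returns 0 if response_a wins, else 1."""
    def score(r):
        on_topic = sum(1 for w in prompt.lower().split() if w in r.lower())
        return on_topic + 0.01 * len(r)  # tie-break slightly toward length
    return 0 if score(response_a) >= score(response_b) else 1

def build_preference_dataset(prompts, sampler):
    """Sample two candidate responses per prompt, label them with the AI
    judge, and emit (prompt, chosen, rejected) triples -- the format a
    reward model would later be trained on."""
    dataset = []
    for p in prompts:
        a, b = sampler(p), sampler(p)
        winner = ai_judge(p, a, b)
        chosen, rejected = (a, b) if winner == 0 else (b, a)
        dataset.append((p, chosen, rejected))
    return dataset
```

In a full RLAIF loop these triples would train a reward model whose scalar scores then drive policy optimization; the sketch stops at the labeling stage, which is the step that distinguishes RLAIF from human-feedback pipelines.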

Papers Using This Method

- Toward Evaluative Thinking: Meta Policy Optimization with Evolving Reward Models (2025-04-28)
- R2Vul: Learning to Reason about Software Vulnerabilities with Reinforcement Learning and Structured Reasoning Distillation (2025-04-07)
- Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use (2025-04-07)
- Training Dialogue Systems by AI Feedback for Improving Overall Dialogue Impression (2025-01-22)
- PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment (2024-10-17)
- Exploring LLM-based Data Annotation Strategies for Medical Dialogue Preference Alignment (2024-10-05)
- Generative Reward Models (2024-10-02)
- MaFeRw: Query Rewriting with Multi-Aspect Feedbacks for Retrieval-Augmented Large Language Models (2024-08-30)
- Applying RLAIF for Code Generation with API-usage in Lightweight LLMs (2024-06-28)
- Diminishing Stereotype Bias in Image Generation Model using Reinforcement Learning Feedback (2024-06-27)
- Multi-objective Reinforcement Learning from AI Feedback (2024-06-11)
- Are You Sure? Rank Them Again: Repeated Ranking For Better Preference Datasets (2024-05-29)
- Optimization-based Prompt Injection Attack to LLM-as-a-Judge (2024-03-26)
- CodeUltraFeedback: An LLM-as-a-Judge Dataset for Aligning Large Language Models to Coding Preferences (2024-03-14)
- HRLAIF: Improvements in Helpfulness and Harmlessness in Open-domain Reinforcement Learning From AI Feedback (2024-03-13)
- A Critical Evaluation of AI Feedback for Aligning Large Language Models (2024-02-19)
- Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation (2024-02-19)
- FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models (2024-02-16)
- Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback (2024-02-06)