Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning

Kyunghwan Son, Daewoo Kim, Wan Ju Kang, David Earl Hostallero, Yung Yi

2019-05-14 · Reinforcement Learning · SMAC+ · Multi-agent Reinforcement Learning · reinforcement-learning

Paper · PDF · Code (official)

Abstract

We explore value-based solutions for multi-agent reinforcement learning (MARL) tasks in the recently popularized centralized training with decentralized execution (CTDE) regime. VDN and QMIX are representative examples that factorize the joint action-value function into individual ones for decentralized execution. However, VDN and QMIX address only a fraction of factorizable MARL tasks due to their structural constraints on factorization, such as additivity and monotonicity. In this paper, we propose a new factorization method for MARL, QTRAN, which is free from such structural constraints and takes a new approach: transforming the original joint action-value function into an easily factorizable one with the same optimal actions. QTRAN guarantees a more general factorization than VDN or QMIX, thus covering a much wider class of MARL tasks than previous methods do. Our experiments on the multi-domain Gaussian-squeeze and modified predator-prey tasks demonstrate QTRAN's superior performance, with especially large margins in games whose payoffs penalize non-cooperative behavior more aggressively.
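The core idea in the abstract can be illustrated on a toy matrix game. The sketch below is not the paper's implementation (which learns these quantities with neural networks); it is a minimal numpy check, with hand-picked per-agent utilities, of the conditions QTRAN imposes: the additively factorized transformed value, plus a state-value correction, must equal the true joint action-value at the greedy joint action and upper-bound it elsewhere, so that individually greedy actions recover the joint optimum.

```python
import numpy as np

# Toy 2-agent matrix game with a non-monotonic payoff (the case where
# VDN/QMIX's additivity/monotonicity constraints break down).
# Rows/cols index the two agents' actions; entries are the true joint Q_jt.
Q_jt = np.array([[  8.0, -12.0, -12.0],
                 [-12.0,   0.0,   0.0],
                 [-12.0,   0.0,   0.0]])

# Hand-picked per-agent utilities Q_i (learned by networks in QTRAN proper).
q1 = np.array([4.0, 0.0, 0.0])
q2 = np.array([4.0, 0.0, 0.0])

# Additively factorized transformed value: Q'_jt(u1, u2) = Q_1(u1) + Q_2(u2).
Q_trans = q1[:, None] + q2[None, :]

# Greedy joint action under the factorized value.
u1_star, u2_star = np.unravel_index(np.argmax(Q_trans), Q_trans.shape)

# State-value correction (zero in this hand-picked example, nonzero in general).
V = Q_jt[u1_star, u2_star] - Q_trans[u1_star, u2_star]

# QTRAN's conditions, sketched: equality at the greedy joint action,
# and Q'_jt + V >= Q_jt at all other joint actions.
gap = Q_trans + V - Q_jt
assert np.isclose(gap[u1_star, u2_star], 0.0)  # equality at the optimum
assert (gap >= -1e-9).all()                    # inequality elsewhere

# Consequence: individually greedy actions recover the true joint optimum,
# even though Q_jt itself is not additively factorizable.
assert (u1_star, u2_star) == np.unravel_index(np.argmax(Q_jt), Q_jt.shape)
print("greedy joint action:", (u1_star, u2_star), "V:", V)
```

Note that the raw payoff matrix here is exactly the kind additive factorization cannot represent directly (choosing action 0 is only good if the other agent also does); the transformation sidesteps this by only requiring the factorized value to agree with Q_jt at the optimum.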

Results

Task                                  Dataset                       Metric               Value   Model
Multi-agent Reinforcement Learning    Def_Outnumbered_sequential    Median Win Rate (%)  81.3    QTRAN
Multi-agent Reinforcement Learning    Def_Armored_parallel          Median Win Rate (%)  5       QTRAN
Multi-agent Reinforcement Learning    Def_Infantry_parallel         Median Win Rate (%)  100     QTRAN
Multi-agent Reinforcement Learning    Def_Armored_sequential        Median Win Rate (%)  93.8    QTRAN
Multi-agent Reinforcement Learning    Def_Infantry_sequential       Median Win Rate (%)  100     QTRAN
SMAC                                  Def_Outnumbered_sequential    Median Win Rate (%)  81.3    QTRAN
SMAC                                  Def_Armored_parallel          Median Win Rate (%)  5       QTRAN
SMAC                                  Def_Infantry_parallel         Median Win Rate (%)  100     QTRAN
SMAC                                  Def_Armored_sequential        Median Win Rate (%)  93.8    QTRAN
SMAC                                  Def_Infantry_sequential       Median Win Rate (%)  100     QTRAN

Related Papers

One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms (2025-07-21)
CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning (2025-07-18)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback (2025-07-17)
VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks (2025-07-17)
QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation (2025-07-17)
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)