Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Value-Decomposition Networks For Cooperative Multi-Agent Learning

Peter Sunehag, Guy Lever, Audrunas Gruslys, Wojciech Marian Czarnecki, Vinicius Zambaldi, Max Jaderberg, Marc Lanctot, Nicolas Sonnerat, Joel Z. Leibo, Karl Tuyls, Thore Graepel

2017-06-16 · Reinforcement Learning · SMAC+ · Multi-agent Reinforcement Learning · reinforcement-learning
Paper · PDF · Code

Abstract

We study the problem of cooperative multi-agent reinforcement learning with a single joint reward signal. This class of learning problems is difficult because of the often large combined action and observation spaces. In the fully centralized and decentralized approaches, we find the problem of spurious rewards and a phenomenon we call the "lazy agent" problem, which arises due to partial observability. We address these problems by training individual agents with a novel value decomposition network architecture, which learns to decompose the team value function into agent-wise value functions. We perform an experimental evaluation across a range of partially-observable multi-agent domains and show that learning such value-decompositions leads to superior results, in particular when combined with weight sharing, role information and information channels.
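The decomposition described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's network code: the joint action-value is taken to be the sum of per-agent action-values, Q_tot(s, a) = Σᵢ Qᵢ(oᵢ, aᵢ), and the function and variable names below are hypothetical.

```python
import numpy as np

def vdn_joint_q(per_agent_q):
    """VDN's additive decomposition: sum the per-agent Q-values
    for the chosen joint action to get Q_tot."""
    return float(np.sum(per_agent_q))

def greedy_joint_action(per_agent_q_tables):
    """Because the sum is monotone in each Q_i, every agent acting
    greedily on its own Q-values also maximizes Q_tot, so action
    selection can stay fully decentralized."""
    return [int(np.argmax(q)) for q in per_agent_q_tables]

# Hypothetical toy values: 2 agents, 3 discrete actions each.
q_tables = [np.array([0.1, 1.2, -0.3]), np.array([2.0, 0.5, 0.0])]
actions = greedy_joint_action(q_tables)  # each agent argmaxes locally
chosen_q = [q[a] for q, a in zip(q_tables, actions)]
q_tot = vdn_joint_q(chosen_q)
```

The additive form is what lets training use only the single joint reward (through Q_tot) while execution remains per-agent, which is the paper's answer to the "lazy agent" problem of fully centralized critics.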

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Multi-agent Reinforcement Learning | Off_Hard_parallel | Median Win Rate | 15 | VDN |
| Multi-agent Reinforcement Learning | Def_Outnumbered_sequential | Median Win Rate | 15.6 | VDN |
| Multi-agent Reinforcement Learning | Off_Complicated_parallel | Median Win Rate | 70 | VDN |
| Multi-agent Reinforcement Learning | Off_Near_parallel | Median Win Rate | 90 | VDN |
| Multi-agent Reinforcement Learning | Def_Armored_parallel | Median Win Rate | 5 | VDN |
| Multi-agent Reinforcement Learning | Off_Distant_parallel | Median Win Rate | 85 | VDN |
| Multi-agent Reinforcement Learning | Def_Infantry_parallel | Median Win Rate | 95 | VDN |
| Multi-agent Reinforcement Learning | Def_Armored_sequential | Median Win Rate | 96.9 | VDN |
| Multi-agent Reinforcement Learning | Def_Infantry_sequential | Median Win Rate | 96.9 | VDN |
| SMAC | Off_Hard_parallel | Median Win Rate | 15 | VDN |
| SMAC | Def_Outnumbered_sequential | Median Win Rate | 15.6 | VDN |
| SMAC | Off_Complicated_parallel | Median Win Rate | 70 | VDN |
| SMAC | Off_Near_parallel | Median Win Rate | 90 | VDN |
| SMAC | Def_Armored_parallel | Median Win Rate | 5 | VDN |
| SMAC | Off_Distant_parallel | Median Win Rate | 85 | VDN |
| SMAC | Def_Infantry_parallel | Median Win Rate | 95 | VDN |
| SMAC | Def_Armored_sequential | Median Win Rate | 96.9 | VDN |
| SMAC | Def_Infantry_sequential | Median Win Rate | 96.9 | VDN |
