Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

PPO

Reported on 24 benchmarks across 6 tasks · 4 papers · 3 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
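As a rough illustration of what exact-name matching implies, the sketch below groups a few of the results listed on this page by their model name string, with no normalization. The record layout and the function name `group_by_exact_name` are hypothetical and are not taken from the site's code; only the names and values come from the listing.

```python
from collections import defaultdict

# Hypothetical record layout: (model name, benchmark, return).
# The values are copied from the PyBullet results listed on this page.
RESULTS = [
    ("PPO", "PyBullet HalfCheetah", 2254),
    ("SAC", "PyBullet HalfCheetah", 2883),
    ("PPO", "PyBullet Ant", 2160),
    ("SAC gSDE", "PyBullet Ant", 3459),
]


def group_by_exact_name(results):
    """Group results by the model name string exactly as reported."""
    grouped = defaultdict(list)
    for name, benchmark, value in results:
        # No lowercasing or suffix stripping: "SAC" and "SAC gSDE" stay separate.
        grouped[name].append((benchmark, value))
    return dict(grouped)


grouped = group_by_exact_name(RESULTS)
print(sorted(grouped))
print(len(grouped["PPO"]))
```

Under this assumption, every entry reported as "PPO" lands in one group even if the underlying papers used different variants, while "SAC" and "SAC gSDE" are treated as different models.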

Methodology · 11 results

  • General Reinforcement Learning on Obstacle Tower (Weak Gen) fixed
    Score · 2019-02-04
    1.2
    SOTA
    Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning · arXiv:1902.01378
  • General Reinforcement Learning on Obstacle Tower (Strong Gen) fixed
    Score · 2019-02-04
    0.6
    SOTA
    Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning · arXiv:1902.01378
  • Reinforcement Learning on ProcGen
    Mean Normalized Performance · uses extra data · 2020-09-09
    0.576
    best: 0.757 (PPG)
    Phasic Policy Gradient · arXiv:2009.04416
  • 3D on PyBullet HalfCheetah
    Return · 2020-05-12
    2254
    best: 2883 (SAC)
    Smooth Exploration for Robotic Reinforcement Learning · arXiv:2005.05719
  • 3D on PyBullet Ant
    Return · 2020-05-12
    2160
    best: 3459 (SAC gSDE)
    Smooth Exploration for Robotic Reinforcement Learning · arXiv:2005.05719
  • 3D on PyBullet Walker2D
    Return · 2020-05-12
    1238
    best: 2341 (SAC gSDE)
    Smooth Exploration for Robotic Reinforcement Learning · arXiv:2005.05719
  • 3D on PyBullet Hopper
    Return · 2020-05-12
    1622
    best: 2646 (SAC gSDE)
    Smooth Exploration for Robotic Reinforcement Learning · arXiv:2005.05719
  • General Reinforcement Learning on Obstacle Tower (No Gen) varied
    Score · 2019-02-04
    1
    best: 4.8 (RNB)
    Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning · arXiv:1902.01378
  • General Reinforcement Learning on Obstacle Tower (No Gen) fixed
    Score · 2019-02-04
    5
    best: 7 (RNB)
    Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning · arXiv:1902.01378
  • General Reinforcement Learning on Obstacle Tower (Weak Gen) varied
    Score · 2019-02-04
    0.8
    best: 3.4 (RNB)
    Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning · arXiv:1902.01378
  • General Reinforcement Learning on Obstacle Tower (Strong Gen) varied
    Score · 2019-02-04
    0.6
    best: 0.8 (RNB)
    Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning · arXiv:1902.01378

Playing Games · 5 results

  • OpenAI Gym on Humanoid-v4
    Average Return · 2017-07-20
    925.89
    best: 6923.22 (MEow)
    SOTA
    Proximal Policy Optimization Algorithms · arXiv:1707.06347
  • OpenAI Gym on HalfCheetah-v4
    Average Return · 2017-07-20
    6006.11
    best: 15836.04 (SAC)
    Proximal Policy Optimization Algorithms · arXiv:1707.06347
  • OpenAI Gym on Ant-v4
    Average Return · 2017-07-20
    608.97
    best: 6586.33 (MEow)
    Proximal Policy Optimization Algorithms · arXiv:1707.06347
  • OpenAI Gym on Walker2d-v4
    Average Return · 2017-07-20
    2739.81
    best: 5745.27 (SAC)
    Proximal Policy Optimization Algorithms · arXiv:1707.06347
  • OpenAI Gym on Hopper-v4
    Average Return · 2017-07-20
    790.77
    best: 3332.99 (MEow)
    Proximal Policy Optimization Algorithms · arXiv:1707.06347

Robots · 4 results

  • Continuous Control on PyBullet HalfCheetah
    Return · 2020-05-12
    2254
    best: 2883 (SAC)
    Smooth Exploration for Robotic Reinforcement Learning · arXiv:2005.05719
  • Continuous Control on PyBullet Ant
    Return · 2020-05-12
    2160
    best: 3459 (SAC gSDE)
    Smooth Exploration for Robotic Reinforcement Learning · arXiv:2005.05719
  • Continuous Control on PyBullet Walker2D
    Return · 2020-05-12
    1238
    best: 2341 (SAC gSDE)
    Smooth Exploration for Robotic Reinforcement Learning · arXiv:2005.05719
  • Continuous Control on PyBullet Hopper
    Return · 2020-05-12
    1622
    best: 2646 (SAC gSDE)
    Smooth Exploration for Robotic Reinforcement Learning · arXiv:2005.05719

Medical · 4 results

  • 3D Face Modelling on PyBullet HalfCheetah
    Return · 2020-05-12
    2254
    best: 2883 (SAC)
    Smooth Exploration for Robotic Reinforcement Learning · arXiv:2005.05719
  • 3D Face Modelling on PyBullet Ant
    Return · 2020-05-12
    2160
    best: 3459 (SAC gSDE)
    Smooth Exploration for Robotic Reinforcement Learning · arXiv:2005.05719
  • 3D Face Modelling on PyBullet Walker2D
    Return · 2020-05-12
    1238
    best: 2341 (SAC gSDE)
    Smooth Exploration for Robotic Reinforcement Learning · arXiv:2005.05719
  • 3D Face Modelling on PyBullet Hopper
    Return · 2020-05-12
    1622
    best: 2646 (SAC gSDE)
    Smooth Exploration for Robotic Reinforcement Learning · arXiv:2005.05719