Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Self-Imitation Learning

Junhyuk Oh, Yijie Guo, Satinder Singh, Honglak Lee

2018-06-14 | ICML 2018 7 | MuJoCo | Imitation Learning | Atari Games

Abstract

This paper proposes Self-Imitation Learning (SIL), a simple off-policy actor-critic algorithm that learns to reproduce the agent's past good decisions. This algorithm is designed to verify our hypothesis that exploiting past good experiences can indirectly drive deep exploration. Our empirical results show that SIL significantly improves advantage actor-critic (A2C) on several hard exploration Atari games and is competitive to the state-of-the-art count-based exploration methods. We also show that SIL improves proximal policy optimization (PPO) on MuJoCo tasks.
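As a rough illustration of what "reproducing past good decisions" means, the sketch below computes a self-imitation loss over transitions sampled from a replay buffer: a stored action is imitated only when its observed return exceeds the current value estimate. It follows the clipped-advantage form (R - V(s))_+ used in the SIL paper, but the function names, the `value_coef` default, and the plain-Python setting are assumptions made for illustration; this is not the official implementation, and it omits prioritized replay and gradient computation.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Monte Carlo return R_t = sum_k gamma^k * r_{t+k} for one finished episode."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def sil_loss(log_probs, values, returns, value_coef=0.01):
    """Self-imitation loss averaged over a sampled batch (hypothetical helper).

    log_probs: log pi(a|s) of the stored actions under the current policy
    values:    current value estimates V(s) for the stored states
    returns:   stored discounted returns R
    value_coef: weight on the value term (illustrative default)
    """
    policy_terms, value_terms = [], []
    for logp, v, ret in zip(log_probs, values, returns):
        # (R - V(s))_+ : only outcomes better than currently expected contribute.
        gap = max(ret - v, 0.0)
        # In a real implementation the gap is treated as a constant here.
        policy_terms.append(-logp * gap)
        value_terms.append(0.5 * gap ** 2)
    n = len(policy_terms)
    return sum(policy_terms) / n + value_coef * sum(value_terms) / n


# Two stored transitions: the first beat the value estimate, the second did not.
example = sil_loss(
    log_probs=[math.log(0.5), math.log(0.5)],
    values=[0.0, 2.0],
    returns=[1.0, 1.0],
    value_coef=0.5,
)
print(example)
```

In the example, the second transition has a return below its value estimate, so it contributes nothing to either term; only the first transition is imitated.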

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Atari Games | Atari 2600 Boxing | Score | 99.6 | A2C + SIL |
| Atari Games | Atari 2600 Double Dunk | Score | 21.5 | A2C + SIL |
| Atari Games | Atari 2600 Ms. Pacman | Score | 4025.1 | A2C + SIL |
| Atari Games | Atari 2600 Centipede | Score | 7559.5 | A2C + SIL |
| Atari Games | Atari 2600 Tutankham | Score | 340.5 | A2C + SIL |
| Atari Games | Atari 2600 Freeway | Score | 32.2 | A2C + SIL |
| Atari Games | Atari 2600 Pong | Score | 20.9 | A2C + SIL |
| Atari Games | Atari 2600 Enduro | Score | 1205.1 | A2C + SIL |
| Atari Games | Atari 2600 Krull | Score | 10614.6 | A2C + SIL |
| Atari Games | Atari 2600 Breakout | Score | 452 | A2C + SIL |
| Atari Games | Atari 2600 Frostbite | Score | 6289.8 | A2C + SIL |
| Atari Games | Atari 2600 Montezuma's Revenge | Score | 1100 | A2C + SIL |
| Atari Games | Atari 2600 Gopher | Score | 23304.2 | A2C + SIL |
| Atari Games | Atari 2600 Space Invaders | Score | 2951.7 | A2C + SIL |
| Atari Games | Atari 2600 James Bond | Score | 310.8 | A2C + SIL |
| Atari Games | Atari 2600 Amidar | Score | 1362 | A2C + SIL |
| Atari Games | Atari 2600 Tennis | Score | -17.3 | A2C + SIL |
| Atari Games | Atari 2600 Crazy Climber | Score | 130185.8 | A2C + SIL |
| Atari Games | Atari 2600 Asteroids | Score | 2259.4 | A2C + SIL |
| Atari Games | Atari 2600 Gravitar | Score | 1874.2 | A2C + SIL |
| Atari Games | Atari 2600 Time Pilot | Score | 10811.7 | A2C + SIL |
| Atari Games | Atari 2600 Demon Attack | Score | 10140.5 | A2C + SIL |
| Atari Games | Atari 2600 Battle Zone | Score | 25075 | A2C + SIL |
| Atari Games | Atari 2600 Beam Rider | Score | 2366.2 | A2C + SIL |
| Atari Games | Atari 2600 Asterix | Score | 17984.2 | A2C + SIL |
| Atari Games | Atari 2600 Kung-Fu Master | Score | 34449.2 | A2C + SIL |
| Atari Games | Atari 2600 Bowling | Score | 31.1 | A2C + SIL |
| Atari Games | Atari 2600 Kangaroo | Score | 2888.3 | A2C + SIL |
| Atari Games | Atari 2600 Assault | Score | 1812 | A2C + SIL |
| Atari Games | Atari 2600 Alien | Score | 2242.2 | A2C + SIL |
| Atari Games | Atari 2600 Fishing Derby | Score | 55.8 | A2C + SIL |
| Atari Games | Atari 2600 Seaquest | Score | 2456.5 | A2C + SIL |
| Atari Games | Atari 2600 Chopper Command | Score | 6710 | A2C + SIL |
| Atari Games | Atari 2600 Video Pinball | Score | 461522.4 | A2C + SIL |
| Atari Games | Atari 2600 Wizard of Wor | Score | 7088.3 | A2C + SIL |
| Atari Games | Atari 2600 Zaxxon | Score | 9164.2 | A2C + SIL |
| Atari Games | Atari 2600 Robotank | Score | 10.5 | A2C + SIL |
| Atari Games | Atari 2600 Name This Game | Score | 14958.2 | A2C + SIL |
| Atari Games | Atari 2600 Star Gunner | Score | 31309.2 | A2C + SIL |
| Atari Games | Atari 2600 Ice Hockey | Score | -2.4 | A2C + SIL |
| Atari Games | Atari 2600 Atlantis | Score | 3084781.7 | A2C + SIL |
| Atari Games | Atari 2600 HERO | Score | 33156.7 | A2C + SIL |
| Atari Games | Atari 2600 Bank Heist | Score | 1137.8 | A2C + SIL |
| Atari Games | Atari 2600 Private Eye | Score | 661.2 | A2C + SIL |
| Atari Games | Atari 2600 Q*Bert | Score | 104975.6 | A2C + SIL |
| Atari Games | Atari 2600 River Raid | Score | 14306.1 | A2C + SIL |
| Atari Games | Atari 2600 Road Runner | Score | 57071.7 | A2C + SIL |
| Atari Games | Atari 2600 Up and Down | Score | 53314.6 | A2C + SIL |

The same 48 results are also listed under the Video Games task, with identical datasets, scores, and model (A2C + SIL) to the Atari Games rows above.

Related Papers

Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback (2025-07-17)
The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner (2025-07-17)
Supervised Fine Tuning on Curated Data is Reinforcement Learning (and can be improved) (2025-07-17)
Turning Sand to Gold: Recycling Data to Bridge On-Policy and Off-Policy Learning via Causal Bound (2025-07-15)
Deep Reinforcement Learning with Gradient Eligibility Traces (2025-07-12)
Safe Domain Randomization via Uncertainty-Aware Out-of-Distribution Detection and Policy Adaptation (2025-07-08)
Detecting and Mitigating Reward Hacking in Reinforcement Learning Systems: A Comprehensive Empirical Study (2025-07-08)
EC-Flow: Enabling Versatile Robotic Manipulation from Action-Unlabeled Videos via Embodiment-Centric Flow (2025-07-08)