Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

LCM

Reported on 38 benchmarks across 7 tasks · 3 papers · 4 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
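The matching rule in the note above can be illustrated with a minimal sketch. The record layout and the helper names `group_by_exact_name` and `papers_for` are hypothetical, chosen only for illustration; the values are copied from the listing on this page. It shows how exact-name matching could place results attributed to three different papers under the single name "LCM" while leaving a differently named variant separate.

```python
from collections import defaultdict

# Hypothetical records: (model name as reported, arXiv id the result is
# attributed to on this page, benchmark, metric, value).
# None marks a paper id that this page does not show.
results = [
    ("LCM", "2208.01159", "DAVIS 2017 (val)", "Jaccard", 80.5),
    ("LCM", "2104.04329", "DAVIS (no YouTube-VOS training)", "D17 val (J)", 73.1),
    ("LCM", "2310.04378", "DrawBench", "Human Preference Alignment (HPSv2)", 0.261),
    ("LCM (Curriculum DPO)", None, "DrawBench",
     "Aesthetics (LAION Aesthetics Predictor)", 6.1829),
]


def group_by_exact_name(rows):
    """Group result rows by the exact, case-sensitive model name string."""
    grouped = defaultdict(list)
    for name, paper, benchmark, metric, value in rows:
        grouped[name].append((paper, benchmark, metric, value))
    return dict(grouped)


def papers_for(grouped, name):
    """Return the sorted set of known paper ids that share one model name."""
    return sorted({paper for paper, *_ in grouped.get(name, []) if paper is not None})


grouped = group_by_exact_name(results)
lcm_papers = papers_for(grouped, "LCM")
print(lcm_papers)
```

More than one paper id under the same name is a hint that the entries may describe different model variants, so the source paper of each result is worth checking before comparing numbers.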

Computer Vision · 26 results

  • Video on DAVIS 2017 (test-dev)
    F-measure · 2022-08-01
    81.8
    best: 86.1 (BATMAN)
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video on DAVIS 2017 (test-dev)
    Jaccard · 2022-08-01
    74.4
    best: 78.4 (BATMAN)
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video on DAVIS 2017 (test-dev)
    Mean Jaccard & F-Measure · 2022-08-01
    78.1
    best: 82.2 (BATMAN)
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video on YouTube-VOS 2018
    Jaccard (Seen) · 2022-08-01
    82.2
    best: 86.6 (Cutie+ (base, MEGA))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video on YouTube-VOS 2018
    Mean Jaccard & F-Measure · 2022-08-01
    82
    best: 86.9 (XMem (BL30K, MS))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video on DAVIS 2017 (val)
    F-measure · 2022-08-01
    86.5
    best: 92.6 (XMem (BLK30K, MS))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video on DAVIS 2017 (val)
    Jaccard · 2022-08-01
    80.5
    best: 86.3 (XMem (BLK30K, MS))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video Object Segmentation on DAVIS 2017 (test-dev)
    F-measure · 2022-08-01
    81.8
    best: 86.1 (BATMAN)
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video Object Segmentation on DAVIS 2017 (test-dev)
    Jaccard · 2022-08-01
    74.4
    best: 78.4 (BATMAN)
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video Object Segmentation on DAVIS 2017 (test-dev)
    Mean Jaccard & F-Measure · 2022-08-01
    78.1
    best: 82.2 (BATMAN)
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video Object Segmentation on YouTube-VOS 2018
    Jaccard (Seen) · 2022-08-01
    82.2
    best: 86.6 (Cutie+ (base, MEGA))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video Object Segmentation on YouTube-VOS 2018
    Mean Jaccard & F-Measure · 2022-08-01
    82
    best: 86.9 (XMem (BL30K, MS))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video Object Segmentation on DAVIS 2017 (val)
    F-measure · 2022-08-01
    86.5
    best: 92.6 (XMem (BLK30K, MS))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video Object Segmentation on DAVIS 2017 (val)
    Jaccard · 2022-08-01
    80.5
    best: 86.3 (XMem (BLK30K, MS))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video on DAVIS (no YouTube-VOS training)
    D17 val (F) · 2021-04-09
    77.2
    best: 83.1 (HMMN)
    Learning Position and Target Consistency for Memory-based Video Object Segmentation · arXiv:2104.04329
  • Video on DAVIS (no YouTube-VOS training)
    D17 val (G) · 2021-04-09
    75.2
    best: 80.4 (HMMN)
    Learning Position and Target Consistency for Memory-based Video Object Segmentation · arXiv:2104.04329
  • Video on DAVIS (no YouTube-VOS training)
    D17 val (J) · 2021-04-09
    73.1
    best: 77.7 (HMMN)
    Learning Position and Target Consistency for Memory-based Video Object Segmentation · arXiv:2104.04329
  • Video on DAVIS (no YouTube-VOS training)
    FPS · 2021-04-09
    8.47
    best: 50.1 (TBD)
    Learning Position and Target Consistency for Memory-based Video Object Segmentation · arXiv:2104.04329
  • Video Object Segmentation on DAVIS (no YouTube-VOS training)
    D17 val (F) · 2021-04-09
    77.2
    best: 83.1 (HMMN)
    Learning Position and Target Consistency for Memory-based Video Object Segmentation · arXiv:2104.04329
  • Video Object Segmentation on DAVIS (no YouTube-VOS training)
    D17 val (G) · 2021-04-09
    75.2
    best: 80.4 (HMMN)
    Learning Position and Target Consistency for Memory-based Video Object Segmentation · arXiv:2104.04329
  • Video Object Segmentation on DAVIS (no YouTube-VOS training)
    D17 val (J) · 2021-04-09
    73.1
    best: 77.7 (HMMN)
    Learning Position and Target Consistency for Memory-based Video Object Segmentation · arXiv:2104.04329
  • Video Object Segmentation on DAVIS (no YouTube-VOS training)
    FPS · 2021-04-09
    8.47
    best: 50.1 (TBD)
    Learning Position and Target Consistency for Memory-based Video Object Segmentation · arXiv:2104.04329
  • Semi-Supervised Video Object Segmentation on DAVIS (no YouTube-VOS training)
    D17 val (F) · 2021-04-09
    77.2
    best: 83.1 (HMMN)
    Learning Position and Target Consistency for Memory-based Video Object Segmentation · arXiv:2104.04329
  • Semi-Supervised Video Object Segmentation on DAVIS (no YouTube-VOS training)
    D17 val (G) · 2021-04-09
    75.2
    best: 80.4 (HMMN)
    Learning Position and Target Consistency for Memory-based Video Object Segmentation · arXiv:2104.04329
  • Semi-Supervised Video Object Segmentation on DAVIS (no YouTube-VOS training)
    D17 val (J) · 2021-04-09
    73.1
    best: 77.7 (HMMN)
    Learning Position and Target Consistency for Memory-based Video Object Segmentation · arXiv:2104.04329
  • Semi-Supervised Video Object Segmentation on DAVIS (no YouTube-VOS training)
    FPS · 2021-04-09
    8.47
    best: 50.1 (TBD)
    Learning Position and Target Consistency for Memory-based Video Object Segmentation · arXiv:2104.04329
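The combined video object segmentation score in the entries above is conventionally the arithmetic mean of the Jaccard (J) and F-measure (F) scores. The sketch below checks two entries from this page against that definition. It assumes that the "D17 val (G)" metric denotes the same J and F mean; the function name `j_and_f` is hypothetical.

```python
def j_and_f(jaccard, f_measure):
    """Arithmetic mean of region similarity (J) and contour accuracy (F)."""
    return (jaccard + f_measure) / 2.0


# DAVIS 2017 (test-dev): J = 74.4, F = 81.8; the page lists 78.1 as the mean.
davis_test_dev = j_and_f(74.4, 81.8)

# DAVIS (no YouTube-VOS training): J = 73.1, F = 77.2; the page lists G = 75.2.
# The inputs are already rounded to one decimal, so agreement is only
# expected to within that rounding.
davis_no_ytvos = j_and_f(73.1, 77.2)

print(round(davis_test_dev, 2), round(davis_no_ytvos, 2))
```

Both listed values agree with the mean of the listed J and F scores to within one-decimal rounding.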

Audio · 6 results

  • 10-shot image generation on DrawBench
    Human Preference Alignment (HPSv2) · 2023-10-06
    0.261
    SOTA
    Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference · arXiv:2310.04378
  • 1 Image, 2*2 Stitchi on DrawBench
    Human Preference Alignment (HPSv2) · 2023-10-06
    0.261
    SOTA
    Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference · arXiv:2310.04378
  • 10-shot image generation on DrawBench
    Aesthetics (LAION Aesthetics Predictor) · 2023-10-06
    5.8038
    best: 6.1829 (LCM (Curriculum DPO))
    Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference · arXiv:2310.04378
  • 10-shot image generation on DrawBench
    Text Alignment (SentenceBERT) · 2023-10-06
    0.5602
    best: 0.6234 (Stable Diffusion 1.5 (Curriculum DPO))
    Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference · arXiv:2310.04378
  • 1 Image, 2*2 Stitchi on DrawBench
    Aesthetics (LAION Aesthetics Predictor) · 2023-10-06
    5.8038
    best: 6.1829 (LCM (Curriculum DPO))
    Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference · arXiv:2310.04378
  • 1 Image, 2*2 Stitchi on DrawBench
    Text Alignment (SentenceBERT) · 2023-10-06
    0.5602
    best: 0.6234 (Stable Diffusion 1.5 (Curriculum DPO))
    Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference · arXiv:2310.04378

Medical · 3 results

  • Image Generation on DrawBench
    Human Preference Alignment (HPSv2) · 2023-10-06
    0.261
    SOTA
    Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference · arXiv:2310.04378
  • Image Generation on DrawBench
    Aesthetics (LAION Aesthetics Predictor) · 2023-10-06
    5.8038
    best: 6.1829 (LCM (Curriculum DPO))
    Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference · arXiv:2310.04378
  • Image Generation on DrawBench
    Text Alignment (SentenceBERT) · 2023-10-06
    0.5602
    best: 0.6234 (Stable Diffusion 1.5 (Curriculum DPO))
    Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference · arXiv:2310.04378

Natural Language Processing · 3 results

  • Text-to-Image Generation on DrawBench
    Human Preference Alignment (HPSv2) · 2023-10-06
    0.261
    SOTA
    Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference · arXiv:2310.04378
  • Text-to-Image Generation on DrawBench
    Aesthetics (LAION Aesthetics Predictor) · 2023-10-06
    5.8038
    best: 6.1829 (LCM (Curriculum DPO))
    Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference · arXiv:2310.04378
  • Text-to-Image Generation on DrawBench
    Text Alignment (SentenceBERT) · 2023-10-06
    0.5602
    best: 0.6234 (Stable Diffusion 1.5 (Curriculum DPO))
    Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference · arXiv:2310.04378