Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data

Chengxiang Fan, Muzhi Zhu, Hao Chen, Yang Liu, Weijia Wu, Huaqi Zhang, Chunhua Shen

2024-05-16 | CVPR 2024 | Data Augmentation, Semantic Segmentation, Instance Segmentation, Object Detection

Abstract

Instance segmentation is data-hungry, and as model capacity increases, data scale becomes crucial for improving accuracy. Most instance segmentation datasets today require costly manual annotation, which limits their scale. Models trained on such data are prone to overfitting the training set, especially on rare categories. While recent works have explored generative models as a way to create synthetic datasets for data augmentation, these approaches do not efficiently harness the full potential of generative models. To address these issues, we introduce a more efficient strategy for constructing generative datasets for data augmentation, termed DiverGen. First, we explain the role of generative data from the perspective of distribution discrepancy, investigating the impact of different data on the distribution learned by the model. We argue that generative data can expand the data distribution that the model can learn, thus mitigating overfitting. Additionally, we find that the diversity of generative data is crucial for improving model performance, and we enhance it through various strategies, including category diversity, prompt diversity, and generative model diversity. With these strategies, we can scale the data to millions while maintaining the trend of model performance improvement. On the LVIS dataset, DiverGen significantly outperforms the strong model X-Paste, achieving +1.1 box AP and +1.1 mask AP across all categories, and +1.9 box AP and +2.5 mask AP for rare categories.
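
The three diversity axes named in the abstract (category, prompt, generative model) can be pictured as a cross product of generation jobs. The following is only an illustrative sketch, not the authors' pipeline: the category names, prompt templates, generator labels, and the `build_generation_jobs` helper are all hypothetical placeholders that do not come from the paper.

```python
from itertools import product

# Hypothetical placeholders: none of these values come from the paper.
CATEGORIES = ["teapot", "unicycle", "walrus"]
PROMPT_TEMPLATES = [
    "a photo of a single {category}",
    "a {category} on a plain background",
]
GENERATORS = ["generator_a", "generator_b"]


def build_generation_jobs(categories, templates, generators):
    """Return one job per (category, prompt template, generator) combination."""
    return [
        {"category": c, "prompt": t.format(category=c), "generator": g}
        for c, t, g in product(categories, templates, generators)
    ]


jobs = build_generation_jobs(CATEGORIES, PROMPT_TEMPLATES, GENERATORS)
print(len(jobs), "generation jobs")
```

Widening any one axis multiplies the number of distinct jobs, which is one simple way to see how a generated dataset could grow in variety as well as in size.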

Results

Task | Dataset | Metric | Value | Model
Object Detection | LVIS v1.0 val | box AP | 51.2 | DiverGen (Swin-L)
Object Detection | LVIS v1.0 val | box APr | 50.1 | DiverGen (Swin-L)
3D | LVIS v1.0 val | box AP | 51.2 | DiverGen (Swin-L)
3D | LVIS v1.0 val | box APr | 50.1 | DiverGen (Swin-L)
Instance Segmentation | LVIS v1.0 val | mask AP | 45.5 | DiverGen (Swin-L)
Instance Segmentation | LVIS v1.0 val | mask APr | 45.8 | DiverGen (Swin-L)
2D Classification | LVIS v1.0 val | box AP | 51.2 | DiverGen (Swin-L)
2D Classification | LVIS v1.0 val | box APr | 50.1 | DiverGen (Swin-L)
2D Object Detection | LVIS v1.0 val | box AP | 51.2 | DiverGen (Swin-L)
2D Object Detection | LVIS v1.0 val | box APr | 50.1 | DiverGen (Swin-L)
16k | LVIS v1.0 val | box AP | 51.2 | DiverGen (Swin-L)
16k | LVIS v1.0 val | box APr | 50.1 | DiverGen (Swin-L)
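
To compare rare-category performance with overall performance, the distinct values in the results above can be put side by side. This is a small arithmetic sketch using only the numbers listed on this page (APr is AP on LVIS rare categories); the `RESULTS` structure and the `rare_gap` helper are illustrative names, not part of any published tooling.

```python
# Distinct (task, metric, value) entries copied from the results listed above.
RESULTS = [
    ("Object Detection", "box AP", 51.2),
    ("Object Detection", "box APr", 50.1),
    ("Instance Segmentation", "mask AP", 45.5),
    ("Instance Segmentation", "mask APr", 45.8),
]


def rare_gap(results, kind):
    """Return APr minus AP for the given kind ("box" or "mask"), to 1 decimal."""
    scores = {metric: value for _, metric, value in results}
    return round(scores[f"{kind} APr"] - scores[f"{kind} AP"], 1)


box_gap = rare_gap(RESULTS, "box")
mask_gap = rare_gap(RESULTS, "mask")
print("box APr - box AP:", box_gap)
print("mask APr - mask AP:", mask_gap)
```

A negative gap means rare categories score below the all-category average and a positive gap means they score above it. For the values listed here, the box gap is negative and the mask gap is slightly positive.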

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management (2025-07-17)
Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)