Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

YOLOv11: An Overview of the Key Architectural Enhancements

Rahima Khanam, Muhammad Hussain

2024-10-23
Tasks: Real-Time Object Detection, Semantic Segmentation, Pose Estimation, Instance Segmentation, Oriented Object Detection, Object Detection

Abstract

This study presents an architectural analysis of YOLOv11, the latest iteration in the YOLO (You Only Look Once) series of object detection models. We examine the model's architectural innovations, including the introduction of the C3k2 (Cross Stage Partial with kernel size 2) block, SPPF (Spatial Pyramid Pooling - Fast), and C2PSA (Convolutional block with Parallel Spatial Attention) components, which improve the model's performance in several ways, such as enhanced feature extraction. The paper explores YOLOv11's expanded capabilities across various computer vision tasks, including object detection, instance segmentation, pose estimation, and oriented object detection (OBB). We review the model's performance improvements in terms of mean Average Precision (mAP) and computational efficiency compared to its predecessors, with a focus on the trade-off between parameter count and accuracy. Additionally, the study discusses YOLOv11's versatility across different model sizes, from nano to extra-large, catering to diverse application needs from edge devices to high-performance computing environments. Our research provides insights into YOLOv11's position within the broader landscape of object detection and its potential impact on real-time computer vision applications.

Results

Task                 Dataset                            Metric  Value  Model
Object Detection     COCO (Common Objects in Context)   box AP  54.7   YOLOv11x
Object Detection     COCO (Common Objects in Context)   box AP  53.4   YOLOv11l
Object Detection     COCO (Common Objects in Context)   box AP  51.5   YOLOv11m
Object Detection     COCO (Common Objects in Context)   box AP  47.0   YOLOv11s
Object Detection     COCO (Common Objects in Context)   box AP  39.5   YOLOv11n
3D                   COCO (Common Objects in Context)   box AP  54.7   YOLOv11x
3D                   COCO (Common Objects in Context)   box AP  53.4   YOLOv11l
3D                   COCO (Common Objects in Context)   box AP  51.5   YOLOv11m
3D                   COCO (Common Objects in Context)   box AP  47.0   YOLOv11s
3D                   COCO (Common Objects in Context)   box AP  39.5   YOLOv11n
2D Classification    COCO (Common Objects in Context)   box AP  54.7   YOLOv11x
2D Classification    COCO (Common Objects in Context)   box AP  53.4   YOLOv11l
2D Classification    COCO (Common Objects in Context)   box AP  51.5   YOLOv11m
2D Classification    COCO (Common Objects in Context)   box AP  47.0   YOLOv11s
2D Classification    COCO (Common Objects in Context)   box AP  39.5   YOLOv11n
2D Object Detection  COCO (Common Objects in Context)   box AP  54.7   YOLOv11x
2D Object Detection  COCO (Common Objects in Context)   box AP  53.4   YOLOv11l
2D Object Detection  COCO (Common Objects in Context)   box AP  51.5   YOLOv11m
2D Object Detection  COCO (Common Objects in Context)   box AP  47.0   YOLOv11s
2D Object Detection  COCO (Common Objects in Context)   box AP  39.5   YOLOv11n
16k                  COCO (Common Objects in Context)   box AP  54.7   YOLOv11x
16k                  COCO (Common Objects in Context)   box AP  53.4   YOLOv11l
16k                  COCO (Common Objects in Context)   box AP  51.5   YOLOv11m
16k                  COCO (Common Objects in Context)   box AP  47.0   YOLOv11s
16k                  COCO (Common Objects in Context)   box AP  39.5   YOLOv11n
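
The abstract highlights the accuracy trade-off across model sizes. As a rough illustration, the sketch below computes the box AP gain from each model size to the next, using only the COCO values listed in the results above. Parameter counts are not given on this page, so they are left out. The names `BOX_AP` and `ap_gains` are hypothetical and chosen for this example only.

```python
# Box AP on COCO, copied from the results listed above (nano to extra-large).
BOX_AP = {
    "YOLOv11n": 39.5,
    "YOLOv11s": 47.0,
    "YOLOv11m": 51.5,
    "YOLOv11l": 53.4,
    "YOLOv11x": 54.7,
}


def ap_gains(scores):
    """Return (smaller, larger, gain) for each consecutive pair, in dict order."""
    names = list(scores)
    return [
        (a, b, round(scores[b] - scores[a], 1))
        for a, b in zip(names, names[1:])
    ]


if __name__ == "__main__":
    for smaller, larger, gain in ap_gains(BOX_AP):
        print(f"{smaller} -> {larger}: +{gain:.1f} box AP")
```

On these numbers, the gain shrinks at every step up in size (7.5, 4.5, 1.9, then 1.3 box AP), which is the diminishing-returns pattern a size-versus-accuracy comparison would be expected to surface.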

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)