Olaf Ronneberger, Philipp Fischer, Thomas Brox
It is widely agreed that successful training of deep networks requires many thousands of annotated training samples. In this paper, we present a network and training strategy that relies heavily on data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopy stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC), we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast: segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net.
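
To make the contracting and expanding paths concrete, the sketch below traces how the spatial size of a feature map changes through such a network. The abstract gives no layer details, so the configuration is an assumption taken from the full paper: two unpadded 3x3 convolutions per level, 2x2 max pooling on the way down, 2x up-sampling on the way up, and four resolution steps. The function name `unet_output_size` and its parameters are illustrative and are not part of the released Caffe code.

```python
def unet_output_size(input_size, depth=4, convs_per_level=2, kernel=3):
    """Trace the spatial size of one image side through a U-Net-style network.

    Assumes unpadded convolutions, 2x2 pooling, and 2x up-sampling
    (an assumed configuration, not read from the released model files).
    """
    # Each unpadded convolution trims (kernel - 1) pixels from the side length.
    shrink = convs_per_level * (kernel - 1)
    size = input_size

    # Contracting path: convolutions, then 2x2 max pooling, at every level.
    for _ in range(depth):
        size -= shrink
        if size <= 0 or size % 2 != 0:
            raise ValueError("side length is not compatible with 2x2 pooling")
        size //= 2

    # Convolutions at the lowest resolution.
    size -= shrink

    # Expanding path: 2x up-sampling, then convolutions, at every level.
    for _ in range(depth):
        size = size * 2 - shrink

    if size <= 0:
        raise ValueError("input is too small for this depth")
    return size


if __name__ == "__main__":
    print(unet_output_size(572))
```

Under these assumptions, the output map is smaller than the input because every unpadded convolution trims the border. This is why only some input sizes are valid and why the function rejects a side length that becomes odd before a pooling step.
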
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Medical Image Segmentation | Kvasir-SEG | S-Measure | 0.858 | U-Net |
| Medical Image Segmentation | Kvasir-SEG | max E-Measure | 0.893 | U-Net |
| Medical Image Segmentation | Kvasir-SEG | mean Dice | 0.818 | U-Net |
| Medical Image Segmentation | Kvasir-Instrument | DSC | 0.9158 | UNet |
| Medical Image Segmentation | ISBI 2012 EM Segmentation | Warping Error | 0.000353 | U-Net |
| Medical Image Segmentation | CVC-ClinicDB | mean Dice | 0.823 | U-Net |
| Medical Image Segmentation | RITE | Dice | 55.24 | U-Net |
| Medical Image Segmentation | RITE | Jaccard Index | 31.11 | U-Net |
| Medical Image Segmentation | Anatomical Tracings of Lesions After Stroke (ATLAS) | Dice | 0.4606 | U-Net |
| Medical Image Segmentation | Anatomical Tracings of Lesions After Stroke (ATLAS) | IoU | 0.3447 | U-Net |
| Medical Image Segmentation | Anatomical Tracings of Lesions After Stroke (ATLAS) | Precision | 0.5994 | U-Net |
| Medical Image Segmentation | Anatomical Tracings of Lesions After Stroke (ATLAS) | Recall | 0.4449 | U-Net |
| Medical Image Segmentation | STARE | AUC | 0.7756 | U-Net |
| Medical Image Segmentation | Brain MRI segmentation | Dice Score | 0.82 | U-Net |
| Medical Image Segmentation | ROSE-2 | Dice Score | 65.64 | U-Net |
| Medical Image Segmentation | CHASE_DB1 | AUC | 0.9772 | U-Net |
| Medical Image Segmentation | ROSE-1 SVC-DVC | Dice Score | 70.12 | U-Net |
| Medical Image Segmentation | ROSE-1 SVC | Dice Score | 71.16 | U-Net |
| Medical Image Segmentation | STARE | AUC | 0.7783 | U-Net |
| Medical Image Segmentation | STARE | F1 score | 0.8373 | U-Net |
| Medical Image Segmentation | ROSE-1 DVC | Dice Score | 66.05 | U-Net |
| Medical Image Segmentation | DRIVE | AUC | 0.9755 | U-Net |
| Medical Image Segmentation | DRIVE | F1 score | 0.8142 | U-Net |
| Medical Image Segmentation | CT-150 | Dice Score | 0.814 | U-Net |
| Medical Image Segmentation | CT-150 | Precision | 0.848 | U-Net |
| Medical Image Segmentation | CT-150 | Recall | 0.806 | U-Net |
| Medical Image Segmentation | TCIA Pancreas-CT Dataset | Dice Score | 0.82 | U-Net |
| Medical Image Segmentation | SUN-SEG-Easy (Unseen) | Sensitivity | 0.42 | UNet |
| Medical Image Segmentation | SUN-SEG-Hard | Dice | 0.542 | UNet |
| Medical Image Segmentation | SUN-SEG-Hard | S-Measure | 0.67 | UNet |
| Medical Image Segmentation | SUN-SEG-Hard (Unseen) | Sensitivity | 0.429 | UNet |
| Medical Image Segmentation | SUN-SEG-Easy | S-Measure | 0.669 | UNet |
| Medical Image Segmentation | SUN-SEG-Easy | mean E-measure | 0.677 | UNet |
| Medical Image Segmentation | STARE | AUC | 0.459 | UNet |
| Medical Image Segmentation | LUNA | AUC | 0.9784 | U-Net |
| Medical Image Segmentation | LUNA | F1 score | 0.9658 | U-Net |
| Medical Image Segmentation | Kaggle Skin Lesion Segmentation | AUC | 0.9371 | U-Net |
| Medical Image Segmentation | Kaggle Skin Lesion Segmentation | F1 score | 0.8682 | U-Net |
| Medical Image Segmentation | SNEMI3D | AUC | 0.8676 | U-Net |
| Semantic Segmentation | Kvasir-Instrument | mIoU | 0.8578 | UNet |
| Semantic Segmentation | Fine-Grained Grass Segmentation Dataset | mIoU | 48.17 | UNet |
| Semantic Segmentation | SELMA | mIoU | 36.2 | UNet |
| Semantic Segmentation | Event-based Segmentation Dataset | mIoU | 64.7 | U-Net |
| Semantic Segmentation | UrbanLF | mIoU (Real) | 78.6 | OCR (HRNetV2-W48) |
| Semantic Segmentation | UrbanLF | mIoU (Syn) | 79.36 | OCR (HRNetV2-W48) |
| Semantic Segmentation | SkyScapes-Dense | Mean IoU | 14.15 | U-Net |
| Semantic Segmentation | STARE | AUC | 0.9158 | UNet |
| Semantic Segmentation | BJRoad | IoU | 54.88 | UNet |
| Semantic Segmentation | Trans10K | GFLOPs | 124.55 | U-Net |
| Semantic Segmentation | PST900 | mIoU | 52.8 | UNet |
| Semantic Segmentation | MFN Dataset | mIoU | 45.1 | UNet |
| Semantic Segmentation | CrackVision12K | mIoU | 0.60333 | UNet |
| Object Detection | DIS-TE4 | E-measure | 0.821 | UNet |
| Object Detection | DIS-TE4 | HCE | 3218 | UNet |
| Object Detection | DIS-TE4 | MAE | 0.102 | UNet |
| Object Detection | DIS-TE4 | max F-Measure | 0.759 | UNet |
| Object Detection | DIS-TE4 | weighted F-measure | 0.659 | UNet |
| Object Detection | DIS-VD | E-measure | 0.785 | UNet |
| Object Detection | DIS-VD | HCE | 1337 | UNet |
| Object Detection | DIS-VD | MAE | 0.113 | UNet |
| Object Detection | DIS-VD | S-Measure | 0.745 | UNet |
| Object Detection | DIS-VD | max F-Measure | 0.692 | UNet |
| Object Detection | DIS-VD | weighted F-measure | 0.586 | UNet |
| Object Detection | DIS-TE2 | HCE | 474 | UNet |
| Object Detection | DIS-TE2 | MAE | 0.107 | UNet |
| Object Detection | DIS-TE2 | S-Measure | 0.755 | UNet |
| Object Detection | DIS-TE2 | max F-Measure | 0.703 | UNet |
| Object Detection | DIS-TE2 | weighted F-measure | 0.597 | UNet |
| Object Detection | DIS-TE1 | E-measure | 0.75 | UNet |
| Object Detection | DIS-TE1 | HCE | 233 | UNet |
| Object Detection | DIS-TE1 | MAE | 0.106 | UNet |
| Object Detection | DIS-TE1 | S-Measure | 0.716 | UNet |
| Object Detection | DIS-TE1 | max F-Measure | 0.625 | UNet |
| Object Detection | DIS-TE1 | weighted F-measure | 0.514 | UNet |
| Object Detection | DIS-TE3 | HCE | 883 | UNet |
| Object Detection | DIS-TE3 | MAE | 0.098 | UNet |
| Object Detection | DIS-TE3 | max F-Measure | 0.748 | UNet |
| Object Detection | DIS-TE3 | weighted F-measure | 0.644 | UNet |
| Object Detection | STARE | AUC | 0.78 | UNet |
| RGB Salient Object Detection | DIS-TE4 | E-measure | 0.821 | UNet |
| RGB Salient Object Detection | DIS-TE4 | HCE | 3218 | UNet |
| RGB Salient Object Detection | DIS-TE4 | MAE | 0.102 | UNet |
| RGB Salient Object Detection | DIS-TE4 | max F-Measure | 0.759 | UNet |
| RGB Salient Object Detection | DIS-TE4 | weighted F-measure | 0.659 | UNet |
| RGB Salient Object Detection | DIS-VD | E-measure | 0.785 | UNet |
| RGB Salient Object Detection | DIS-VD | HCE | 1337 | UNet |
| RGB Salient Object Detection | DIS-VD | MAE | 0.113 | UNet |
| RGB Salient Object Detection | DIS-VD | S-Measure | 0.745 | UNet |
| RGB Salient Object Detection | DIS-VD | max F-Measure | 0.692 | UNet |
| RGB Salient Object Detection | DIS-VD | weighted F-measure | 0.586 | UNet |
| RGB Salient Object Detection | DIS-TE2 | HCE | 474 | UNet |
| RGB Salient Object Detection | DIS-TE2 | MAE | 0.107 | UNet |
| RGB Salient Object Detection | DIS-TE2 | S-Measure | 0.755 | UNet |
| RGB Salient Object Detection | DIS-TE2 | max F-Measure | 0.703 | UNet |
| RGB Salient Object Detection | DIS-TE2 | weighted F-measure | 0.597 | UNet |
| RGB Salient Object Detection | DIS-TE1 | E-measure | 0.75 | UNet |
| RGB Salient Object Detection | DIS-TE1 | HCE | 233 | UNet |
| RGB Salient Object Detection | DIS-TE1 | MAE | 0.106 | UNet |
| RGB Salient Object Detection | DIS-TE1 | S-Measure | 0.716 | UNet |
| RGB Salient Object Detection | DIS-TE1 | max F-Measure | 0.625 | UNet |
| RGB Salient Object Detection | DIS-TE1 | weighted F-measure | 0.514 | UNet |
| RGB Salient Object Detection | DIS-TE3 | HCE | 883 | UNet |
| RGB Salient Object Detection | DIS-TE3 | MAE | 0.098 | UNet |
| RGB Salient Object Detection | DIS-TE3 | max F-Measure | 0.748 | UNet |
| RGB Salient Object Detection | DIS-TE3 | weighted F-measure | 0.644 | UNet |
| RGB Salient Object Detection | STARE | AUC | 0.78 | UNet |
| Colorectal Gland Segmentation | CRAG | Dice | 0.844 | U-Net (e) |
| Colorectal Gland Segmentation | CRAG | Hausdorff Distance (mm) | 196.9 | U-Net (e) |
| Colorectal Gland Segmentation | CRAG | Hausdorff Distance (mm) | 199.5 | FCN8 (e) |
| Colorectal Gland Segmentation | STARE | AUC | 0.835 | U-Net |
| Colorectal Gland Segmentation | STARE | AUC | 0.827 | U-Net (e) |
| Colorectal Gland Segmentation | STARE | AUC | 0.796 | FCN8 (e) |
| Multi-tissue Nucleus Segmentation | Kumar | Dice | 0.758 | U-Net (e) |
| Multi-tissue Nucleus Segmentation | Kumar | Hausdorff Distance (mm) | 47.8 | U-Net (e) |
| 3D Medical Imaging Segmentation | CT-150 | Dice Score | 0.814 | U-Net |
| 3D Medical Imaging Segmentation | CT-150 | Precision | 0.848 | U-Net |
| 3D Medical Imaging Segmentation | CT-150 | Recall | 0.806 | U-Net |
| 3D Medical Imaging Segmentation | TCIA Pancreas-CT Dataset | Dice Score | 0.82 | U-Net |
| Scene Segmentation | PST900 | mIoU | 52.8 | UNet |
| Scene Segmentation | MFN Dataset | mIoU | 45.1 | UNet |
| 2D Object Detection | DIS-TE4 | E-measure | 0.821 | UNet |
| 2D Object Detection | DIS-TE4 | HCE | 3218 | UNet |
| 2D Object Detection | DIS-TE4 | MAE | 0.102 | UNet |
| 2D Object Detection | DIS-TE4 | max F-Measure | 0.759 | UNet |
| 2D Object Detection | DIS-TE4 | weighted F-measure | 0.659 | UNet |
| 2D Object Detection | DIS-VD | E-measure | 0.785 | UNet |
| 2D Object Detection | DIS-VD | HCE | 1337 | UNet |
| 2D Object Detection | DIS-VD | MAE | 0.113 | UNet |
| 2D Object Detection | DIS-VD | S-Measure | 0.745 | UNet |
| 2D Object Detection | DIS-VD | max F-Measure | 0.692 | UNet |
| 2D Object Detection | DIS-VD | weighted F-measure | 0.586 | UNet |
| 2D Object Detection | DIS-TE2 | HCE | 474 | UNet |
| 2D Object Detection | DIS-TE2 | MAE | 0.107 | UNet |
| 2D Object Detection | DIS-TE2 | S-Measure | 0.755 | UNet |
| 2D Object Detection | DIS-TE2 | max F-Measure | 0.703 | UNet |
| 2D Object Detection | DIS-TE2 | weighted F-measure | 0.597 | UNet |
| 2D Object Detection | DIS-TE1 | E-measure | 0.75 | UNet |
| 2D Object Detection | DIS-TE1 | HCE | 233 | UNet |
| 2D Object Detection | DIS-TE1 | MAE | 0.106 | UNet |
| 2D Object Detection | DIS-TE1 | S-Measure | 0.716 | UNet |
| 2D Object Detection | DIS-TE1 | max F-Measure | 0.625 | UNet |
| 2D Object Detection | DIS-TE1 | weighted F-measure | 0.514 | UNet |
| 2D Object Detection | DIS-TE3 | HCE | 883 | UNet |
| 2D Object Detection | DIS-TE3 | MAE | 0.098 | UNet |
| 2D Object Detection | DIS-TE3 | max F-Measure | 0.748 | UNet |
| 2D Object Detection | DIS-TE3 | weighted F-measure | 0.644 | UNet |
| 2D Object Detection | STARE | AUC | 0.78 | UNet |
| 2D Object Detection | PST900 | mIoU | 52.8 | UNet |
| 2D Object Detection | MFN Dataset | mIoU | 45.1 | UNet |
| Retinal Vessel Segmentation | ROSE-2 | Dice Score | 65.64 | U-Net |
| Retinal Vessel Segmentation | CHASE_DB1 | AUC | 0.9772 | U-Net |
| Retinal Vessel Segmentation | ROSE-1 SVC-DVC | Dice Score | 70.12 | U-Net |
| Retinal Vessel Segmentation | ROSE-1 SVC | Dice Score | 71.16 | U-Net |
| Retinal Vessel Segmentation | STARE | AUC | 0.7783 | U-Net |
| Retinal Vessel Segmentation | STARE | F1 score | 0.8373 | U-Net |
| Retinal Vessel Segmentation | ROSE-1 DVC | Dice Score | 66.05 | U-Net |
| Retinal Vessel Segmentation | DRIVE | AUC | 0.9755 | U-Net |
| Retinal Vessel Segmentation | DRIVE | F1 score | 0.8142 | U-Net |
| Cell Segmentation | STARE | AUC | 0.7756 | U-Net |
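
Several rows above report overlap metrics such as Dice, IoU (Jaccard index), precision, and recall. As a rough guide to how these are commonly computed for a single binary mask, here is a minimal sketch. The function name `overlap_metrics` and the flat 0/1 list representation are illustrative assumptions. The individual benchmarks may average per image, use other conventions, or report percentages, so the values in the table need not satisfy the single-mask relationships shown here.

```python
def overlap_metrics(pred, target):
    """Compute Dice, IoU, precision and recall for two flat binary masks.

    `pred` and `target` are equal-length sequences of 0/1 values
    (a hypothetical representation chosen for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")

    # Count true positives, false positives and false negatives.
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)

    def safe_div(num, den):
        # Return 0.0 when the denominator is zero (e.g. both masks empty).
        return num / den if den else 0.0

    return {
        "dice": safe_div(2 * tp, 2 * tp + fp + fn),
        "iou": safe_div(tp, tp + fp + fn),
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
    }


if __name__ == "__main__":
    prediction = [1, 1, 0, 0, 1, 0]
    ground_truth = [1, 0, 0, 1, 1, 0]
    print(overlap_metrics(prediction, ground_truth))
```

For a single mask, Dice and IoU are linked by Dice = 2·IoU / (1 + IoU), so either can be derived from the other. That identity does not generally hold once scores are averaged over many images.
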