Barış Batuhan Topal, Deniz Yuret, Tevfik Metin Sezgin
Drawings are a powerful means of pictorial abstraction and communication. Understanding diverse forms of drawings, including digital art, cartoons, and comics, has been a major problem of interest for the computer vision and computer graphics communities. Although large numbers of digitized drawings from comic books and cartoons are available, they vary widely in style, which necessitates expensive manual labeling to train domain-specific recognizers. In this work, we show how self-supervised learning, based on a teacher-student network with a modified student network update design, can be used to build face and body detectors. Our setup allows exploiting large amounts of unlabeled data from the target domain when labels are provided for only a small subset of it. We further demonstrate that style transfer can be incorporated into our learning pipeline to bootstrap detectors using a vast number of labeled out-of-domain natural images (i.e., images from the real world). Our combined architecture yields detectors with state-of-the-art (SOTA) and near-SOTA performance using minimal annotation effort. Our code can be accessed from https://github.com/barisbatuhan/DASS_Detector.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Object Detection | Manga109 | Average Precision | 87.93 | DASS-Detector (YOLOX XL) |
| Object Detection | Manga109 | Average Precision | 87.46 | DASS-Detector (YOLOX Tiny) |
| Object Detection | Comic2k | mAP | 67.41 | DASS-Detector (YOLOX Tiny) |
| Object Detection | Watercolor2k | mAP | 71.53 | DASS-Detector (YOLOX Tiny) |
| Object Detection | Clipart1k | mAP | 64.25 | DASS-Detector (YOLOX Tiny) |
| Object Detection | Manga109 | Average Precision | 87.98 | DASS-Detector (YOLOX XL) |
| Object Detection | DCM | Average Precision | 86.14 | DASS-Detector (YOLOX XL) |
| Object Detection | DCM | Average Precision | 87.06 | DASS-Detector (YOLOX Tiny) |
| Object Detection | Clipart1k | mAP | 83.59 | DASS-Detector (YOLOX XL) |
| Object Detection | Watercolor2k | mAP | 89.81 | DASS-Detector (YOLOX XL) |
| Object Detection | Comic2k | mAP | 73.65 | DASS-Detector (YOLOX XL) |
| Face Detection | Manga109 | Average Precision | 87.88 | DASS-Detector (YOLOX XL) |
| Face Detection | iCartoonFace | Average Precision | 90.01 | DASS-Detector (YOLOX XL) |
| Face Detection | iCartoonFace | Average Precision | 87.75 | DASS-Detector (YOLOX Tiny) |
| Face Detection | DCM | Average Precision | 77.4 | DASS-Detector (YOLOX XL) |
| Face Detection | DCM | Average Precision | 77.4 | DASS-Detector (YOLOX Tiny) |