Thanh-Tung Nguyen, Viktor Schlegel, Abhinav Kashyap, Stefan Winkler, Shao-Syuan Huang, Jie-Jyun Liu, Chih-Jen Lin
Clinical notes are assigned ICD codes, which are sets of codes for diagnoses and procedures. In recent years, predictive machine learning models have been built for automatic ICD coding. However, there is a lack of widely accepted benchmarks for automated ICD coding models based on large-scale public EHR data. This paper proposes a public benchmark suite for ICD-10 coding using a large EHR dataset derived from MIMIC-IV, the most recent public EHR dataset. We implement and compare several popular methods for ICD coding prediction tasks to standardize data preprocessing and establish a comprehensive ICD coding benchmark dataset. This approach fosters reproducibility and model comparison, accelerating progress toward employing automated ICD coding in future studies. Furthermore, we create a new ICD-9 benchmark using MIMIC-IV data, providing more data points and more ICD codes than MIMIC-III. Our open-source code offers easy access to data processing steps, benchmark creation, and experiment replication for those with MIMIC-IV access, providing insights, guidance, and protocols to efficiently develop ICD coding models.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | AUC (Macro) | 93.6 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | AUC (Micro) | 95.61 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | F1 (macro) | 69.01 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | F1 (micro) | 74.15 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | Precision@5 | 65.16 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | AUC (Macro) | 93.37 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | AUC (Micro) | 95.69 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | F1 (macro) | 70.31 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | F1 (micro) | 73.27 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | Precision@5 | 64.57 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | AUC (Macro) | 93.39 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | AUC (Micro) | 95.57 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | F1 (macro) | 68.41 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | F1 (micro) | 72.85 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | Precision@5 | 64.49 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | AUC (Macro) | 93.21 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | AUC (Micro) | 95.49 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | F1 (macro) | 68.15 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | F1 (micro) | 72.56 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | Precision@5 | 64.39 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | AUC (Macro) | 91.05 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | AUC (Micro) | 93.18 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | F1 (macro) | 64.3 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | F1 (micro) | 67.56 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD10-top50 | Precision@5 | 59.58 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD9-full | F1 Macro | 13.94 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD9-full | F1 Micro | 61.15 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD9-full | Macro AUC | 96.79 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD9-full | Micro AUC | 99.56 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD9-full | Precision@8 | 68.89 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD9-full | F1 Macro | 14.4 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD9-full | F1 Micro | 62.45 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD9-full | Macro AUC | 96.61 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD9-full | Micro AUC | 99.53 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD9-full | Precision@8 | 70.34 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD9-full | F1 Macro | 14.17 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-full | F1 Micro | 60.37 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-full | Macro AUC | 95.57 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-full | Micro AUC | 99.49 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-full | Precision@8 | 67.46 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-full | F1 Macro | 13.12 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-full | F1 Micro | 60.31 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-full | Macro AUC | 95.18 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-full | Micro AUC | 99.47 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-full | Precision@8 | 67.47 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-full | F1 Macro | 11.06 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD9-full | F1 Micro | 57.28 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD9-full | Macro AUC | 93.45 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD9-full | Micro AUC | 99.29 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD9-full | Precision@8 | 64.91 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Macro-AUC | 97.07 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Macro-F1 | 5.42 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Micro-AUC | 99.61 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Micro-F1 | 55.91 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Precision@8 | 67.66 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Macro-AUC | 93.64 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Macro-F1 | 5.71 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Micro-AUC | 99.27 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Micro-F1 | 55.89 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Precision@8 | 66.89 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Macro-AUC | 92.96 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Macro-F1 | 4.47 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Micro-AUC | 99.14 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Micro-F1 | 55.4 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Precision@8 | 66.97 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Macro-AUC | 91.85 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Macro-F1 | 4.9 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Micro-AUC | 99.02 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Micro-F1 | 56.95 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Precision@8 | 69.47 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Macro-AUC | 89.91 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Macro-F1 | 4.07 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Micro-AUC | 98.79 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Micro-F1 | 52.67 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD-10-full | Precision@8 | 64.43 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | AUC Macro | 95.13 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | AUC Micro | 96.46 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | F1 Macro | 71.85 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | F1 Micro | 75.78 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | Precision @5 | 62.6 | MSMN |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | AUC Macro | 94.97 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | AUC Micro | 96.41 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | F1 Macro | 71.35 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | F1 Micro | 75.46 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | Precision @5 | 62.44 | PLM-ICD |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | AUC Macro | 94.92 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | AUC Micro | 96.31 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | F1 Macro | 69.93 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | F1 Micro | 74.33 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | Precision @5 | 61.95 | Joint LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | AUC Macro | 94.88 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | AUC Micro | 96.29 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | F1 Macro | 69.99 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | F1 Micro | 74.46 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | Precision @5 | 62.01 | LAAT |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | AUC Macro | 93.07 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | AUC Micro | 94.05 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | F1 Macro | 65.33 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | F1 Micro | 69.23 | CAML |
| Medical Code Prediction | MIMIC-IV-ICD9-top50 | Precision @5 | 58.64 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | AUC (Macro) | 93.6 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | AUC (Micro) | 95.61 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | F1 (macro) | 69.01 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | F1 (micro) | 74.15 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | Precision@5 | 65.16 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | AUC (Macro) | 93.37 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | AUC (Micro) | 95.69 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | F1 (macro) | 70.31 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | F1 (micro) | 73.27 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | Precision@5 | 64.57 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | AUC (Macro) | 93.39 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | AUC (Micro) | 95.57 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | F1 (macro) | 68.41 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | F1 (micro) | 72.85 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | Precision@5 | 64.49 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | AUC (Macro) | 93.21 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | AUC (Micro) | 95.49 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | F1 (macro) | 68.15 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | F1 (micro) | 72.56 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | Precision@5 | 64.39 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | AUC (Macro) | 91.05 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | AUC (Micro) | 93.18 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | F1 (macro) | 64.3 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | F1 (micro) | 67.56 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD10-top50 | Precision@5 | 59.58 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD9-full | F1 Macro | 13.94 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD9-full | F1 Micro | 61.15 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD9-full | Macro AUC | 96.79 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD9-full | Micro AUC | 99.56 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD9-full | Precision@8 | 68.89 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD9-full | F1 Macro | 14.4 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD9-full | F1 Micro | 62.45 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD9-full | Macro AUC | 96.61 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD9-full | Micro AUC | 99.53 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD9-full | Precision@8 | 70.34 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD9-full | F1 Macro | 14.17 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-full | F1 Micro | 60.37 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-full | Macro AUC | 95.57 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-full | Micro AUC | 99.49 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-full | Precision@8 | 67.46 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-full | F1 Macro | 13.12 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-full | F1 Micro | 60.31 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-full | Macro AUC | 95.18 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-full | Micro AUC | 99.47 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-full | Precision@8 | 67.47 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-full | F1 Macro | 11.06 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD9-full | F1 Micro | 57.28 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD9-full | Macro AUC | 93.45 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD9-full | Micro AUC | 99.29 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD9-full | Precision@8 | 64.91 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Macro-AUC | 97.07 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Macro-F1 | 5.42 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Micro-AUC | 99.61 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Micro-F1 | 55.91 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Precision@8 | 67.66 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Macro-AUC | 93.64 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Macro-F1 | 5.71 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Micro-AUC | 99.27 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Micro-F1 | 55.89 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Precision@8 | 66.89 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Macro-AUC | 92.96 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Macro-F1 | 4.47 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Micro-AUC | 99.14 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Micro-F1 | 55.4 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Precision@8 | 66.97 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Macro-AUC | 91.85 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Macro-F1 | 4.9 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Micro-AUC | 99.02 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Micro-F1 | 56.95 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Precision@8 | 69.47 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Macro-AUC | 89.91 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Macro-F1 | 4.07 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Micro-AUC | 98.79 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Micro-F1 | 52.67 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD-10-full | Precision@8 | 64.43 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | AUC Macro | 95.13 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | AUC Micro | 96.46 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | F1 Macro | 71.85 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | F1 Micro | 75.78 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | Precision @5 | 62.6 | MSMN |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | AUC Macro | 94.97 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | AUC Micro | 96.41 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | F1 Macro | 71.35 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | F1 Micro | 75.46 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | Precision @5 | 62.44 | PLM-ICD |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | AUC Macro | 94.92 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | AUC Micro | 96.31 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | F1 Macro | 69.93 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | F1 Micro | 74.33 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | Precision @5 | 61.95 | Joint LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | AUC Macro | 94.88 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | AUC Micro | 96.29 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | F1 Macro | 69.99 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | F1 Micro | 74.46 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | Precision @5 | 62.01 | LAAT |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | AUC Macro | 93.07 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | AUC Micro | 94.05 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | F1 Macro | 65.33 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | F1 Micro | 69.23 | CAML |
| Multi-Label Classification | MIMIC-IV-ICD9-top50 | Precision @5 | 58.64 | CAML |
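
The table reports micro- and macro-averaged F1 alongside Precision@k, and on the full label sets the two F1 averages diverge sharply (for example, micro-F1 near 56 against macro-F1 near 5 on ICD-10). The sketch below shows one common way to compute these metrics for multi-label predictions, which may help in reading that gap. It is not taken from the benchmark's code: the function names, the toy data, and the 0.5 decision threshold are all assumptions made for illustration, and the values it produces are fractions, whereas the table appears to use percentages.

```python
import numpy as np

def precision_at_k(y_true, y_score, k):
    """Average, over documents, of the fraction of the k top-scored labels that are correct."""
    top_k = np.argsort(-y_score, axis=1)[:, :k]        # indices of the k highest scores per row
    hits = np.take_along_axis(y_true, top_k, axis=1)   # 1 where a top-k label is a true label
    return hits.sum(axis=1).mean() / k

def micro_macro_f1(y_true, y_pred):
    """Return (micro_f1, macro_f1) for binary indicator matrices of shape (docs, labels)."""
    tp = (y_true * y_pred).sum(axis=0).astype(float)
    fp = ((1 - y_true) * y_pred).sum(axis=0).astype(float)
    fn = (y_true * (1 - y_pred)).sum(axis=0).astype(float)

    # Macro: compute F1 per label, then average, so rare labels weigh as much as common ones.
    denom = 2 * tp + fp + fn
    per_label = np.divide(2 * tp, denom, out=np.zeros_like(tp), where=denom > 0)
    macro = per_label.mean()

    # Micro: pool the counts over all labels first, so frequent labels dominate.
    pooled = 2 * tp.sum() + fp.sum() + fn.sum()
    micro = 2 * tp.sum() / pooled if pooled > 0 else 0.0
    return micro, macro

# Toy data (hypothetical): 3 documents, 4 labels; label 0 is frequent, labels 2 and 3 are rare.
y_true = np.array([[1, 1, 0, 0],
                   [1, 0, 0, 1],
                   [1, 0, 1, 0]])
y_score = np.array([[0.9, 0.8, 0.1, 0.2],
                    [0.7, 0.1, 0.2, 0.4],
                    [0.8, 0.3, 0.4, 0.1]])
y_pred = (y_score >= 0.5).astype(int)  # 0.5 threshold is an assumption

micro, macro = micro_macro_f1(y_true, y_pred)
p_at_2 = precision_at_k(y_true, y_score, k=2)
print(f"micro-F1={micro:.2f} macro-F1={macro:.2f} P@2={p_at_2:.2f}")
```

In this toy case the two rare labels are ranked second for their documents but never cross the threshold, so Precision@2 stays high, micro-F1 drops only slightly, and macro-F1 is halved. A similar effect from the many infrequent codes is one plausible reading of the low macro-F1 values on the full label sets.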