1337. Reliable Deep Learning Model for Differentiating Glioblastoma from Single Brain Metastasis: Estimating Uncertainty with Deep Ensembles
Authors * Denotes Presenting Author
  1. Yae Won Park *; Yonsei University College of Medicine
  2. Sujeong Eom; Yonsei University College of Medicine
  3. Seng Chan You; Yonsei University College of Medicine
  4. Sung Soo Ahn; Yonsei University College of Medicine
  5. Seung-Koo Lee; Yonsei University College of Medicine
Purpose:
For clinical reliability, a deep learning model should indicate whether its prediction for a given patient can be trusted. The purpose of this study was to develop a clinically reliable deep learning model that differentiates glioblastoma (GBM) from solitary brain metastasis (SBM) and provides predictive uncertainty estimates using deep ensembles.

Materials and Methods:
A total of 469 patients (300 GBM, 169 SBM) were enrolled in the institutional training set. A deep ensembles model based on DenseNet121 (2D CNN) was trained on multiparametric MRI. For comparison, a single DenseNet121 network was also trained. The models were validated on an external validation set of 143 patients (101 GBM, 42 SBM). Classification performance was estimated, and the entropy of the prediction for each input was used as the uncertainty measure. Based on the entropy values, the datasets were split into high- and low-uncertainty groups. To evaluate uncertainty on out-of-distribution (OOD) data from unseen classes, 318 patients with meningiomas were tested separately with the deep ensembles model.
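The entropy-based uncertainty measure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that each ensemble member outputs softmax probabilities, that the ensemble prediction is the mean over members, and that a fixed entropy threshold separates the two groups. The function names and the threshold value of 0.5 are hypothetical; the abstract does not state how the cut-off was chosen.

```python
import numpy as np


def predictive_entropy(member_probs):
    """Entropy of the ensemble-averaged class probabilities.

    member_probs: array of shape (n_members, n_samples, n_classes)
    holding each ensemble member's softmax output (assumed layout).
    Returns an array of shape (n_samples,), in nats.
    """
    mean_probs = np.mean(member_probs, axis=0)
    eps = 1e-12  # guards against log(0)
    return -np.sum(mean_probs * np.log(mean_probs + eps), axis=1)


def is_high_uncertainty(entropy, threshold):
    """Boolean mask: True where entropy exceeds the (hypothetical) threshold."""
    return entropy > threshold


# Toy example: 3 ensemble members, 2 patients, 2 classes.
member_probs = np.array([
    [[0.90, 0.10], [0.6, 0.4]],
    [[0.80, 0.20], [0.4, 0.6]],
    [[0.95, 0.05], [0.5, 0.5]],
])
entropy = predictive_entropy(member_probs)
high = is_high_uncertainty(entropy, threshold=0.5)
```

In this toy input the members agree on the first patient, which gives low entropy, and disagree on the second, whose averaged probabilities are 0.5 for each class and whose entropy therefore reaches the two-class maximum of ln 2.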

Results:
On external validation, deep ensembles differentiated GBM from SBM with an AUC, accuracy, sensitivity, and specificity of 0.82, 78.3%, 54.8%, and 88.1%, respectively. When cases were stratified by entropy, deep ensembles performed better in the low-uncertainty group (AUC, accuracy, sensitivity, and specificity of 0.78, 89.0%, 62.5%, and 94.2%, respectively) than in the high-uncertainty group (AUC, accuracy, sensitivity, and specificity of 0.64, 65.5%, 51.4%, and 77.1%, respectively). On the OOD dataset of meningiomas, deep ensembles classified 292 (91.4%) patients into the high-uncertainty group, whereas the single DenseNet121 classified only 202 (63.4%) patients into the high-uncertainty group.
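Reporting performance separately for the low- and high-uncertainty groups could be computed along the following lines. This is a sketch under stated assumptions: labels are coded 0/1 with 1 as the positive class (the abstract does not say which tumor type was treated as positive), the helper names are hypothetical, and the toy arrays are made up for illustration and are not study data. AUC is omitted to keep the example dependency-free.

```python
import numpy as np


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for 0/1 labels (1 = positive)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def metrics_by_uncertainty(y_true, y_pred, high_uncertainty):
    """Compute the metrics separately for low- and high-uncertainty cases."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    mask = np.asarray(high_uncertainty, dtype=bool)
    return {
        "low": binary_metrics(y_true[~mask], y_pred[~mask]),
        "high": binary_metrics(y_true[mask], y_pred[mask]),
    }


# Made-up toy data: the first four cases are low-uncertainty, the rest high.
y_true = [1, 0, 0, 1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 0, 1]
high_uncertainty = [False, False, False, False, True, True, True, True]
result = metrics_by_uncertainty(y_true, y_pred, high_uncertainty)
```

In this toy data every low-uncertainty case is classified correctly while half of the high-uncertainty cases are not, which mirrors the direction, though not the magnitude, of the accuracy gap reported above.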

Conclusion:
Deep ensembles provide predictive uncertainty estimates when differentiating GBM from SBM, help avoid overconfident predictions on unseen classes, and can thereby support reliable clinical decision-making.