E2658. Deep Learning Prostate MRI Segmentation Robustness and Accuracy: A Systematic Review
  1. Mohammad-Kasim Fassia; New York Presbyterian Weill Cornell
  2. Adithya Balasubramanian; New York Presbyterian Weill Cornell
  3. Sungmin Woo; Memorial Sloan Kettering Cancer Center
  4. Herbert Vargas; Memorial Sloan Kettering Cancer Center
  5. Hedvig Hricak; Memorial Sloan Kettering Cancer Center
  6. Anton Becker; Memorial Sloan Kettering Cancer Center
Prostate MRI is an integral component of the detection, staging and surveillance of prostate cancer (PCa). Prostate MRI segmentation is an area of active research given its applications to prostate volume estimation, lesion localization and progression monitoring. Despite these applications, segmentation remains technically and logistically challenging. Ambiguous prostatic boundaries, heterogeneous tissue and glandular variability can challenge even the seasoned radiologist, and manual segmentation is very time-consuming. Deep learning algorithms are an emerging solution for automating prostate segmentation; however, questions remain about their robustness and accuracy. In this systematic review, we aim to comprehensively analyze deep learning algorithm performance and its dependence on training data size, training data type, MRI scanner type, and deep learning network architecture.

Materials and Methods:
The Embase and PubMed databases were searched for English-language articles published up to July 31, 2022, and a total of 205 articles were initially aggregated. The following search was used for both databases: “magnetic resonance imaging (OR MRI) prostate segmentation deep learning (OR machine learning OR artificial intelligence) automated (OR automated OR automatic)”. Inclusion criteria were whole gland prostate segmentation, peripheral zone segmentation, central zone segmentation, convolutional neural networks (CNN), and multilayer convolutional neural networks (deep learning). Exclusion criteria were prostate lesion segmentation and machine learning algorithms that did not utilize a deep learning architecture. We reviewed segmentation techniques based on prostate zone, input dimensions, data source, MRI sequence, MRI field strength, sample size, technical approach and performance score measures.

Twenty-one studies were included in the systematic review. Two studies (9.5%) utilized a 1.5 Tesla (T) scanner, eight studies (38.1%) used a 3T scanner, and eleven studies (52.4%) used a combination. Four studies (19.0%) required the use of additional scanning equipment, notably phased-array coils or endorectal coils. The mean training data size across all studies was 181 (range, 40 to 550). Eight studies (38.1%) used internal datasets, six studies (28.6%) used external datasets, and seven studies (33.3%) used both. UNet was the most common deep learning network architecture (57.1% of studies), followed by VNet (19.0%) and FCN (9.5%). Seven studies (33.3%) employed additional filtering layers on top of the UNet architecture. Sixteen studies (76.2%) measured whole gland segmentation accuracy, with a mean Dice-Sørensen coefficient (DSC) of 0.861 (range, 0.630 to 0.944). Five studies (23.8%) measured peripheral zone segmentation, with a mean DSC of 0.765 (range, 0.690 to 0.820), and three studies (14.3%) measured transition zone segmentation, with a mean DSC of 0.856 (range, 0.778 to 0.938).
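
For readers unfamiliar with the DSC reported above, the following is a minimal sketch of how the coefficient is commonly computed for two binary segmentation masks: twice the number of overlapping pixels divided by the total number of foreground pixels in both masks. The function name `dice_coefficient`, the toy masks, and the convention of returning 1.0 for two empty masks are illustrative assumptions, not details taken from the reviewed studies.

```python
from itertools import chain


def dice_coefficient(pred_mask, true_mask):
    """Dice-Sorensen coefficient between two binary 2D masks.

    Each mask is a list of rows containing 0 (background) or 1 (prostate).
    DSC = 2 * |A intersect B| / (|A| + |B|).
    """
    pred = list(chain.from_iterable(pred_mask))
    true = list(chain.from_iterable(true_mask))
    if len(pred) != len(true):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels labeled as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, true) if p and t)
    # Foreground pixel counts of each mask, added together.
    total = sum(1 for p in pred if p) + sum(1 for t in true if t)

    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# Hypothetical toy masks: a manual contour and an automated contour
# that misses one of the four foreground pixels.
manual = [[0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
auto = [[0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]

print(f"DSC = {dice_coefficient(auto, manual):.3f}")  # prints "DSC = 0.857"
```

In this toy case the masks share 3 foreground pixels out of 4 and 3 respectively, giving 2 × 3 / (4 + 3) ≈ 0.857, a value close to the mean whole gland DSC summarized above. Real evaluations would typically apply the same formula to 3D volumes.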

Performance of published deep learning prostate segmentation algorithms is within the reported range for expert radiologist segmentations.