E2178. Comparison of a Deep Learning Model to Radiologist Diagnostic Performance in Sonographic Breast Mass Assessment
  1. Aileen Chang; Santa Clara Valley Medical Center
  2. Ran Pang; Santa Clara Valley Medical Center
  3. Christopher Nguyen; Santa Clara Valley Medical Center
  4. Pradnya Patel; Santa Clara University
  5. Yuling Yan; Santa Clara University
  6. Mahesh Patel; Santa Clara Valley Medical Center
  7. Young Kang; Santa Clara Valley Medical Center
Purpose:
To evaluate whether incorporating a deep learning system improves radiologist diagnostic performance in differentiating benign from malignant breast masses on ultrasound.

Materials and Methods:
A novel deep learning model based on a convolutional neural network (CNN) was constructed, then trained and validated on an open-source breast ultrasound dataset published by Al-Dhabyani et al. The model was then applied to 300 retrospectively gathered sonographic images of breast masses that had previously been biopsied and had received a pathologic diagnosis at our institution. Of these masses, 194 were benign (64.7%) and 106 were malignant (35.3%). The CNN classified each mass in this test set as either “benign” or “malignant”. The same images were analyzed by a radiologist blinded to the pathologic results. The radiologist first classified each mass as benign or malignant independently of the CNN, then reassessed it with the CNN result available. The radiologist’s diagnostic performance under the two reading conditions (sonographic images alone versus images with CNN results) was calculated using the pathologic diagnosis as the reference standard.
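
As an illustration of how diagnostic performance can be calculated against a pathologic reference standard, the following is a minimal Python sketch. It is not the authors' analysis code: the function name `diagnostic_metrics`, the label strings, and the five-case toy data are all hypothetical, and "malignant" is assumed to be the positive class.

```python
from typing import Dict, Sequence


def diagnostic_metrics(pathology: Sequence[str], calls: Sequence[str]) -> Dict[str, float]:
    """Return sensitivity, specificity, and accuracy.

    Assumption: 'malignant' is the positive class and every label is
    either 'malignant' or 'benign'. Both classes must be present,
    otherwise a division by zero occurs.
    """
    if len(pathology) != len(calls):
        raise ValueError("pathology and calls must have the same length")

    pairs = list(zip(pathology, calls))
    tp = sum(p == "malignant" and c == "malignant" for p, c in pairs)  # true positives
    tn = sum(p == "benign" and c == "benign" for p, c in pairs)        # true negatives
    fp = sum(p == "benign" and c == "malignant" for p, c in pairs)     # false positives
    fn = sum(p == "malignant" and c == "benign" for p, c in pairs)     # false negatives

    return {
        "sensitivity": tp / (tp + fn),      # fraction of malignant masses called malignant
        "specificity": tn / (tn + fp),      # fraction of benign masses called benign
        "accuracy": (tp + tn) / len(pairs),  # fraction of all masses called correctly
    }


# Hypothetical toy data, NOT taken from the study.
toy_pathology = ["malignant", "malignant", "benign", "benign", "benign"]
toy_calls = ["malignant", "benign", "benign", "benign", "malignant"]
toy_metrics = diagnostic_metrics(toy_pathology, toy_calls)
print(toy_metrics)
```

The same function could be applied once to the CNN output, once to the radiologist's unaided calls, and once to the radiologist's CNN-assisted calls, which would give three directly comparable sets of metrics.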

Results:
The CNN model and the radiologist alone had sensitivities of 88.6% and 73.6%, specificities of 93.3% and 70.6%, and accuracies of 91.7% and 71.7%, respectively. The CNN model therefore demonstrated higher diagnostic accuracy than the radiologist alone. When CNN results were incorporated with the sonographic images, the radiologist’s sensitivity, specificity, and accuracy improved to 93.4%, 86.6%, and 89.0%, respectively. Regarding management decisions, incorporation of the deep learning model would have led to a correct change from biopsy to follow-up in 19.1% of pathologically benign masses and a correct change from follow-up to biopsy in 22.6% of pathologically malignant masses. Management would have been changed incorrectly from follow-up to biopsy in 3.1% of benign masses and from biopsy to follow-up in 2.7% of malignant masses.
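
The reported figures can be cross-checked with simple arithmetic. The Python sketch below uses only the percentages and class counts stated in this abstract and relies on two labeled assumptions that the abstract does not state explicitly: that a "malignant" call corresponds to biopsy and a "benign" call to follow-up, and that the change in the radiologist's sensitivity or specificity equals the correct management changes minus the incorrect ones within the relevant class. The helper name `accuracy_pct` is hypothetical.

```python
# Class counts stated in the abstract.
N_BENIGN = 194
N_MALIGNANT = 106


def accuracy_pct(sensitivity_pct: float, specificity_pct: float) -> float:
    """Accuracy as the class-size-weighted mean of sensitivity and specificity."""
    correct = N_MALIGNANT * sensitivity_pct + N_BENIGN * specificity_pct
    return correct / (N_BENIGN + N_MALIGNANT)


# Accuracy implied by the reported sensitivity and specificity (percent).
acc_cnn = accuracy_pct(88.6, 93.3)           # reported accuracy: 91.7
acc_radiologist = accuracy_pct(73.6, 70.6)   # reported accuracy: 71.7
acc_assisted = accuracy_pct(93.4, 86.6)      # reported accuracy: 89.0

# Assumption: malignant call = biopsy, benign call = follow-up, so
# assisted sensitivity = unaided sensitivity + correct changes - incorrect changes
# among malignant masses, and likewise for specificity among benign masses.
sens_assisted = 73.6 + 22.6 - 2.7            # reported sensitivity: 93.4
spec_assisted = 70.6 + 19.1 - 3.1            # reported specificity: 86.6

print(acc_cnn, acc_radiologist, acc_assisted)
print(sens_assisted, spec_assisted)
```

Under these assumptions, each recomputed value lands within about 0.1 percentage point of the reported one, which is consistent with rounding of the published percentages.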

Conclusion:
Incorporating the deep learning model resulted in greater diagnostic accuracy than radiologist assessment alone (89.0% versus 71.7%). Although testing was performed on a small dataset, these preliminary results suggest that deep learning could be incorporated into radiologists’ workflow to improve differentiation of benign from malignant masses on breast ultrasound.