1605. Artificial Intelligence System for Breast Cancer Detection using Deep Learning and Ultrasound Imaging
Authors * Denotes Presenting Author
  1. Jamie Oliver *; NYU Grossman School of Medicine
  2. Laura Heacock; Department of Radiology, NYU Grossman School of Medicine
  3. Beatriu Reig; Department of Radiology, NYU Grossman School of Medicine
  4. Alana Lewin; Department of Radiology, NYU Grossman School of Medicine
  5. Yiqiu Shen; Center for Data Science, New York University
  6. Farah Shamout; Engineering Division, NYU Abu Dhabi
  7. Krzysztof Geras; Center for Data Science, New York University; Department of Radiology, NYU Grossman School of Medicine

Objective:
Breast ultrasound (US) is an important tool in the detection and characterization of breast masses. Although studies have shown that breast US consistently detects additional cancers when used as a supplemental screening modality, it has been noted to have high false-positive rates and lower sensitivity compared with breast MRI. In this study, we present an AI system that aims to assist radiologists in interpreting breast US exams.

Materials and Methods:
We developed and evaluated an AI model based on a deep convolutional neural network (DCNN) inspired by the Globally-Aware Multiple Instance Classifier. The model was designed to automatically detect and classify breast lesions on ultrasonography and did not require manual annotation from radiologists. It was trained using our large-scale institutional dataset of 345,370 breast US exams acquired from 168,282 patients between 2012 and 2019. Pathology was the reference standard. The dataset was split at the patient level into training (70%), validation (10%), and test (20%) sets. We validated the model in a reader study with 13 readers of varying expertise (9 radiologists with an average of 14.5 years of experience and 4 trainees with 0-6 months of experience). Each reader reviewed 250 breast US exams that were randomly sampled from the test set. A hybrid decision-making model was then created for each reader; it made predictions by weighting the predictions of the reader and the AI system equally. The diagnostic accuracy of the AI model, the readers, and the hybrid models was assessed and compared using receiver operating characteristic analyses.
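
The patient-level split described above can be illustrated with a minimal sketch. This is not the study's code: the function name `split_by_patient`, the exam and patient identifiers, and the random seed are hypothetical, and the sketch assumes the 70/10/20 fractions are applied to patients rather than to exams, which the abstract does not specify. The point it shows is that all exams from one patient land in the same subset, so no patient contributes to both training and testing.

```python
import random


def split_by_patient(exam_to_patient, fractions=(0.7, 0.1, 0.2), seed=0):
    """Assign exams to train/val/test so that no patient spans two subsets.

    exam_to_patient: dict mapping an exam ID to its patient ID (hypothetical IDs).
    fractions: share of PATIENTS (an assumption) sent to train, val, and test.
    """
    # Shuffle the unique patients reproducibly.
    patients = sorted(set(exam_to_patient.values()))
    random.Random(seed).shuffle(patients)

    n_train = round(fractions[0] * len(patients))
    n_val = round(fractions[1] * len(patients))

    # Decide the subset once per patient; the remainder goes to test.
    subset_of = {}
    for i, patient in enumerate(patients):
        if i < n_train:
            subset_of[patient] = "train"
        elif i < n_train + n_val:
            subset_of[patient] = "val"
        else:
            subset_of[patient] = "test"

    # Every exam inherits the subset of its patient.
    splits = {"train": [], "val": [], "test": []}
    for exam, patient in exam_to_patient.items():
        splits[subset_of[patient]].append(exam)
    return splits


# Toy data: 30 exams from 10 patients (3 exams per patient).
exams = {f"exam{i}": f"patient{i % 10}" for i in range(30)}
splits = split_by_patient(exams)
print({name: len(ids) for name, ids in splits.items()})
```

Splitting by exam instead would let different exams from the same patient appear in both the training and test sets, which tends to inflate measured test performance.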

Results:
Among the 250 US exams included in the reader study, 392 breasts were imaged, 38 of which had malignant lesions. The AI model achieved an area under the receiver operating characteristic curve (AUC) of 0.880, significantly outperforming each of the 13 readers (P<0.01). Reader performance ranged from 0.788 to 0.862 AUC, with an average of 0.818 AUC. Reader performance varied by level of training: attending radiologists had an average diagnostic accuracy of 0.824 AUC, while trainees achieved an average of 0.805 AUC. The hybrid predictions of each reader and the AI model led to significant improvements for each reader (P<0.01), with an improved mean diagnostic accuracy of 0.852 AUC across all readers. The average performance of attending radiologists and trainees increased to 0.863 and 0.826 AUC, respectively.
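
To make the hybrid model and the AUC comparison concrete, the following is a minimal sketch, not the study's implementation. It assumes that reader and AI predictions are both expressed as malignancy scores on a common 0-1 scale and that each breast is one case; neither detail is stated in the abstract. The function names `auc` and `hybrid_scores` and all scores below are invented for illustration. AUC is computed here through its Mann-Whitney interpretation: the probability that a randomly chosen malignant case receives a higher score than a randomly chosen non-malignant case, with ties counted as one half.

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic.

    labels: 1 for malignant, 0 for non-malignant.
    scores: higher means more suspicious.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hybrid_scores(reader_scores, ai_scores):
    """Equal-weight average of reader and AI scores, case by case."""
    return [0.5 * r + 0.5 * a for r, a in zip(reader_scores, ai_scores)]


# Invented toy data for six cases (not study data).
labels = [1, 1, 0, 0, 0, 1]
reader = [0.9, 0.4, 0.5, 0.1, 0.2, 0.6]
ai = [0.6, 0.7, 0.2, 0.5, 0.1, 0.8]
hybrid = hybrid_scores(reader, ai)

for name, scores in [("reader", reader), ("AI", ai), ("hybrid", hybrid)]:
    print(name, round(auc(labels, scores), 3))
```

The reported subgroup averages are consistent with the overall averages when weighted by group size: (9 × 0.824 + 4 × 0.805) / 13 ≈ 0.818 for readers alone, and (9 × 0.863 + 4 × 0.826) / 13 ≈ 0.852 for the hybrid models.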

Conclusion:
We present an AI system trained with automated feature extraction that is capable of detecting and diagnosing cancer on breast ultrasound with accuracy comparable to that of radiologists. Our hybrid decision-making models demonstrate the potential for this AI system to complement and enhance the performance of both trainee and experienced breast imagers without the added cost of a second human reader. If implemented, this system may allow breast imagers to improve their diagnostic accuracy.