1866. Machine Learning Analysis of Two-Dimensional Breast Ultrasound to Classify Triple Negative Breast Cancer and HER2+ Breast Cancer Subtypes
Authors * Denotes Presenting Author
  1. Janne Elst *; Sunnybrook Health Sciences Center
  2. Sean Senthilnathan; Sunnybrook Health Sciences Center
  3. Andrew Lagree; Sunnybrook Health Sciences Center
  4. William Tran; Sunnybrook Health Sciences Center
  5. Belinda Curpen; Sunnybrook Health Sciences Center

Objective:
Early diagnosis of triple-negative (TN) and human epidermal growth factor receptor 2-positive (HER-2+) breast cancer is important because these subtypes have more aggressive biological characteristics, poorer clinical outcomes, and limited (TN) or targeted (HER-2+) options for therapy. The aim of this study was to evaluate the diagnostic performance of machine learning in differentiating newly diagnosed malignant breast masses into TN versus non-TN and HER-2+ versus HER-2 negative (HER-2-) breast cancer based on conventional B-mode ultrasound (US) imaging.

Materials and Methods:
A retrospective chart review identified 88 adult female patients who underwent diagnostic US and US-guided core biopsy of the breast, and who had invasive malignancy confirmed on pathology, between 2011 and 2019 in the Rapid Diagnostic Unit. Cases were classified as TN versus non-TN and HER-2+ versus HER-2- breast cancer according to the molecular subtyping performed by the pathology department. Breast masses on the ultrasound images were annotated by a board-certified radiologist with 25 years of experience and a Women’s Imaging fellow. First- and second-order radiomic features were extracted, per image, using open-source software (i.e. HistomicsTK [1], PyRadiomics [2]). A stepwise discriminant function was used to select the most relevant and correlated features (selection criterion: r² > 0.8; features = 8). The supervised machine learning classifiers were: 1) logistic regression, 2) support vector machine, and 3) k-nearest neighbor. Within the machine learning frameworks, all data were partitioned into training and test datasets at a 3:1 ratio. Ten-fold cross-validation was applied to the training dataset to optimize the efficacy and generalization of the models. Area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, and specificity were calculated for both the TN and HER-2 models. The unseen (test) dataset was used to report classification performance.
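
The partitioning scheme described above (a 3:1 split followed by 10-fold cross-validation on the training portion) can be illustrated with a minimal sketch. This is not the authors' code: the helper names `stratified_split` and `k_fold`, the use of stratification, the random seed, and the class counts (20 positive out of 88) are all assumptions made for illustration. The abstract does not state the class balance or whether the split was made per patient or per image.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.25, seed=0):
    """Split indices into train/test (3:1 by default), keeping class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

def k_fold(indices, k=10, seed=0):
    """Yield (fit_idx, val_idx) pairs; every index is used for validation exactly once."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i, val in enumerate(folds):
        fit = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
        yield fit, val

# Hypothetical labels: 88 cases, 20 of them positive (assumed, not from the study).
labels = [1] * 20 + [0] * 68
train_idx, test_idx = stratified_split(labels)

# Cross-validation touches only the training indices; the test set stays unseen.
folds = list(k_fold(train_idx, k=10))
```

With these assumed counts, the test set holds 22 cases and the training set 66. Model selection would use only the folds, and performance would be reported once on `test_idx`.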

Results:
The logistic regression classifier demonstrated an AUC of 0.824 (sensitivity: 81.8%, specificity: 74.2%) for the TN model, and an AUC of 0.778 (sensitivity: 71.4%, specificity: 71.6%) for the HER-2 model.
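
For readers less familiar with these metrics, the following minimal sketch shows how AUC, sensitivity, and specificity can be computed from classifier scores. The function names, the 0.5 decision threshold, and the toy data are assumptions for illustration only; they are not the study's data or code.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (made-up scores, not study data): 3 positives, 4 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [int(s >= 0.5) for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of figures are reported together.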

Conclusion:
Machine learning analysis of B-mode breast ultrasound images demonstrated a high diagnostic accuracy in differentiating the TN versus non-TN and HER-2+ versus HER-2- breast cancer subtypes. Identifying potentially more aggressive breast cancer subtypes early in the diagnostic process could help achieve better prognoses by prioritizing clinical referral and prompting early, appropriate treatment.