1708. Density Measurement for Tomosynthesis in Comparison: Deep Convolutional Neural Networks Versus Human BI-RADS Density Scores
Authors * Denotes Presenting Author
  1. Noemi Schmidt *; University Hospital Basel
  2. Karol Borkowski; University Hospital Zurich
  3. Patryk Hejduk; University Hospital Zurich
  4. Bram Stieltjes; University Hospital Basel
  5. Claudia Buehler; University Hospital Basel
  6. Thomas Weikert; University Hospital Basel
  7. Sophie Dellas; University Hospital Basel
Objective:
High breast density is a well-known risk factor for breast cancer. The aim of this study was to develop and adapt two deep convolutional neural networks (DCNNs) for automatic classification of breast density based on the mammographic appearance of the tissue on synthetic 2D tomosynthesis craniocaudal (CC) and mediolateral oblique (MLO) projections.

Materials and Methods:
In this study, 5008 synthetic images based on mammography tomosynthesis from 1285 different patients (57 ± 37 years) were downloaded from the picture archiving and communication system of our institution and labeled according to ACR density category (a-d) by a radiologist with 2 years of experience in breast imaging. Two DCNNs with 11 convolutional layers and 3 fully connected layers were trained on 70% of the data; the remaining 30% were used for validation. The models were then applied to a test dataset of 460 images that was completely independent of the data used for training and validation. The models were tested against the following three reads, and accuracies (correct classifications / all classifications) were calculated: I) a radiologist with 2 years of experience in mammographic imaging (reader1A), II) the same radiologist 1 month later (reader1B), and III) a radiologist with 11 years of experience in mammographic imaging (reader2). Inter- and intra-reader reliabilities of the density classifications were assessed by calculating Cohen’s kappa coefficients with quadratic weights.
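
The two agreement measures named above can be illustrated with a minimal sketch. It assumes the four ACR categories are treated as an ordered scale (a < b < c < d) and uses the standard quadratic disagreement weight ((i - j) / (k - 1))^2. The function names and the toy labels are hypothetical and are not taken from the study.

```python
# Minimal sketch: accuracy and quadratic-weighted Cohen's kappa for ordinal
# density labels. Function names and example labels are illustrative only.

CATEGORIES = ["a", "b", "c", "d"]  # ACR density categories, in ascending order


def accuracy(reference, predicted):
    """Fraction of cases where both label lists agree exactly."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(r == p for r, p in zip(reference, predicted))
    return correct / len(reference)


def quadratic_weighted_kappa(rater_1, rater_2, categories=CATEGORIES):
    """Cohen's kappa with quadratic disagreement weights."""
    if len(rater_1) != len(rater_2) or not rater_1:
        raise ValueError("label lists must be non-empty and of equal length")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_1)

    # Confusion matrix: rows = rater_1, columns = rater_2.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_1, rater_2):
        observed[index[a]][index[b]] += 1

    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Disagreement weight: 0 on the diagonal, 1 for a-versus-d.
            weight = ((i - j) / (k - 1)) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under independence of the two raters.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Toy labels, purely illustrative (not study data).
    reader = ["a", "b", "c", "d"]
    model = ["b", "b", "c", "c"]
    print(accuracy(reader, model), quadratic_weighted_kappa(reader, model))
```

With quadratic weights, a one-category disagreement (for example b versus c) is penalized far less than a three-category disagreement (a versus d), which is why the weighted kappa can be high even when exact-match accuracy is moderate.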

Results:
Two separate models, one for MLO and one for CC projections, were successfully trained. For both models, the “sweet spot” for training without overfitting was reached at 160 epochs. The average accuracies of the DCNNs compared with reader1A were 82% on CC and 86% on MLO projections with a 95% confidence interval. Compared with reader1B, the second read by reader1 one month later, the average accuracies of the DCNNs were 80% (CC) and 84% (MLO). Good agreement was also found between reader2 and the DCNNs, with average accuracies of 76% for CC and 81% for MLO. For combined MLO and CC projections, the intra-reader agreement for reader1 was “almost perfect” with a kappa of 0.82. The inter-reader agreement between reader1 and reader2 was “substantial” with a kappa of 0.77.
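
The verbal labels quoted for the kappa values match the commonly used Landis and Koch scale. The abstract does not name the scale, so the mapping below is an assumption, and the function name is hypothetical.

```python
# Minimal sketch: map a kappa value to a Landis and Koch agreement label.
# Assumption: the abstract's "substantial" and "almost perfect" follow this scale.


def landis_koch_label(kappa):
    """Return the Landis and Koch label for a kappa in [-1, 1]."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0.0:
        return "poor"
    # Upper bounds of each band, in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"  # not reached for valid input; kept for safety


if __name__ == "__main__":
    # Kappa values reported in the abstract.
    for value in (0.82, 0.77):
        print(value, landis_koch_label(value))
```

Under this scale, the reported intra-reader kappa of 0.82 falls in the 0.81 to 1.00 band and the inter-reader kappa of 0.77 falls in the 0.61 to 0.80 band, consistent with the labels given in the text.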

Conclusion:
DCNNs may be used to mimic human decision making in breast density assessment on synthetic 2D tomosynthesis images with high accuracy. The proposed technique may be useful for accurate, standardized, and observer-independent breast density evaluation in tomosynthesis.