E3445. Automated Detection of Cervical Spinal Cord Compression via Vision Transformer
  1. David Payne; Stony Brook University Hospital
  2. Xuan Xu; Stony Brook University Hospital
  3. Katherine Ferra Pradas; Stony Brook University Hospital
  4. Farshid Faraji; Stony Brook University Hospital
  5. Kevin John; Stony Brook University Hospital
  6. Lev Bangiyev; Stony Brook University Hospital
  7. Prateek Prasanna; Stony Brook University Hospital
Purpose:
Cervical spinal cord compression, defined as deformity of the spinal cord caused by severe narrowing of the central canal in the cervical region, can have serious consequences for patients, including intense pain, sensory disturbance, paralysis, and even death. It may require emergent intervention to avert or reduce morbidity and mortality. Despite the critical nature of this finding and the need for timely intervention, no automated tool is available to alert clinical radiologists to its presence on the work list. This study aims to demonstrate that a vision transformer (ViT) machine learning model can accurately detect cervical cord compression at both the slice level and the patient level.

Materials and Methods:
A clinically diverse cohort of 142 cervical spine MRIs was identified: 34% were normal or showed mild stenosis, 31% showed moderate stenosis, and 35% showed cord compression. Of the patients, 51% were women, and the average age was 56 years. On axial gradient echo images, each slice was labeled 0 (no cord compression or mild stenosis), 1 (moderate stenosis), or 2 (cord compression). The region of interest, the central canal, was segmented with ITK-SNAP. Labeling and segmentation were performed by three senior radiology residents in consensus with a senior neuroradiologist. A pretrained ViT model was fine-tuned for 200 epochs to predict slice-level severity, using a 60:20:20 train:validation:test split. Each examination (i.e., patient-level study) was assigned an overall severity score equal to the highest predicted slice severity, and an examination was labeled positive for cord compression if one or more slices were predicted in the most severe category. For comparison, two popular convolutional neural network (CNN) models (ResNet50 and DenseNet121) were trained and tested in the same manner.
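The patient-level rule described above (take the highest slice severity, and flag the examination if any slice is predicted as cord compression) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `aggregate_patient_level`, the `(exam_id, label)` input format, and the example identifiers are all assumptions made for the sketch.

```python
from collections import defaultdict

# Slice labels as defined in the abstract:
# 0 = no cord compression / mild stenosis, 1 = moderate stenosis, 2 = cord compression
CORD_COMPRESSION = 2


def aggregate_patient_level(slice_predictions):
    """Collapse slice-level predictions into one result per examination.

    slice_predictions: iterable of (exam_id, predicted_label) pairs.
    (This input format is a hypothetical choice for the sketch.)
    """
    exam_severity = defaultdict(int)  # defaults to 0, the lowest severity
    for exam_id, label in slice_predictions:
        # Overall severity is the highest severity seen on any slice.
        exam_severity[exam_id] = max(exam_severity[exam_id], label)

    return {
        exam_id: {
            "severity": severity,
            # Positive if at least one slice was predicted as cord compression.
            "cord_compression": severity == CORD_COMPRESSION,
        }
        for exam_id, severity in exam_severity.items()
    }


# Hypothetical slice predictions for two examinations.
predictions = [
    ("exam_a", 0), ("exam_a", 2), ("exam_a", 1),
    ("exam_b", 0), ("exam_b", 1),
]
result = aggregate_patient_level(predictions)
print(result["exam_a"])  # {'severity': 2, 'cord_compression': True}
print(result["exam_b"])  # {'severity': 1, 'cord_compression': False}
```

Because a single positive slice flags the whole examination, a rule of this kind favors sensitivity: one false-positive slice is enough to produce a false-positive examination.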

Results:
The ViT model outperformed both CNN models at the slice level, achieving an accuracy of 82%, compared with 72% for ResNet50 and 78% for DenseNet121. Patient-level classification using ViT achieved an accuracy of 93%, sensitivity of 0.90, PPV of 0.90, specificity of 0.95, and NPV of 0.95. Two cases in the test set were misclassified: one false negative and one false positive for cord compression. Patient-level accuracy was 62% for each of the CNN models.
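For readers who want to see how the patient-level figures relate to one another, the sketch below computes the five metrics from a binary confusion matrix. The abstract does not report the test-set size or the confusion matrix itself; the counts used here are hypothetical, chosen only because one false negative and one false positive with these totals reproduce the reported values after rounding.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts (NOT reported in the abstract): 10 examinations with
# cord compression and 20 without, with one error of each kind.
metrics = binary_metrics(tp=9, fp=1, tn=19, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
# accuracy: 0.93
# sensitivity: 0.90
# specificity: 0.95
# ppv: 0.90
# npv: 0.95
```

Other test-set compositions may round to the same figures, so these counts should be read as an illustration of the arithmetic rather than as the study's data.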

Conclusion:
This ViT-based classification approach accurately detects the presence of cervical spinal cord compression at the patient level. In this study, the ViT model outperformed both conventional CNN approaches at the slice level and at the patient level. If implemented in the clinical setting, such a tool may significantly streamline neuroradiology workflow, improving efficiency and consistency.