E2800. Improving Interreader Agreement of MRI-Based Initial Rectal Cancer N-Staging
  1. Angelina Lo; University of California-Irvine School of Medicine
  2. Katherine Wei; University of California-Irvine Medical Center
  3. James Shi; University of California-Irvine Medical Center
  4. Rony Kampalath; University of California-Irvine Medical Center
  5. Roozbeh Houshyar; University of California-Irvine Medical Center
  6. Mohammad Helmy; University of California-Irvine Medical Center
  7. Sonia Lee; University of California-Irvine Medical Center
Objective:
For rectal cancer lymph node (N) staging, recent studies have shown that lymph node border and signal characteristics are important predictors of involvement. However, judgments of lymph node signal heterogeneity, border irregularity, and roundness can be subjective. We applied multiple educational interventions and studied their effect on interreader agreement.

Materials and Methods:
Six abdominal imaging faculty, one fellow, two residents, and three medical students at an NCI-designated cancer center with NAPRC accreditation participated. Participants completed four surveys, each containing a set of 25 lymph node images from initial staging rectal cancer protocol MRI scans. Assessments were recorded using REDCap surveys. The first survey included 25 MRI images of single lymph nodes, each labeled with its short-axis diameter. Readers were asked to determine whether the lymph node was suspicious (N+) or not suspicious (N-) and to report their confidence level. This first survey provided a baseline assessment before any guidance. The second survey included the text description of the Dutch criteria alongside each MRI lymph node image and asked for N+/N- and confidence level. The third survey asked readers to assess the individual malignant morphologic characteristics of each lymph node, in addition to N+/N- and confidence level. The fourth survey had participants review examples of malignant and benign morphologic characteristics before and in between the cases, and then assess N+/N- and the individual morphologic characteristics. Interreader agreement was assessed using Fleiss kappa, and level of confidence was assessed by averaging scores of -1 (no confidence), 0 (somewhat confident), and 1 (very confident).
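As an illustration of the agreement statistic named above, the following is a minimal sketch of the standard Fleiss kappa calculation for a fixed number of readers per image, together with the confidence averaging. It is not the authors' analysis code: the function names `fleiss_kappa` and `mean_confidence` and the toy ratings are hypothetical, and the confidence intervals reported in this abstract are not reproduced here.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss kappa for a list of items, each a list with one label per reader.

    Assumes every item was rated by the same number of readers (at least two).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    category_totals = Counter()
    per_item_agreement = []
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Fraction of ordered reader pairs that agree on this item.
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        per_item_agreement.append(agreeing_pairs / (n_raters * (n_raters - 1)))

    # Mean observed agreement across items.
    p_observed = sum(per_item_agreement) / n_items
    # Agreement expected by chance from the overall category proportions.
    total = n_items * n_raters
    p_expected = sum((c / total) ** 2 for c in category_totals.values())
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_observed - p_expected) / (1 - p_expected)


def mean_confidence(scores):
    """Average of confidence scores coded -1, 0, or 1."""
    return sum(scores) / len(scores)


# Hypothetical toy data: 3 lymph node images, 4 readers each.
toy_ratings = [
    ["N+", "N+", "N+", "N-"],
    ["N-", "N-", "N-", "N-"],
    ["N+", "N-", "N+", "N-"],
]
print(fleiss_kappa(toy_ratings))
print(mean_confidence([1, 0, -1, 1]))
```

In practice a vetted statistics package would normally be preferred over a hand-written routine; the sketch only makes the observed-versus-chance structure of the statistic explicit.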

Results:
In the baseline assessment, the interreader agreement among all participants (n = 12) was fair, Fleiss kappa 0.32 (CI 0.27 - 0.37). Interreader agreement increased as the level of training increased. When only faculty, fellow, and residents were included (n = 9), Fleiss kappa was 0.50 (CI 0.44 - 0.57). Faculty alone (n = 6) demonstrated the highest baseline agreement, Fleiss kappa 0.57 (CI 0.46 - 0.68). Across the four surveys, the overall Fleiss kappa increased slightly compared to baseline when the description of the Dutch criteria was provided in survey 2, Fleiss kappa 0.34 (CI 0.29 - 0.40). When participants were asked to assess individual morphologic characteristics with a text guide only in survey 3, the interreader agreement for N+/N- decreased to a Fleiss kappa of 0.23 (CI 0.17 - 0.29). When examples of malignant and benign characteristics were provided in survey 4, the agreement was highest, Fleiss kappa 0.39 (CI 0.33 - 0.44).
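The abstract does not say which scale was used to call the baseline agreement "fair". One common convention, assumed here, is the Landis and Koch banding. The sketch below applies those bands to the kappa values reported above; `agreement_label` and the dictionary keys are hypothetical names used only for illustration.

```python
def agreement_label(kappa):
    """Map a kappa value to a Landis and Koch band (assumed convention)."""
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Point estimates as reported in this abstract.
reported = {
    "survey 1, all participants": 0.32,
    "survey 1, faculty/fellow/residents": 0.50,
    "survey 1, faculty only": 0.57,
    "survey 2, all participants": 0.34,
    "survey 3, all participants": 0.23,
    "survey 4, all participants": 0.39,
}
for name, kappa in reported.items():
    print(f"{name}: {kappa:.2f} ({agreement_label(kappa)})")
```

Under this assumed banding, all four whole-group values fall in the same "fair" band, so the differences between surveys are better read from the kappa values and their intervals than from the labels.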

Conclusion:
At baseline, interreader agreement for initial lymph node staging on rectal cancer MRI was highest among radiologists with more experience. Overall interreader agreement was highest when imaging examples of benign and malignant characteristics were provided. Our study suggests that adding specific imaging examples to standardized templates may help achieve the highest level of interreader consistency.