ARRS 2022 Abstracts

1238. Harnessing Artificial Intelligence to Augment Meaningful Peer Review: Missed Liver Lesions on Computed Tomography (CT) Pulmonary Angiography
Authors (* denotes presenting author)
  1. Sarah Thomas; Duke University
  2. Tyler Fraum *; Mallinckrodt Institute of Radiology
  3. Lawrence Ngo; CoRead AI
  4. Mustafa Bashir; Duke University
  5. Benjamin Wildman-Tobriner; Duke University
Objective:
Traditional random peer review in radiology has several shortcomings, including underreporting of errors and a low frequency of errors, so that large numbers of studies must be reviewed to find them. Recent proposals for ‘more meaningful’ peer review have focused on peer learning and nonrandom case selection. The purpose of this study was to use artificial intelligence (AI) to facilitate peer review for the detection of suspicious liver lesions (SLLs) on CT pulmonary angiography (CTPA).

Materials and Methods:
This retrospective study included consecutive CTPAs performed on adult patients during a 1-month period at hospitals serviced by a large, multisite teleradiology firm. Proprietary visual classification (VC) software was used to evaluate images to detect SLLs. The criteria for SLLs were adopted from “Management of Incidental Liver Lesions on CT: A White Paper of the ACR Incidental Findings Committee” (> 1 cm; > 20 Hounsfield units). Each study was labeled by the VC as either positive (VC+) or negative (VC-) for SLL. Separately, a natural language processing (NLP) algorithm evaluated each radiology report and assigned a label of positive (NLP+; contains a description of an SLL) or negative (NLP-). The VC and NLP assessments were then used to identify SLLs that may have been missed during initial clinical interpretation (false negatives). CTPA studies classified as VC+ and NLP- underwent review by three fellowship-trained abdominal radiologists. First, a few select images classified as VC+ were assessed by a single reviewer and determined to be definitely negative (e.g., gallbladder, colon, other normal structures) or not definitely negative. Then, for cases considered not definitely negative, reviewers assessed full CTPA image stacks and classified them as positive or negative for SLLs based on 2/3 consensus. The number of VC+/NLP- cases, the number of initial images needing radiologist review, and the number of cases of missed SLLs were recorded. Interobserver agreement for SLLs was calculated for the radiologist readers.
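
The case-selection logic described above can be illustrated with a minimal Python sketch. It is not the proprietary VC software or the NLP algorithm used in the study; the `Study` record, its field names, and both helper functions are hypothetical and only show how the two labels combine to flag potential misses and how a 2/3 reader consensus could be applied.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Study:
    # All field names are hypothetical, chosen for illustration only.
    study_id: str
    vc_positive: bool   # image classifier flagged a suspicious liver lesion (VC+)
    nlp_positive: bool  # report text describes a suspicious liver lesion (NLP+)


def flag_potential_misses(studies: Sequence[Study]) -> List[Study]:
    """Return studies labeled VC+ and NLP-: a lesion seen on images but absent from the report."""
    return [s for s in studies if s.vc_positive and not s.nlp_positive]


def consensus_positive(reader_votes: Sequence[bool]) -> bool:
    """Apply a 2/3 consensus rule to the votes of exactly three readers."""
    if len(reader_votes) != 3:
        raise ValueError("expected votes from exactly three readers")
    return sum(reader_votes) >= 2


# Toy data, not taken from the study.
toy_studies = [
    Study("A", vc_positive=True, nlp_positive=False),   # potential miss
    Study("B", vc_positive=True, nlp_positive=True),    # lesion was reported
    Study("C", vc_positive=False, nlp_positive=False),  # nothing flagged
]
flagged_ids = [s.study_id for s in flag_potential_misses(toy_studies)]
```

In this sketch only study "A" is sent to radiologist review; a flagged case would then count as a missed SLL only if at least two of the three readers call it positive.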

Results:
In total, 2,573 CTPA examinations were assessed, and 136 were classified as potentially containing missed SLLs (VC+/NLP-). After radiologist review, 13 cases with missed SLLs were confirmed, representing 0.5% (13/2,573) of CTs. With AI, the ratio of CTs requiring review to missed SLLs identified was 11:1; without AI, the ratio was 198:1. The ratio of VC+ images needing radiologist review to missed SLLs identified was 19:1. Among the 136 cases reviewed by radiologists, interobserver agreement for SLLs was near perfect (kappa = 0.92).
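
The reported rate and review ratios follow from the three counts given above, as the short Python check below shows. The variable names are illustrative. The with-AI ratio works out to about 10.5, so the reported 11:1 appears to reflect rounding up, which is an assumption here. The 19:1 image-level ratio is not recomputed because the abstract does not state the number of VC+ images reviewed.

```python
import math

total_cts = 2573      # CTPA examinations assessed
flagged_cases = 136   # VC+/NLP- cases sent to radiologist review
missed_slls = 13      # confirmed missed suspicious liver lesions

# Share of all CTs with a confirmed missed SLL, in percent.
miss_rate_pct = 100 * missed_slls / total_cts

# CTs a radiologist must review per missed SLL found, with and without AI triage.
reviews_per_miss_with_ai = flagged_cases / missed_slls
reviews_per_miss_without_ai = total_cts / missed_slls

print(f"miss rate: {miss_rate_pct:.1f}%")
print(f"with AI: {reviews_per_miss_with_ai:.1f} CTs per missed SLL")
print(f"without AI: {reviews_per_miss_without_ai:.1f} CTs per missed SLL")
```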

Conclusion:
Artificial intelligence can enable meaningful peer review by rapidly assessing thousands of examinations to identify potentially clinically significant errors. Although radiologist involvement is still necessary, the amount of effort required after initial AI screening is dramatically reduced, and performance of these techniques can inform future quality improvement initiatives and bolster the numbers of cases available for peer learning.