ARRS 2022 Abstracts


1964. Extracting Actionable Findings From Unstructured Reports Using an Advanced Natural Language Processing Model
Authors (* denotes presenting author)
  1. Ali Tejani *; University of Texas Southwestern Medical Center
  2. Khadyonath Nanneboyina; University of Texas Southwestern Medical Center
  3. Yin Xi; University of Texas Southwestern Medical Center
  4. Kiran Batra; University of Texas Southwestern Medical Center
  5. Travis Browning; University of Texas Southwestern Medical Center
  6. Ronald Peshock; University of Texas Southwestern Medical Center
  7. Jesse Rayan; University of Texas Southwestern Medical Center
Objective:
Structured reports provide standardized reporting language and are preferred by referring clinicians. However, for a variety of reasons, not all radiologists use them. Automated data curation from unstructured reports using advanced natural language processing (NLP) models can address this discrepancy. Bidirectional Encoder Representations from Transformers (BERT) is a recently developed deep language representation model that has been shown to outperform traditional methods on NLP tasks, potentially offering a means of rapid and accurate extraction of actionable findings from unstructured reports. The purpose of this study was to design a pre-trained BERT model, further trained on structured chest computed tomography (CT) reports, and to assess its effectiveness in classifying unstructured reports.

Materials and Methods:
This is a retrospective, cross-sectional study of chest CT exams performed for pulmonary embolism (PE) detection from 8/31/2019 to 2/28/2020 at a large academic institution and an affiliated county hospital. "Impression" fields from structured reports were used as the reference standard regarding PE (positive, negative, and negative with limitation). A pre-trained BERT model (PubMedBERT) was fine-tuned on 80% of the structured reports and validated on the remaining 20% using cross-entropy loss at each epoch. The validated model was then tested on a set of unstructured reports not previously used in training or validation. Multi-class receiver operating characteristic (ROC) curve analysis was performed. Class-specific ("one versus rest") areas under the ROC curve (AUC) and micro- and macro-averaged AUC were reported with bootstrap confidence intervals. Micro-average ROC/AUC was calculated by stacking all groups together, thus converting the multi-class classification into a binary classification. Macro-average ROC/AUC was calculated by averaging all group results ("one versus rest"), with linear interpolation between points of the ROC curves.
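
The one-versus-rest, micro-average, and macro-average AUC calculations described above can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the function names, the class indices (0 = positive, 1 = negative, 2 = negative with limitation), the toy probabilities, and the percentile bootstrap are all assumptions made for illustration. The macro-average here is taken as the mean of the class-specific AUCs, which is one common way to compute it.

```python
import random


def roc_auc(labels, scores):
    """Binary AUC: trapezoidal area under the empirical ROC curve.

    Tied scores are processed as one group, so ties contribute a
    diagonal segment (linear interpolation between ROC points).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    auc, tp, fp = 0.0, 0, 0
    prev_tpr = prev_fpr = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp += pairs[j][1]
            fp += 1 - pairs[j][1]
            j += 1
        tpr, fpr = tp / pos, fp / neg
        auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2
        prev_tpr, prev_fpr = tpr, fpr
        i = j
    return auc


def one_vs_rest_auc(y_true, probs, k):
    """AUC for class k against all other classes combined."""
    return roc_auc([int(y == k) for y in y_true], [p[k] for p in probs])


def macro_auc(y_true, probs, n_classes):
    """Unweighted mean of the class-specific one-versus-rest AUCs."""
    return sum(one_vs_rest_auc(y_true, probs, k)
               for k in range(n_classes)) / n_classes


def micro_auc(y_true, probs, n_classes):
    """Stack every (report, class) pair into one binary problem."""
    labels = [int(y == k) for y in y_true for k in range(n_classes)]
    scores = [p[k] for p in probs for k in range(n_classes)]
    return roc_auc(labels, scores)


def bootstrap_ci(metric, y_true, probs, n_classes,
                 n_boot=1000, alpha=0.05, seed=0):
    """Approximate percentile bootstrap interval, resampling reports."""
    if len(set(y_true)) < n_classes:
        raise ValueError("every class must appear in y_true")
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < n_classes:
            continue  # resample is missing a class; draw again
        stats.append(metric(ys, [probs[i] for i in idx], n_classes))
    stats.sort()
    lo = stats[int(alpha / 2 * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Toy data (made up): true class index and predicted probabilities.
y_true = [0, 0, 1, 1, 2, 2]
probs = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.3, 0.6],
    [0.2, 0.2, 0.6],
]

print("macro AUC:", macro_auc(y_true, probs, 3))
print("micro AUC:", micro_auc(y_true, probs, 3))
print("macro 95% CI:", bootstrap_ci(macro_auc, y_true, probs, 3, n_boot=200))
```

In practice, a library routine such as scikit-learn's `roc_auc_score` would likely be used instead; the hand-written version only makes the stacking and averaging steps explicit.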

Results:
A total of 3894 exams were retrieved from the data warehouse, of which 1264 were used in this study; 613 (48%) were unstructured reports. On the unstructured reports, class-specific AUC was 0.96 (0.95 - 0.98) for "positive PE," 0.86 (0.83 - 0.91) for "negative PE," and 0.87 (0.84 - 0.90) for "negative PE with limitation." Both macro- and micro-average AUC were 0.90 (0.88 - 0.92). Further investigation showed that of the 32 "negative" or "negative with limitation" cases that BERT classified as "positive," 28 included the wording "chronic PE" or represented a follow-up exam after a prior PE.
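
A review of false positives like the one described above could be sketched as follows. This is a hypothetical illustration only: the record fields, the example impressions, and the regular expression are assumptions, not the authors' method, and the sketch covers only the wording check, not the identification of follow-up exams.

```python
import re

# Assumed pattern for "chronic PE" wording; real reports may phrase it differently.
CHRONIC_PE = re.compile(
    r"\bchronic\s+(?:pe|pulmonary\s+embol(?:ism|us|i))\b", re.IGNORECASE
)


def false_positives(records):
    """Reports predicted 'positive' whose reference label is not 'positive'."""
    return [
        r for r in records
        if r["predicted"] == "positive" and r["reference"] != "positive"
    ]


def mentions_chronic_pe(text):
    """True if the text contains the assumed 'chronic PE' wording."""
    return CHRONIC_PE.search(text) is not None


# Made-up example records (not study data).
records = [
    {"reference": "negative", "predicted": "positive",
     "impression": "No acute PE. Stable chronic PE in the right lower lobe."},
    {"reference": "negative with limitation", "predicted": "positive",
     "impression": "Limited by motion. No central pulmonary embolism."},
    {"reference": "positive", "predicted": "positive",
     "impression": "Acute pulmonary embolism in the left main pulmonary artery."},
    {"reference": "negative", "predicted": "negative",
     "impression": "No pulmonary embolism."},
]

fps = false_positives(records)
flagged = [r for r in fps if mentions_chronic_pe(r["impression"])]
print(f"{len(flagged)} of {len(fps)} false positives mention chronic PE")
```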

Conclusion:
An advanced NLP model, BERT, can accurately classify unstructured reports after training on components of structured reports, even with relatively small datasets. Using such a model could provide a tool to extract key data from unstructured reports with minimal effort while allowing radiologists to continue reporting in their preferred format. Future studies should also focus on discriminating between acute and chronic PE.