E5368. Trends in Stroke Research: An Examination of Publication Patterns Using Topic Modeling
  1. Burak Ozkara; MD Anderson Cancer Center
  2. Mert Karabacak; Mount Sinai Health System
  3. Samir Dagher; MD Anderson Cancer Center
  4. Konstantinos Margetis; Mount Sinai Health System
  5. Max Wintermark; MD Anderson Cancer Center
  6. Vivek Yedavalli; Johns Hopkins Hospital
The rapid growth in scholarly publications presents a hurdle for researchers, given the considerable time required to compile and interpret the findings. Natural language processing, and topic modeling in particular, can accelerate the synthesis of academic literature. Topic modeling offers a method for discovering hidden themes within large and diverse bodies of literature, giving researchers insight into a research field. It also helps identify significant trends, emerging interests, and declining areas of research. This exhibit aims to demonstrate the potential of these tools using stroke research, a significant global health issue, as the example.

Materials and Methods:
Articles for our study were obtained from the journals Stroke, International Journal of Stroke, European Stroke Journal, Translational Stroke Research, and Journal of Stroke and Cerebrovascular Diseases. Articles were retrieved from the Scopus database on May 11, 2023, using the Source-ID fields. Only the "Article" and "Review" document types were included. We employed BERTopic, a topic modeling technique that clusters documents using bidirectional encoder representations from transformers (BERT) embeddings and derives topic representations with class-based term frequency-inverse document frequency (c-TF-IDF). After training, the model generated a collection of topics and representative documents. The authors analyzed the keywords and representative documents of each topic and labeled the topics by consensus. Word clouds were also generated to give a brief overview of each topic's key terms. A trend analysis for the current decade was conducted using linear regression models. All computational analysis was performed using Python 3.1.
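The c-TF-IDF weighting mentioned above can be illustrated with a minimal standard-library sketch. It is not the authors' code and it covers only the term-weighting step, not the embedding or clustering stages. It assumes the weighting described in the BERTopic documentation: the frequency of a term within a topic, multiplied by log(1 + A / f), where A is the average number of words per topic and f is the frequency of the term across all topics. The function names and the toy documents are hypothetical.

```python
import math
from collections import Counter


def c_tf_idf(docs_by_topic):
    """Return {topic: {term: weight}} using class-based TF-IDF.

    docs_by_topic maps a topic label to a list of tokenized documents.
    """
    # Merge all documents assigned to a topic into one bag of words.
    class_counts = {
        topic: Counter(tok for doc in docs for tok in doc)
        for topic, docs in docs_by_topic.items()
    }
    # Frequency of each term across all topics.
    total_counts = Counter()
    for counts in class_counts.values():
        total_counts.update(counts)
    # Average number of words per topic (the "A" in the formula).
    avg_words = sum(total_counts.values()) / len(class_counts)

    weights = {}
    for topic, counts in class_counts.items():
        n_words = sum(counts.values())
        weights[topic] = {
            # Normalized in-topic frequency times the class-based IDF term.
            term: (freq / n_words) * math.log(1 + avg_words / total_counts[term])
            for term, freq in counts.items()
        }
    return weights


def top_terms(weights, topic, k=3):
    """Return the k highest-weighted terms of a topic (ties broken alphabetically)."""
    ranked = sorted(weights[topic].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy corpus, already grouped by topic and tokenized.
toy = {
    "reperfusion": [["stroke", "thrombectomy", "reperfusion"],
                    ["stroke", "thrombectomy"]],
    "rehabilitation": [["stroke", "rehabilitation", "motor"],
                       ["stroke", "rehabilitation"]],
}
weights = c_tf_idf(toy)
print(top_terms(weights, "reperfusion"))
```

In this toy corpus, "stroke" appears equally often in both topics, so it is weighted below the topic-specific terms. This is the behavior that makes the top-weighted keywords useful for labeling topics.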

Initially, 35,779 documents were collected. After the inclusion criteria were applied, 26,732 documents remained. Of these, 24,849 were assigned to 30 distinct topics. The remaining 1,883 documents were treated as outliers because they did not fit any topic. The 10 most prevalent topics were animal models, rehabilitation, reperfusion therapy, small vessel disease, cerebral blood flow, intracranial aneurysms, cervical artery dissection, intracerebral hemorrhage, biomarkers, and cerebral venous thrombosis. Linear regression models provided an overview of trends in the current decade. Emboli, medullary and cerebellar infarcts, and glucose metabolism were the three topics with the strongest upward trends, whereas cerebral venous thrombosis, statins, and intracerebral hemorrhage were the three topics with the greatest comparative decline.
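A regression-based trend ranking of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: it fits an ordinary least-squares line to yearly document counts per topic and ranks topics by slope. The abstract does not state whether raw counts or proportions were modeled. The function names, topic labels, and counts are hypothetical.

```python
def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def rank_topics_by_trend(yearly_counts):
    """Return (topic, slope) pairs sorted from steepest rise to steepest decline.

    yearly_counts maps a topic label to a {year: document count} dictionary.
    """
    slopes = {}
    for topic, by_year in yearly_counts.items():
        years = sorted(by_year)
        slopes[topic] = ols_slope(years, [by_year[y] for y in years])
    return sorted(slopes.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical yearly counts for three placeholder topics.
example = {
    "topic_a": {2020: 10, 2021: 14, 2022: 18},
    "topic_b": {2020: 20, 2021: 20, 2022: 20},
    "topic_c": {2020: 30, 2021: 25, 2022: 20},
}
ranking = rank_topics_by_trend(example)
for topic, slope in ranking:
    print(topic, slope)
```

A positive slope marks a topic gaining publications per year and a negative slope marks a declining one. Regressing on each topic's share of all publications instead of raw counts would control for overall growth in output.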

Our BERTopic-driven study provides an overview of stroke research, highlighting prevalent topics and evolving research interests. The topics identified span a wide range of research fields, from animal models to environmental factors. Furthermore, our trend analysis identified emboli, medullary and cerebellar infarcts, and glucose metabolism as emerging topics and indicated a comparative decline in areas such as cerebral venous thrombosis, statins, and intracerebral hemorrhage. Overall, this study highlights the dynamic nature of stroke research.