E2203. An International “Datathon” Model as a Novel Means to Harness the Potential of Artificial Intelligence: A Cross-Institutional Experience
  1. Juncheng Huang; National University Health System
  2. Judy Gichoya; Emory University School of Medicine
  3. Meng Ling Feng; National University of Singapore
  4. Leo Celi; Harvard Medical School; Massachusetts Institute of Technology
AI has the potential to revolutionize radiology practice. Key challenges in creating and applying AI research are ensuring algorithm robustness; sourcing datasets large enough to train a robust model; securing the effort and expertise needed to curate data; and balancing patient privacy protection with deployment. Several radiology societies (SIIM, ACR, and RSNA) host AI challenges with hundreds of participants while also promoting education, e.g., the National Imaging Informatics Course-Radiology (NIIC-RAD) and the AI Editorial Experience (from RSNA’s AI trainee editorial board). This abstract describes another capacity-building initiative: an international “datathon” model (a hackathon model applied to data analytics) run through the MIT Critical Data initiative.

Materials and Methods:
We retrospectively reviewed datathons organized by the MIT Critical Data team since 2014, with a focused review of the annual Singapore Healthcare AI and Datathon Expo, which has a heavy radiology AI presence. Participants included radiologists, computer scientists, public health experts, and statisticians. Real clinical datasets modeled on the MIMIC dataset from Beth Israel, together with local datasets curated over years of events, were used. The Singapore event comprised 5 days of workshops followed by a 2-day datathon (coding and presentations). Participants from six countries (Singapore, Japan, Korea, India, Thailand, and China) have taken part annually, with international mentors from leading institutions (e.g., MIT, Mayo Clinic).

Since the first datathon in January 2014, 44 countries have hosted international datathons, with a key focus on ICU tabular data. Radiology AI was introduced in the Singapore datathon, where 153 teams participated from 2019 to 2022 (13 radiology AI projects). Primary datasets included MIMIC-CXR, CheXpert, and the Medical Segmentation Decathlon images. One use case was an AI model for detecting pneumonia with high specificity, which used window normalization and augmentation to boost the robustness of a CNN model and combined clinical and imaging data in a multimodal approach. The projects’ code and new annotations were made public. One challenge was the lack of “machine learning ready” data; most time was spent annotating and preprocessing data. In contrast to nonimaging datasets, no datathon host has made internal imaging data available to participants.
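
The abstract does not describe how the pneumonia team implemented window normalization or augmentation, so the following is only a minimal sketch of what such preprocessing might look like. The function names (`window_normalize`, `random_shift`), the window center and width, and the choice of a small random translation as the augmentation are all illustrative assumptions, not the team's actual method.

```python
import numpy as np


def window_normalize(image, center, width):
    """Clip intensities to a window and rescale them to the range [0, 1].

    `center` and `width` are hypothetical parameters chosen by the user;
    the source abstract does not report the values that were used.
    """
    low = center - width / 2.0
    high = center + width / 2.0
    clipped = np.clip(image.astype(np.float32), low, high)
    return (clipped - low) / (high - low)


def random_shift(image, rng, max_shift=4):
    """Translate a 2D image by up to `max_shift` pixels in each direction.

    The image is edge-padded and then cropped at a random offset, so the
    output has the same shape as the input. This is one simple example of
    augmentation; a real pipeline would likely combine several transforms.
    """
    padded = np.pad(image, max_shift, mode="edge")
    dy, dx = rng.integers(0, 2 * max_shift + 1, size=2)
    height, width = image.shape
    return padded[dy:dy + height, dx:dx + width]


if __name__ == "__main__":
    rng = np.random.default_rng(seed=0)
    # Synthetic stand-in for an image; no real patient data is used here.
    raw = rng.normal(loc=0.0, scale=300.0, size=(64, 64))
    normalized = window_normalize(raw, center=40, width=400)
    augmented = random_shift(normalized, rng, max_shift=4)
    print(augmented.shape, float(augmented.min()), float(augmented.max()))
```

Fixing the intensity window before training gives the network inputs on a consistent scale across scanners and sites, and varying the input slightly on each pass is a common way to reduce overfitting on small annotated datasets.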

The international datathon model enhances current efforts to build radiology AI capacity. Although local data sharing is limited, adopting resources such as the RSNA Medical Imaging and Data Resource Center and the NCI Cancer Research Data Commons fosters collaboration and boosts research quality. The international datathon model addresses key challenges in radiology AI research by improving data annotations, creating prefabricated notebooks, and attracting diverse global talent.