Evidence-Based Supplement Research

Human-AI teaming to improve accuracy and efficiency of eligibility criteria prescreening for oncology trials: a randomized evaluation trial using retrospective electronic health records.

  • 2026-02-03
  • Nature Communications 17(1)
    • Ravi B Parikh
    • Likhitha Kolla
    • Elizabeth A Beothy
    • William J Ferrell
    • Brenda Laventure
    • Matthew Guido
    • Anthony Girard
    • Yang Li
    • Khaled Essam Mahmoud Dosoky
    • Karim Tarabishy
    • Parth S Patel
    • Ayana Andalcio
    • Kristin Maloney
    • Jose Ulises Mena
    • Wael Salloum
    • Jinbo Chen
    • Ezekiel J Emanuel

Study Design

Type
Clinical Trial
Sample size
n = 355
Population
355 patients with non-small cell lung or colorectal cancer
Methods
Randomized noninferiority trial using retrospectively collected clinical charts, comparing prescreening by trained research staff alone vs. staff augmented with a pre-trained language model
  • Large Human Trial
  • Rigorous Journal
Few adult patients with cancer enroll in oncology clinical trials. A rate-limiting step to trial enrollment is prescreening, in which clinical research staff manually abstract unstructured health records to identify patients who meet eligibility criteria. Prescreening is time-consuming, labor-intensive, and prone to human error, resulting in under-identification of eligible patients. Neurosymbolic AI language models may approximate or improve the accuracy of prescreening through automated abstraction of enrollment criteria from longitudinal unstructured patient charts. We conduct a randomized noninferiority trial using retrospectively collected clinical charts to compare the accuracy and efficiency of prescreening by trained research staff alone (Human-alone) vs. staff augmented with a pre-trained language model (Human+AI), among a cohort of 355 patients with non-small cell lung or colorectal cancer. Sample size is determined from analyses of a preliminary dataset as well as a prespecified interim dataset of 74 charts. For chart-level accuracy, the primary endpoint, Human+AI prescreening is both noninferior and superior to Human-alone (76.5% vs. 71.1%). However, efficiency is unchanged: average time per chart review, the secondary endpoint, is similar (37.4 vs. 37.8 min). AI-assisted abstraction most improves accuracy for biomarker, staging, and response criteria. Performance is limited in some domains due to automation bias. Although improvements are modest, this large randomized trial evaluating a human-AI framework for oncology prescreening shows that AI language models can approximate and augment human-driven prescreening to enhance identification of trial-eligible patients, potentially increasing enrollment. The trial is registered on ClinicalTrials.gov (NCT06561217).

