Ancestry-Associated Performance Variability of Open-Source AI Models for EGFR Prediction in Lung Cancer
Based on sequencing data from 2098 patients with lung cancer and on hematoxylin-eosin–stained histology slide images of tumor samples (mean age, 66.6 years; 63% female), this study evaluates the performance of two open-source artificial intelligence tools for predicting the mutation status of the EGFR receptor.
Importance : Artificial intelligence (AI) models are emerging as rapid, low-cost tools for predicting targetable genomic alterations directly from routine pathology slides. Although these approaches could accelerate treatment decisions in lung cancer, little is known about whether their performance is consistent across diverse patient populations and tissue contexts.
Objective : To evaluate the performance and generalizability of 2 open-source AI pathology models for predicting EGFR mutation status in lung adenocarcinoma (LUAD) across independent cohorts and ancestral subgroups.
Design, Setting, and Participants : This cohort study included patients with LUAD from 2 cohorts: Dana-Farber Cancer Institute (DFCI) from June 2013 to November 2023, and a European-based trial (TNM-I) from August 2016 to February 2022. All patients had paired next-generation sequencing data and hematoxylin-eosin–stained whole-slide images. In the DFCI cohort, genetic ancestry was inferred using germline genotype data. Data analyses were performed from July 2025 to September 2025.
Main Outcomes : The primary outcome was model performance for predicting EGFR mutation status, measured as the area under the receiver operating characteristic curve (AUC), evaluated overall and across ancestry subgroups and sample types.
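As an illustration of the outcome measure, the following minimal sketch computes an overall AUC and a per-subgroup AUC from binary mutation labels and model scores. It is not the study's analysis code: the function names (`auc`, `auc_by_group`), the group labels, and the toy data are all hypothetical, and the confidence-interval estimation reported in the abstract is not reproduced here because the abstract does not state the method used.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_group(labels, scores, groups):
    """Compute the AUC separately within each subgroup label."""
    by_group = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    return {g: auc(ys, ss) for g, (ys, ss) in by_group.items()}


# Toy data (hypothetical): 1 = EGFR-mutant, 0 = wild type.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.4, 0.3, 0.6, 0.7, 0.1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall = auc(labels, scores)
per_group = auc_by_group(labels, scores, groups)
print(overall, per_group)
```

In this toy example the pooled AUC sits between the two subgroup values, which shows why a single overall figure can hide weaker discrimination in a smaller subgroup.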
Results : Overall, 2098 patients with LUAD were included (mean [SD] age, 66.6 [10.3] years; 1315 female individuals [63%] and 783 male individuals [37%]). In the DFCI cohort (n = 1759; 54 African, 101 American, 95 Asian, 1465 European), EGFR mutations were detected in 432 patients (25%). One AI-pathology model achieved an AUC of 0.83 (95% CI, 0.81-0.85) compared with 0.68 (95% CI, 0.65-0.70) for the other model. In the TNM-I cohort (n = 339), EGFR mutations were detected in 50 patients (15%), with AUCs of 0.81 (95% CI, 0.74-0.88) and 0.75 (95% CI, 0.68-0.83), respectively. In ancestry-stratified analyses of the DFCI cohort, AUCs for the higher-performing model were 0.84 (95% CI, 0.81-0.86) in patients of European ancestry, 0.85 (95% CI, 0.72-0.94) in African ancestry, and 0.68 (95% CI, 0.55-0.78) in Asian ancestry. In sample type analyses, performance declined in pleural (AUC, 0.66; 95% CI, 0.56-0.76) compared with lung specimens (AUC, 0.86; 95% CI, 0.83-0.88). AI-guided triage analyses showed a potential 57% reduction in rapid EGFR testing, while maintaining sensitivity of 0.84 and specificity of 0.99.
Conclusions : This cohort study found that AI-based pathology tools may serve as preliminary adjuncts for EGFR prediction in lung cancer, though performance differences by ancestry warrant careful interpretation.
JAMA Oncology, open-access article, 2026