Leveraging Large Language Models to Identify Lung Cancer Patients with Unregistered World Trade Center Disaster Exposure
Conducted in the United States using large language models, this study identifies a subpopulation of patients who developed lung cancer after exposure to the World Trade Center disaster without that exposure having been registered, then examines how this exposure influenced the course of the disease
Objective: Leveraging large language models (LLMs), we identified a subpopulation of individuals with World Trade Center (WTC)-related exposure and lung cancer who were not previously captured in registries, and assessed how this exposure affected patients' disease course.
Methods: Associations between exposure type and smoking history, cancer stage, mutation status, disease progression, and survival were statistically analyzed.
Results: The highest proportion of never-smokers was observed among residents, compared with first responders and commuters (19% and 24%, respectively; p = 0.005). Residents had more than twice the risk of disease progression (HR = 2.14, p = 0.008) and an elevated risk of death (HR = 2.43, p = 0.03). Among the mutations examined, only EGFR mutations were significantly associated with exposure type (p = 0.01).
Conclusions: This work highlights that LLMs can capture a larger population of WTC survivors, including genetics.
Journal of Occupational and Environmental Medicine, abstract, 2026