Treatments
Resources and infrastructure

Machine learning reveals country-specific drivers of global cancer outcomes

Using GLOBOCAN 2022 data on cancer mortality-to-incidence ratios (MIRs) for 185 countries, together with health system indicators, this study demonstrates the value of a machine learning algorithm for accurately predicting national MIRs, quantifying the factors that contribute to them, and prioritizing investments.

Background: Global inequities in access to cancer diagnostics and treatment contribute to wide variation in cancer mortality-to-incidence ratios (MIRs), a proxy for survival. We aimed to develop an interpretable machine learning framework to quantify country-specific health system contributors to MIR and inform policy prioritization.

Materials and methods: We assembled national MIRs from GLOBOCAN 2022 for 185 countries and health system indicators from multilateral sources, including gross domestic product (GDP) per capita, universal health coverage (UHC) index, radiotherapy centers per population, health spending (% of GDP), out-of-pocket expenditure, workforce densities (physicians; nurses/midwives; surgical workforce), pathology availability, Human Development Index, and Gender Inequality Index. A CatBoost gradient-boosting model was trained with repeated leave-one-country-out cross-validation (10 repeats; 1850 predictions), with nested hyperparameter optimization and strict leakage control. For interpretability, SHapley Additive exPlanations (SHAP; TreeExplainer) were used to generate global and country-level feature attributions; SHAP values are model-derived metrics quantifying each factor's contribution to cancer outcomes. Performance metrics included R², root mean squared error (RMSE), mean absolute error, and Pearson correlation; uncertainty was estimated by bootstrap resampling.
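The evaluation scheme described above (hold out one country, fit on the rest, score the pooled out-of-sample predictions, then bootstrap the score) can be illustrated with a minimal sketch. This is not the authors' pipeline: a one-predictor least-squares line stands in for the CatBoost model, the ten data points are invented toy values, and all function names are hypothetical. Only the Python standard library is used.

```python
import math
import random

# Hypothetical toy data: one scaled predictor and an MIR-like outcome per unit.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
Y = [0.85, 0.80, 0.70, 0.68, 0.60, 0.52, 0.50, 0.40, 0.38, 0.30]

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (simple stand-in for a boosted model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def leave_one_out_predictions(xs, ys):
    """Predict each unit from a model fitted on all other units."""
    preds = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a + b * xs[i])
    return preds

def r2(obs, pred):
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - ss_res / ss_tot

def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def pearson(obs, pred):
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    sop = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    soo = sum((o - mo) ** 2 for o in obs)
    spp = sum((p - mp) ** 2 for p in pred)
    return sop / math.sqrt(soo * spp)

def bootstrap_ci(obs, pred, metric, n_boot=2000, alpha=0.05, seed=0):
    """Percentile interval from resampling (observed, predicted) pairs."""
    rng = random.Random(seed)
    n = len(obs)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([obs[i] for i in idx], [pred[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]

preds = leave_one_out_predictions(X, Y)
rmse_lo, rmse_hi = bootstrap_ci(Y, preds, rmse)
print(f"R2={r2(Y, preds):.3f}  RMSE={rmse(Y, preds):.3f} "
      f"(95% CI {rmse_lo:.3f}-{rmse_hi:.3f})  MAE={mae(Y, preds):.3f}  "
      f"r={pearson(Y, preds):.3f}")
```

Because the stand-in model is deterministic, repeating the loop would give identical predictions. With a stochastic learner such as gradient boosting, each repeat would presumably use a different random seed, which is how 185 countries and 10 repeats yield 1850 predictions.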

Results: The model showed strong out-of-sample performance [R² = 0.852, 95% confidence interval (CI) 0.801-0.891; RMSE 0.057, 95% CI 0.050-0.064]; correlation between predicted and observed MIRs was r = 0.923 (P = 8.30 × 10⁻⁷⁸). Global SHAP contributions ranked GDP per capita (22.5%), radiotherapy centers per population (15.4%), and UHC index (12.9%) as the leading determinants. Country-specific SHAP profiles revealed substantial heterogeneity in dominant drivers across settings, enabling tailored policy levers (e.g. infrastructure, coverage expansion, or financial protection). An accompanying web interface provides country-level SHAP summaries for decision support.
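To make the link between country-level attributions and global percentage contributions concrete, here is a small sketch under stated assumptions. It computes exact Shapley values by brute force for a three-feature toy function, replacing "absent" features with a baseline (the mean profile), and then takes each feature's share of the mean absolute attribution. The study itself used SHAP's TreeExplainer on a CatBoost model; the toy model, feature names, country profiles, and the normalization into percentages are all hypothetical illustrations, and the abstract does not state how its percentages were derived.

```python
from itertools import combinations
from math import factorial

FEATURES = ["gdp_per_capita", "radiotherapy_density", "uhc_index"]  # hypothetical

def toy_model(x):
    """Hypothetical stand-in for a fitted MIR model (inputs scaled to 0-1)."""
    gdp, rt, uhc = x
    return 0.9 - 0.3 * gdp - 0.2 * rt - 0.1 * uhc - 0.1 * gdp * uhc

def shapley_values(model, x, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    n = len(x)

    def value(subset):
        return model([x[i] if i in subset else baseline[i] for i in range(n)])

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # k = size of the coalition that feature i joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

# Invented profiles for three fictional countries.
profiles = {
    "Country A": [0.2, 0.1, 0.3],
    "Country B": [0.8, 0.7, 0.9],
    "Country C": [0.5, 0.2, 0.6],
}
n_features = len(FEATURES)
baseline = [sum(p[j] for p in profiles.values()) / len(profiles)
            for j in range(n_features)]

attributions = {name: shapley_values(toy_model, x, baseline)
                for name, x in profiles.items()}

# Global importance: each feature's share of the mean absolute attribution.
mean_abs = [sum(abs(phi[j]) for phi in attributions.values()) / len(attributions)
            for j in range(n_features)]
shares = {f: 100 * m / sum(mean_abs) for f, m in zip(FEATURES, mean_abs)}

for name, phi in attributions.items():
    dominant = max(range(n_features), key=lambda j: abs(phi[j]))
    print(name, [round(v, 4) for v in phi], "dominant:", FEATURES[dominant])
print({f: round(s, 1) for f, s in shares.items()})
```

For each country the attributions sum to the difference between its prediction and the baseline prediction, which is the sense in which a prediction is "decomposed". Brute-force enumeration is only practical for a handful of features; tree-specific algorithms exist precisely to avoid it.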

Conclusions: An explainable machine learning approach accurately predicts national MIRs and decomposes predictions into country-specific health system attributions. While ecological and noncausal by design, the SHAP profiles translate population-level associations into actionable hypotheses for prioritizing investments. Across many contexts, they highlight radiotherapy capacity and UHC expansion as recurrent levers and underscore that higher total health spending alone may be insufficient without strategic allocation. Prospective, country-specific evaluations are warranted to test whether targeting model-identified drivers improves cancer outcomes.

Annals of Oncology, open-access article, 2026
