Cardiovascular

Machine learning-extended sPESI for 1-year mortality prediction in pulmonary embolism.

TL;DR

A machine learning (ML) extension of the simplified Pulmonary Embolism Severity Index (sPESI), built as a 10-item hybrid model that adds continuous physiological variables to the six binary sPESI items, significantly improved prognostic performance over the conventional 6-item sPESI at all three mortality horizons (30, 180, and 365 days). XGBoost reached an AUC of 0.796 at 30 days, and all three ML models outperformed sPESI at 12 months.

Key Findings

Conventional sPESI discrimination attenuated progressively over time across three mortality horizons.

  • AUC for conventional sPESI was 0.727 at 30 days, 0.689 at 180 days, and 0.665 at one year.
  • The study used a retrospective cohort of 2547 adults with CT pulmonary angiography-confirmed PE.
  • This attenuation motivated the development of ML extensions to improve long-horizon risk stratification.
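
For readers less familiar with the metric, an AUC can be read as the probability that a randomly chosen patient who died received a higher risk score than a randomly chosen survivor, with ties counted as one half. The sketch below computes that quantity directly in plain Python. The function name `auc_pairwise` and the toy scores are illustrative assumptions and are not data from the study.

```python
from itertools import product


def auc_pairwise(scores, died):
    """AUC as the fraction of (death, survivor) pairs ranked correctly.

    Ties count as half. Quadratic in cohort size, so suitable for
    illustration only.
    """
    pos = [s for s, d in zip(scores, died) if d]      # scores of patients who died
    neg = [s for s, d in zip(scores, died) if not d]  # scores of survivors
    if not pos or not neg:
        raise ValueError("AUC needs at least one death and one survivor")
    wins = 0.0
    for p, n in product(pos, neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy sPESI point totals (0-6) and outcomes; made up for illustration.
toy_scores = [0, 1, 1, 2, 3, 4]
toy_died = [0, 0, 1, 0, 1, 1]
toy_auc = auc_pairwise(toy_scores, toy_died)
```

Because a 0-6 point score produces many ties, a coarse score of this kind can lose discrimination that a continuous model output retains.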

The 10-item ML extension significantly enhanced prognostic performance compared with conventional 6-item sPESI across all time horizons.

  • Comparisons between the ML models and conventional sPESI were statistically significant throughout (all p < 0.05).
  • The 10-item hybrid extension retained continuous age, heart rate, systolic blood pressure, and oxygen saturation alongside the six binary sPESI indicators.
  • Three model types were evaluated: logistic regression (LR), XGBoost, and multi-layer perceptron (MLP).
  • Models were trained and optimized using 12-month all-cause mortality as the target outcome.
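
The 10-item feature set described above might be assembled as in the following minimal sketch. It assumes the standard published sPESI cut-offs (age > 80 years, history of cancer, chronic cardiopulmonary disease, heart rate ≥ 110 bpm, systolic blood pressure < 100 mmHg, oxygen saturation < 90%); the `Patient` class, the function names, and the feature order are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    age: float                 # years
    heart_rate: float          # beats per minute
    systolic_bp: float         # mmHg
    spo2: float                # arterial oxygen saturation, percent
    cancer: bool               # history of cancer
    cardiopulmonary: bool      # chronic heart failure or chronic lung disease


def spesi_items(p):
    """Six binary sPESI indicators, using the standard published cut-offs."""
    return [
        int(p.age > 80),
        int(p.cancer),
        int(p.cardiopulmonary),
        int(p.heart_rate >= 110),
        int(p.systolic_bp < 100),
        int(p.spo2 < 90),
    ]


def spesi_score(p):
    """Conventional sPESI: one point per indicator, range 0-6."""
    return sum(spesi_items(p))


def hybrid_features(p):
    """Ten-item vector: six binary indicators plus four continuous values."""
    return spesi_items(p) + [p.age, p.heart_rate, p.systolic_bp, p.spo2]


# Hypothetical patient, for illustration only.
example = Patient(age=72, heart_rate=118, systolic_bp=104, spo2=88,
                  cancer=True, cardiopulmonary=False)
```

Keeping the continuous values lets a model distinguish, for instance, a systolic pressure of 104 mmHg from one of 140 mmHg, which the binary item alone scores identically.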

At the primary 12-month horizon, all three ML models outperformed conventional sPESI in discrimination.

  • LR, XGBoost, and MLP achieved AUCs of 0.720, 0.726, and 0.712, respectively, at 12 months.
  • Conventional sPESI AUC at 12 months was 0.665.
  • XGBoost achieved the highest AUC among ML models at the 12-month horizon (0.726).

XGBoost achieved peak discrimination at 30 days with an AUC of 0.796.

  • The 30-day AUC of 0.796 for XGBoost compared favorably to 0.727 for conventional sPESI at the same horizon.
  • Peak discrimination across all models and all horizons was achieved by XGBoost at the 30-day timepoint.

ML models substantially improved 30-day specificity compared to conventional sPESI while maintaining high sensitivity.

  • ML models increased 30-day specificity to 0.326–0.395 compared to 0.131 for conventional sPESI.
  • High sensitivity was maintained across ML models at 0.930–0.982 at the 30-day horizon.
  • This corresponds to a roughly 2.5- to 3-fold increase in specificity over conventional sPESI (0.326/0.131 ≈ 2.5; 0.395/0.131 ≈ 3.0).
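
The fold change can be checked from the reported figures. The short sketch below defines sensitivity and specificity from confusion-matrix counts and then divides the reported ML specificity range by the sPESI value; the function and constant names are illustrative, and only the three specificity values come from the study.

```python
def sensitivity(tp, fn):
    """Share of patients who died that were flagged as high risk."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of survivors that were correctly classified as low risk."""
    return tn / (tn + fp)


# 30-day specificities as reported in the summary above.
SPESI_SPEC_30D = 0.131
ML_SPEC_30D_RANGE = (0.326, 0.395)

fold_low = ML_SPEC_30D_RANGE[0] / SPESI_SPEC_30D
fold_high = ML_SPEC_30D_RANGE[1] / SPESI_SPEC_30D

# Survivors correctly labelled low risk per 1000 survivors.
low_risk_per_1000_spesi = round(SPESI_SPEC_30D * 1000)
low_risk_per_1000_ml = tuple(round(s * 1000) for s in ML_SPEC_30D_RANGE)
```

Read per 1000 survivors, this is about 131 correctly labelled low risk by sPESI versus about 326 to 395 by the ML models.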

SHAP analyses revealed horizon-linked attribution shifts in the relative importance of predictors across mortality timepoints.

  • Acute hemodynamic markers dominated early (30-day) mortality predictions.
  • Age and malignancy emerged as the primary drivers at 12 months under horizon-specific evaluation.
  • SHapley Additive exPlanations (SHAP) analyses were used to interpret feature contributions across models.
  • This finding suggests that the prognostic weight of each physiological and clinical variable depends on the time horizon.
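
To make the attribution idea concrete, the sketch below computes exact Shapley values by brute force for a single prediction. It is not the SHAP library and not the study's pipeline: the linear risk function, its weights, the patient values, and the baseline are all invented for illustration, and features outside a coalition are simply set to a baseline value.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one prediction.

    Features outside the coalition are set to their baseline value.
    Enumerates all orderings, so only feasible for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = predict(current)
        for i in order:
            current[i] = x[i]          # add feature i to the coalition
            new = predict(current)
            phi[i] += new - prev       # marginal contribution of feature i
            prev = new
    return [v / len(orderings) for v in phi]


# Hypothetical linear risk score; weights are invented for illustration.
WEIGHTS = [0.03, 0.02, -0.01]          # age, heart rate, systolic BP


def toy_risk(features):
    return sum(w * f for w, f in zip(WEIGHTS, features))


patient = [78.0, 115.0, 95.0]          # hypothetical patient
cohort_mean = [65.0, 90.0, 125.0]      # hypothetical baseline
phi = shapley_values(toy_risk, patient, cohort_mean)
```

The attributions sum to the difference between the patient's prediction and the baseline prediction. Repeating such an analysis against outcomes at different horizons is one way an attribution shift like the one reported could be observed.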

The study used a retrospective cohort design with a held-out test set and 10-fold cross-validation for model development and evaluation.

  • The cohort comprised 2547 adults with CT pulmonary angiography-confirmed acute pulmonary embolism.
  • Models were evaluated on a held-out test set at the 30-, 180-, and 365-day mortality horizons.
  • Model optimization used the 10-item hybrid sPESI features, in which continuous physiological variables are kept alongside the dichotomized sPESI items.
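
The evaluation design above can be sketched with index bookkeeping alone. The example assumes a 20% held-out fraction, a fixed seed, no stratification, and complete follow-up to 365 days; none of these details are stated in this summary, and all helper names are hypothetical.

```python
import random


def train_test_split_indices(n, test_fraction, seed):
    """Shuffle patient indices once and reserve a held-out test set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def k_fold_indices(train_idx, k):
    """Yield (fit, validate) index lists; every training index validates once."""
    folds = [train_idx[i::k] for i in range(k)]
    for i in range(k):
        validate = folds[i]
        fit = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield fit, validate


def died_within(days_to_death, horizon_days):
    """Binary label for one horizon; None means alive at last follow-up."""
    return int(days_to_death is not None and days_to_death <= horizon_days)


N_PATIENTS = 2547          # cohort size reported in the study
TEST_FRACTION = 0.2        # assumption: the split ratio is not given here
HORIZONS = (30, 180, 365)  # mortality horizons in days

train_idx, test_idx = train_test_split_indices(N_PATIENTS, TEST_FRACTION, seed=0)
cv_folds = list(k_fold_indices(train_idx, k=10))

# A patient who died on day 100 is a survivor at 30 days but a death at 180 and 365.
labels_day_100 = [died_within(100, h) for h in HORIZONS]
```

Under these assumptions the test indices never enter cross-validation, so the horizon-specific metrics on the test set are not influenced by model tuning.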

Citation

Verdi E. (2026). Machine learning-extended sPESI for 1-year mortality prediction in pulmonary embolism. Tuberkuloz ve toraks. https://doi.org/10.5578/tt.2026011217