Enhancing Hypertension Risk Diagnosis Using a Hybrid Machine Learning Framework: Leveraging Body Composition Data

TL;DR

A dual-scenario hybrid machine learning framework that combines unsupervised clustering with supervised classification on noninvasive body composition features predicted hypertension risk with high accuracy: the ExtraTrees classifier trained on cluster-augmented data reached an accuracy of 98.23% and an AUC of 99.87%.

Key Findings

Five physiological subgroups were identified among hypertensive individuals via K-Means clustering with validated cluster quality metrics.

  • Clustering was performed exclusively on hypertensive individuals using an unsupervised approach inspired by self-labeling principles.
  • Cluster quality was validated using Silhouette index (0.3371), Davies-Bouldin index (1.0094), and Calinski-Harabasz index (720.10).
  • Significant intercluster variability was observed across key indicators including FATP, RLFATP, LLFATP, FATM, and age (p < 0.001).
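
As a rough illustration of what the first of these validation metrics measures, the sketch below computes a mean silhouette index in plain Python. It is not the authors' code: the function name `silhouette_index`, the Euclidean distance, and the toy points are assumptions chosen for the example. The index ranges from -1 to 1, with values closer to 1 indicating tighter, better-separated clusters.

```python
import math


def silhouette_index(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance.

    Hypothetical helper for illustration; not taken from the study.
    """
    # Group point indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        # Average distance from point i to the members of a cluster,
        # excluding point i itself when it belongs to that cluster.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = mean_dist(i, own)  # cohesion: mean distance within own cluster
        b = min(                # separation: nearest other cluster
            mean_dist(i, members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)


# Toy data: two tight groups far apart (made-up values, not study data).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_index(points, [0, 0, 1, 1])  # labels match the groups
bad = silhouette_index(points, [0, 1, 0, 1])   # labels cut across the groups
print(f"matching labels: {good:.3f}, mismatched labels: {bad:.3f}")
```

Labels that follow the natural grouping score close to 1, while labels that cut across it score below 0.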

In Scenario 1, the SVM model with random oversampling achieved the best performance for hypertensive subgroup discrimination.

  • SVM with random oversampling achieved accuracy = 99.08%, F1 = 98.04%, and AUC = 99.98%.
  • Five models were tested for subgroup classification within the hypertensive population.
  • This scenario prioritized interpretability through subgroup discovery rather than binary healthy vs. hypertensive classification.
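
Random oversampling, as used with the SVM above, can be sketched in plain Python as follows. This is a minimal sketch of the general technique, not the study's implementation: the function name `random_oversample`, the seed, and the toy rows are assumptions. The idea is to duplicate randomly chosen samples from smaller classes until every class matches the size of the largest one. In general practice this is applied to the training split only, so that duplicated rows do not leak into evaluation data.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Balance classes by duplicating random minority-class rows.

    Hypothetical helper for illustration; not taken from the study.
    """
    rng = random.Random(seed)
    target = max(Counter(labels).values())  # size of the largest class

    # Group the original rows by class label.
    by_class = {}
    for row, lab in zip(rows, labels):
        by_class.setdefault(lab, []).append(row)

    # Keep every original row, then append random duplicates as needed.
    out_rows, out_labels = list(rows), list(labels)
    for lab, members in by_class.items():
        for _ in range(target - len(members)):
            out_rows.append(rng.choice(members))
            out_labels.append(lab)
    return out_rows, out_labels


# Toy data: class "a" has four rows, class "b" has two (made-up values).
rows = [[1], [2], [3], [4], [5], [6]]
labels = ["a", "a", "a", "a", "b", "b"]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```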

In Scenario 2, the ExtraTrees classifier on a cluster-augmented dataset achieved the best binary classification performance between healthy and hypertensive subjects.

  • ExtraTrees achieved accuracy = 98.23%, recall = 98.30%, precision = 98.17%, F1 = 98.23%, and AUC = 99.87%.
  • Five models were evaluated: ExtraTrees, KNN, SVM, Gaussian Naive Bayes, and Decision Tree, across multiple configurations.
  • The cluster-augmented dataset outperformed non-augmented configurations, confirming the benefit of integrating clustering information.
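
One plausible reading of "cluster-augmented" is that each subject's feature vector gains an extra column holding a cluster assignment. The summary does not say how the authors built this dataset, so the sketch below is only an assumption: it takes centroids from a previously fitted clustering (here hard-coded, hypothetical values) and appends the index of the nearest centroid to each row. The function name `augment_with_cluster` is likewise made up for this example.

```python
import math


def augment_with_cluster(rows, centroids):
    """Append the index of the nearest centroid to each feature row.

    Hypothetical helper; the study's actual augmentation may differ.
    """
    augmented = []
    for row in rows:
        nearest = min(
            range(len(centroids)),
            key=lambda k: math.dist(row, centroids[k]),
        )
        augmented.append(list(row) + [nearest])
    return augmented


# Toy data: two features per subject and two made-up centroids.
centroids = [(0.0, 0.0), (10.0, 10.0)]
subjects = [(1.0, 1.0), (9.0, 9.0)]
augmented = augment_with_cluster(subjects, centroids)
print(augmented)
```

The augmented rows could then be passed to any of the five classifiers in place of the original features.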

Clustering and feature selection both improved model generalization, particularly for ensemble-based learners.

  • The cluster-augmented dataset yielded the best overall results in Scenario 2.
  • Ensemble-based learners such as ExtraTrees showed the greatest benefit from clustering augmentation and feature selection.
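  • Scenario 2 was reported to show higher predictive accuracy and stability than Scenario 1. The two scenarios address different tasks (subgroup discrimination versus binary classification), so their headline metrics are not directly comparable.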

The study used noninvasive body composition features as inputs for hypertension risk prediction.

  • Key features included FATP (fat percentage), RLFATP (right leg fat percentage), LLFATP (left leg fat percentage), FATM (fat mass), and age.
  • The noninvasive nature of body composition measurements was highlighted as contributing to clinical applicability.
  • The framework was designed to enhance both interpretability and predictive reliability using these features.

Integrating unsupervised clustering with supervised classification was found to offer a robust and explainable framework for personalized hypertension risk prediction.

  • Scenario 1 provided interpretability through subgroup discovery among hypertensive individuals.
  • Scenario 2 provided higher predictive accuracy and stability through binary classification.
  • The combined dual-scenario approach was presented as contributing to early detection and precision healthcare.

Citation

Mirzaye A, Saadatfar H, Nematollahi M. (2026). Enhancing Hypertension Risk Diagnosis Using a Hybrid Machine Learning Framework: Leveraging Body Composition Data. BioMed Research International. https://doi.org/10.1155/bmri/6335947