Cardiovascular

Machine-Learning-Based Prediction of Hypertension and Its Risk Factors Among Adults in the Northern Region of Bangladesh.

TL;DR

A machine-learning-based study in rural Bangladesh found a hypertension prevalence of 36.5% among adults aged ≥30 years. Of the five models compared, random forest achieved the highest predictive performance (AUC 0.80). Age, body weight, sweets consumption, vigorous activity, education, family size, height, and family income emerged as the most influential determinants.

Key Findings

The prevalence of hypertension among study participants in Dinajpur District, Bangladesh was 36.5%.

  • Community-based cross-sectional study conducted among 1026 adults aged ≥30 years.
  • Data collected between December 2024 and February 2025.
  • Participants were from the northern region of Bangladesh (Dinajpur District).
  • Data on sociodemographic, behavioral, and clinical characteristics were collected through a structured questionnaire.

The random forest model achieved the highest predictive performance among the five machine learning algorithms evaluated.

  • Random forest achieved 72% accuracy, 71% precision, 72% recall, 71% F1 score, and an AUC of 0.80.
  • Five ML algorithms were compared: logistic regression, decision tree, random forest, extreme gradient boosting, and light gradient boosting machine.
  • Models were evaluated based on accuracy, precision, recall, F1 score, and area under the curve (AUC).
  • Feature selection was performed using recursive feature elimination (RFE), Boruta-based feature selection (BFS), and random forest (RF) methods.
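
As a rough illustration of how the threshold-based metrics listed above relate to one another, the sketch below computes accuracy, precision, recall, and F1 score from binary labels. The function name `classification_metrics` and the toy labels are assumptions made for this example and are not study data; the summary does not state whether the reported figures are per-class or averaged, so this shows only the basic positive-class definitions.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for binary labels (1 = hypertensive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels invented for illustration only (not taken from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

Precision penalizes false alarms and recall penalizes missed cases, which is why the two are usually reported together with their harmonic mean, the F1 score.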

Thirteen significant predictors of hypertension were identified, with age, body weight, sweets consumption, vigorous activity, education, family size, height, and family income as the most influential determinants.

  • SHAP (SHapley Additive exPlanations) analysis was employed to interpret model outputs and rank predictor importance.
  • Predictors included both modifiable risk factors (sweets consumption, vigorous activity) and non-modifiable or sociodemographic factors (age, education, family size, family income).
  • Anthropometric measures including body weight and height were among the top predictors.
  • A total of 13 significant predictors were identified across the feature selection methods used.
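
To give a sense of what a SHAP-style attribution measures, the sketch below computes exact Shapley values for a tiny, hypothetical three-feature risk score. It is not the authors' pipeline: the model `toy_risk_score`, its coefficients, the feature values, and the single reference profile are all invented for illustration, and SHAP implementations typically approximate these values against a background dataset rather than enumerating every feature subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline) to each feature."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the individual's value; the rest stay at baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            coalition_weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += coalition_weight * (value(s | {i}) - value(s))
    return phi


def toy_risk_score(features):
    # Hypothetical linear score; coefficients are made up for this example.
    age, weight_kg, sweets_per_week = features
    return 0.02 * age + 0.01 * weight_kg + 0.05 * sweets_per_week


person = [60, 70, 4]      # hypothetical individual
reference = [45, 60, 2]   # hypothetical reference profile
phi = shapley_values(toy_risk_score, person, reference)
```

The attributions sum to the difference between the individual's score and the reference score, which is the property that lets per-feature contributions be ranked against one another.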

The burden of hypertension is rising in low- and middle-income countries such as Bangladesh, yet predictive modeling with advanced analytical approaches remains limited in rural populations.

  • Hypertension is described as a major contributor to cardiovascular morbidity and mortality globally.
  • The study notes that despite rising prevalence, ML-based predictive modeling of HTN in rural Bangladeshi populations has been limited.
  • The study was conducted in Dinajpur District, a rural area in northern Bangladesh.
  • The study aimed to address this gap by applying ML techniques to a community-based sample.

Machine learning models were found to have potential for predicting hypertension and identifying modifiable risk factors to support early detection and targeted interventions in rural Bangladesh.

  • The findings are described as providing 'actionable insights to support early detection, targeted interventions, and effective resource allocation for HTN prevention and control in rural Bangladesh.'
  • SHAP analysis provided interpretability of model outputs, identifying both modifiable and non-modifiable risk factors.
  • The RF model's AUC of 0.80 indicates good discriminatory performance for hypertension prediction.
  • The authors highlight that identified modifiable risk factors such as sweets consumption and vigorous activity could be targets for intervention.
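
One way to read an AUC such as the 0.80 reported above is as the probability that a randomly chosen hypertensive participant receives a higher predicted score than a randomly chosen non-hypertensive one. The sketch below computes AUC directly from that pairwise definition; the function name `roc_auc` and the toy scores are assumptions for illustration, not study data.

```python
def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy predicted scores invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(y_true, scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so AUC summarizes discrimination without committing to a single classification threshold.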


Citation

Resma M, Karim M, Sifat I, Kayum M, Kibria M. (2026). Machine-Learning-Based Prediction of Hypertension and Its Risk Factors Among Adults in the Northern Region of Bangladesh. Journal of Diabetes Research. https://doi.org/10.1155/jdr/1799434