A machine learning model integrating 13 nutrients with baseline characteristics demonstrated good predictive ability for Cardiometabolic Multimorbidity (CMM), with the SVM model achieving an external validation AUC of 0.874 and SHAP analysis identifying age, magnesium, BMI, total fat, vitamin B1, and dietary fiber as important contributors.
Key Findings
Results
The Support Vector Machine (SVM) model demonstrated the best predictive performance for CMM across populations, achieving an external validation AUC of 0.874.
Six machine learning models were trained and then tested for generalizability using the NHANES and CHNS datasets
The authors describe the SVM's external validation AUC of 0.874 as indicating 'good predictive ability across populations'
Models were trained on NHANES data and externally validated on the CHNS dataset, testing cross-population generalizability
The analysis included 13 nutrients and seven demographic and health variables as model inputs
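To make the AUC figure above more concrete, here is a minimal sketch of how an AUC can be computed on an external validation set. It uses the pairwise definition: the probability that a randomly chosen person with the condition receives a higher model score than a randomly chosen person without it. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical external-validation labels (1 = CMM) and model scores.
external_labels = [1, 1, 1, 0, 0, 0, 0, 1]
external_scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.1, 0.7]

auc = roc_auc(external_labels, external_scores)
print(f"external AUC = {auc:.4f}")
```

Under this reading, an AUC of 0.874 would mean the model ranks a person with CMM above a person without it in roughly 87% of such pairs. The pairwise loop is quadratic in the sample size, so this is only suitable for illustration.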
Results
SHAP analysis identified age, magnesium, BMI, total fat, vitamin B1, and dietary fiber as the most important contributors to CMM model predictions.
SHAP (SHapley Additive exPlanations) values were used to interpret feature contributions and understand variable relationships with prediction outcomes
Six variables were highlighted as important contributors: age, magnesium, BMI, total fat, vitamin B1, and dietary fiber
SHAP analysis was applied to understand the directionality and magnitude of each variable's contribution to predictions
The approach allowed interpretation of a complex machine learning model in terms of individual feature importance
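The idea behind SHAP can be illustrated with an exact Shapley value calculation on a very small model. This is a sketch under stated assumptions, not the study's pipeline: the scoring function `toy_risk`, its coefficients, and the instance and baseline values are all hypothetical, and features missing from a coalition are simply set to a single baseline value. Practical SHAP tooling typically uses background datasets and approximations instead of full enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating every feature coalition.

    Features inside a coalition take the instance's value; the rest
    take the baseline value. Cost grows exponentially with feature count.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {k: instance[k] if k in coalition else baseline[k] for k in names}
        return model(x)

    phi = {}
    for feature in names:
        others = [k for k in names if k != feature]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = set(subset)
                total += weight * (value(without | {feature}) - value(without))
        phi[feature] = total
    return phi


def toy_risk(x):
    # Hypothetical linear risk score; coefficients are made up for illustration.
    return 0.03 * x["age"] + 0.05 * x["bmi"] - 0.004 * x["magnesium"]


instance = {"age": 65, "bmi": 31, "magnesium": 250}  # hypothetical person
baseline = {"age": 50, "bmi": 26, "magnesium": 300}  # hypothetical reference

phi = shapley_values(toy_risk, instance, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The sign of each value gives the direction of a feature's contribution and its size gives the magnitude. In this toy case the person's magnesium intake is below the baseline, so magnesium contributes positively to the score, which is how an inverse association would appear for an individual prediction.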
Results
Magnesium, vitamin B1, and dietary fiber each showed inverse associations with CMM risk within the modeling framework.
All three nutrients — magnesium, vitamin B1, and dietary fiber — were negatively associated with CMM risk
These associations were identified 'within the modeling framework,' integrating both logistic regression and machine learning approaches
Data were drawn from two population-based databases: NHANES (U.S.) and CHNS (China), suggesting cross-population relevance
The authors caution that 'these associations should be interpreted with caution' and that further longitudinal and interventional studies are needed
Results
Total fat exhibited a complex and context-dependent relationship with CMM, showing an inverse association in logistic regression but playing a significant role in machine learning model prediction.
In logistic regression analysis, total fat exhibited an inverse association with CMM risk
Despite the inverse logistic regression association, total fat 'played a significant role in model prediction'
The authors conclude that 'its relationship with CMM may be complex and context-dependent'
The contrast between the logistic regression and machine learning findings suggests that non-linear modeling may capture relationships that a single regression coefficient does not
Methods
The study synthesized data from two population-based databases — NHANES and CHNS — incorporating 13 nutrients and seven demographic and health variables to assess CMM risk.
NHANES is a U.S.-based population database; CHNS is a China-based population database, providing cross-national data
A total of 13 nutrients were analyzed alongside seven demographic and health variables
Binary and gradient logistic regressions were used alongside six machine learning models to assess associations
The dual-database design allowed for both model training and external validation across distinct populations
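As a rough illustration of what an inverse association in a binary logistic regression looks like, the sketch below computes an unadjusted odds ratio and a Wald 95% confidence interval from a 2x2 table. For a single binary predictor, the exponentiated logistic regression coefficient equals this odds ratio. The counts and the function name `odds_ratio` are hypothetical, and the sketch does not attempt to reproduce the study's regressions, which involve more variables than one.

```python
from math import exp, log, sqrt


def odds_ratio(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with condition      b: exposed without condition
    c: unexposed with condition    d: unexposed without condition
    """
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(estimate) - z * se)
    upper = exp(log(estimate) + z * se)
    return estimate, lower, upper


# Hypothetical counts: "exposed" = higher intake of some nutrient.
or_est, ci_low, ci_high = odds_ratio(a=40, b=460, c=80, d=420)
print(f"OR = {or_est:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

An odds ratio below 1 whose interval excludes 1 is what would usually be reported as an inverse association. It describes a statistical pattern in observational data and says nothing on its own about causation.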
Conclusions
Machine learning models integrating nutritional and baseline characteristics may provide a useful approach for predicting Cardiometabolic Multimorbidity risk.
CMM refers to the co-occurrence of multiple cardiometabolic conditions
The SVM model outperformed the other five machine learning models evaluated
The authors state that 'a machine learning model integrating nutritional and baseline characteristics may provide a useful approach for predicting CMM risk'
The study calls for 'further longitudinal and interventional studies' to clarify potential causal relationships between dietary factors and CMM
What This Means
This research suggests that a type of machine learning model called a Support Vector Machine (SVM), when given information about a person's diet and basic health characteristics, can reasonably predict a person's risk of Cardiometabolic Multimorbidity (CMM), a condition in which a person has multiple heart and metabolic diseases at the same time, such as heart disease combined with diabetes or hypertension. The models were built and tested using data from the United States (NHANES) and China (CHNS), and the SVM reached an external validation AUC of 0.874. AUC measures how well a model separates people with and without the condition, where 0.5 is no better than chance and 1.0 is perfect; it is not the same as accuracy, and the authors describe this value as good. Among the dietary and health factors analyzed, age and body mass index (BMI) were among the most important predictors, and several nutritional factors, particularly magnesium, vitamin B1 (thiamine), and dietary fiber, were also identified as important, with higher intake of these nutrients being associated with lower CMM risk.
One particularly interesting finding is that total fat intake showed a complicated relationship with CMM risk: a simple statistical analysis suggested it was inversely related to risk (i.e., higher fat intake linked to lower risk), but the machine learning model flagged it as an important and complex predictor, suggesting the relationship is not straightforward and may depend on other factors. This highlights how machine learning can detect patterns that traditional statistical methods might miss or oversimplify.
This research suggests that tracking dietary patterns, especially intake of magnesium, vitamin B1, and fiber, alongside standard health metrics could improve early identification of people at higher risk of multiple cardiometabolic conditions. However, the authors themselves caution that this study cannot prove these dietary factors cause or prevent CMM, as both datasets are observational. They call for future studies that follow people over time or test dietary interventions to better understand whether changing these nutrients could actually reduce disease risk.
Liu R, Tang L, Zhang F, Li Y, Tang Q, Zhang X, et al. (2026). Harnessing machine learning to decode dietary impacts on cardiometabolic multimorbidity. International Journal of Medical Informatics. https://doi.org/10.1016/j.ijmedinf.2026.106502