Personalized Glucose Management With AI: Pilot Study Using a Multiarmed Bandit Approach.

TL;DR

A multiarmed bandit approach with a two-stage reward prediction model was used to personalize dietary and exercise recommendations. It significantly improved postprandial glucose levels over a randomized policy in simulation, and a simplified version achieved a 23% average improvement in actual glucose responses in a small real-world experiment with 6 participants.

Key Findings

The proposed multiarmed bandit algorithm significantly improved postprandial glucose levels compared to a randomized policy in simulation experiments.

  • The method uses a two-stage reward prediction model where actions are combinations of total carbohydrate intake and postprandial walking duration
  • The reward is defined as the reduction in postprandial glucose levels
  • The online algorithm was compared against a randomized policy in simulation, and the result supports online planning as a way to personalize recommendations
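
The setup above can be illustrated with a minimal sketch. The summary does not say which bandit algorithm the authors used, so epsilon-greedy is swapped in here purely for illustration. The action grid, the `simulated_reward` function, and all coefficients are hypothetical placeholders, not values from the study.

```python
import random
from itertools import product

# Hypothetical action grid: (carbohydrate grams, walking minutes).
CARBS_G = [30, 50, 70]
WALK_MIN = [0, 15, 30]
ACTIONS = list(product(CARBS_G, WALK_MIN))


def simulated_reward(action, rng):
    """Toy simulator: reward is the reduction in postprandial glucose.

    Assumed shape only: fewer carbs and more walking give a larger
    reduction, plus Gaussian noise. Coefficients are made up.
    """
    carbs, walk = action
    return -0.03 * carbs + 0.04 * walk + rng.gauss(0.0, 0.3)


def run_policy(n_rounds, epsilon, seed):
    """Epsilon-greedy bandit; epsilon=1.0 is a fully randomized policy.

    Returns the average reward collected over all rounds.
    """
    rng = random.Random(seed)
    counts = {a: 0 for a in ACTIONS}
    means = {a: 0.0 for a in ACTIONS}
    total = 0.0
    for _ in range(n_rounds):
        untried = [a for a in ACTIONS if counts[a] == 0]
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)          # explore at random
        elif untried:
            action = untried[0]                   # try each action once
        else:
            action = max(ACTIONS, key=means.get)  # exploit best estimate
        reward = simulated_reward(action, rng)
        counts[action] += 1
        # Incremental update of the running mean reward for this action.
        means[action] += (reward - means[action]) / counts[action]
        total += reward
    return total / n_rounds


if __name__ == "__main__":
    print("bandit    :", run_policy(2000, epsilon=0.1, seed=0))
    print("randomized:", run_policy(2000, epsilon=1.0, seed=0))
```

In this toy setting the learning policy concentrates on low-carbohydrate, long-walk actions, so its average reward ends up above the randomized baseline. That mirrors the kind of comparison described above, not the study's actual numbers.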

In a small real-world experiment with 6 participants, the personalized recommendation policy achieved a 23% average improvement in actual glucose responses compared to a randomized policy.

  • The real-world experiment involved 6 participants
  • A simplified version of the proposed method was used, in which the recommendation policy was updated only once to become a personalized policy
  • A 23% improvement on average in actual glucose responses was observed
  • Improvement was accompanied by behavioral adherence to recommendations concerning carbohydrate intake and postprandial walking

The proposed method uses a two-stage prediction approach that first predicts behavioral responses to an action and subsequently predicts the postprandial glycemic response.

  • The action space is defined as a combination of total carbohydrate intake and postprandial walking duration
  • Reward prediction is implemented in two stages: predicted behavioral responses to an action, followed by postprandial glycemic response
  • This design directly optimizes clinical outcomes (postprandial glucose levels) rather than focusing solely on behavioral changes
  • The approach addresses a gap in prior reinforcement learning studies that focused on behavioral changes while overlooking clinical outcomes
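
The two-stage idea can be sketched as a composition of two predictors. This is a minimal illustration under stated assumptions: the `adherence` parameter, the `HABIT` baseline, and the linear coefficients are all hypothetical and do not come from the study, which does not describe its models in this summary.

```python
from dataclasses import dataclass


@dataclass
class Behavior:
    carbs_g: float   # carbohydrate actually eaten (grams)
    walk_min: float  # minutes actually walked after the meal


# Assumed habitual behavior when no recommendation is followed.
HABIT = Behavior(carbs_g=70.0, walk_min=5.0)


def predict_behavior(rec_carbs_g, rec_walk_min, adherence):
    """Stage 1: predicted behavioral response to a recommended action.

    Hypothetical model: the person moves a fraction `adherence` (0 to 1)
    of the way from habitual behavior toward the recommendation.
    """
    return Behavior(
        carbs_g=HABIT.carbs_g + adherence * (rec_carbs_g - HABIT.carbs_g),
        walk_min=HABIT.walk_min + adherence * (rec_walk_min - HABIT.walk_min),
    )


def predict_glucose_reduction(behavior):
    """Stage 2: predicted reduction in postprandial glucose from behavior.

    Hypothetical linear response; coefficients are placeholders.
    """
    return -0.03 * behavior.carbs_g + 0.04 * behavior.walk_min


def predicted_reward(rec_carbs_g, rec_walk_min, adherence):
    """Compose both stages: action -> behavior -> glycemic response."""
    behavior = predict_behavior(rec_carbs_g, rec_walk_min, adherence)
    return predict_glucose_reduction(behavior)


if __name__ == "__main__":
    for adherence in (0.0, 0.5, 1.0):
        print(adherence, predicted_reward(30, 30, adherence))
```

One consequence of this design is that the predicted reward of a recommendation depends on how likely the person is to follow it. In the sketch, a recommendation that is ignored (adherence 0) yields the same predicted reward as any other ignored recommendation, so only actions that change behavior can improve the clinical outcome.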

Prior approaches to personalized behavioral recommendations through mobile apps have primarily focused on optimizing behavioral changes using reinforcement learning, overlooking clinical outcomes.

  • Personalized behavioral recommendations through mobile apps have proven effective in preventing serious chronic diseases such as diabetes
  • Recent studies have primarily focused on optimizing personalized recommendations using reinforcement learning
  • The main problem identified with these approaches is that they focus on behavioral changes and overlook clinical outcomes
  • The current study was designed to address this gap by directly optimizing postprandial glucose levels

Further longitudinal real-world experiments in patients with diabetes are needed to validate and generalize the findings.

  • The real-world experiment was small (n=6) and used a simplified version of the proposed method
  • In the real-world setting, the recommendation policy was updated only once to become a personalized policy
  • The authors describe the simulation and small real-world results as preliminary evidence of effectiveness
  • Generalizability to patients with diabetes requires further study

Citation

Hotta S, Kytö M, Koivusalo S, Heinonen S, Marttinen P. (2026). Personalized Glucose Management With AI: Pilot Study Using a Multiarmed Bandit Approach. JMIR Formative Research. https://doi.org/10.2196/70826