The GutMIND framework integrates shotgun metagenomic data from 31 studies across 12 countries into the largest gut-brain microbiome repository to date, enabling cross-cohort machine learning diagnosis of neuropsychiatric disorders and identification of 9 core neuropsychiatric-protective microbiota linked to glutamate synthesis and acetate production.
Key Findings
Methods
The GutMIND database represents the largest gut-brain microbiome repository to date, integrating 31 studies across 12 countries spanning 14 neuropsychiatric conditions.
Total sample size: n = 3,492 participants
Data sourced from 31 studies across 12 countries
Covers 14 neuropsychiatric conditions
Utilized shotgun metagenomic data with harmonized metadata
Adhered to a standardized preprocessing protocol and rigorous quality control workflow
Results
Microbial community heterogeneity was significantly elevated in neuropsychiatric patients compared to healthy controls.
Heterogeneity was characterized across the full GutMIND dataset
This finding applied across multiple neuropsychiatric conditions
The analysis used taxonomic abundance profiles from shotgun metagenomic data
The elevated heterogeneity was detected after standardized preprocessing and quality control
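One common way to quantify community heterogeneity is the mean pairwise Bray-Curtis dissimilarity within each group. The summary does not state which metric the authors used, so the following is only a minimal sketch under that assumption. The function names and the toy abundance vectors are hypothetical and are not GutMIND data.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator if denominator else 0.0


def mean_pairwise_dissimilarity(profiles):
    """Average Bray-Curtis dissimilarity over all sample pairs in one group."""
    pairs = list(combinations(profiles, 2))
    if not pairs:
        raise ValueError("need at least two samples")
    return sum(bray_curtis(a, b) for a, b in pairs) / len(pairs)


# Toy relative-abundance profiles (hypothetical, three taxa per sample).
controls = [[0.5, 0.3, 0.2], [0.5, 0.3, 0.2], [0.4, 0.4, 0.2]]
patients = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]

het_controls = mean_pairwise_dissimilarity(controls)
het_patients = mean_pairwise_dissimilarity(patients)
print(f"controls: {het_controls:.3f}, patients: {het_patients:.3f}")
```

A higher value means that samples within the group differ more from one another. A real analysis would also need a significance test, for example a permutation test on group dispersion, which this sketch omits.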
Results
The MetaClassifier framework achieved a mean AUROC of 0.69 across 8 neuropsychiatric disorders in the discovery cohort using nested cross-validation.
This represented the second stage of the two-stage validation strategy
Results confirmed cross-platform generalizability of the MetaClassifier
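To make the reported metric concrete, the sketch below computes AUROC as the probability that a randomly chosen case scores higher than a randomly chosen control, then averages it across disorders. It illustrates the metric only and does not reproduce MetaClassifier or its nested cross-validation. The disorder names, labels, and scores are hypothetical.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive scores higher; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical held-out predictions: (true labels, predicted scores) per disorder.
predictions = {
    "disorder_A": ([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]),
    "disorder_B": ([1, 0, 1, 0], [0.8, 0.3, 0.7, 0.2]),
}

per_disorder = {name: auroc(y, s) for name, (y, s) in predictions.items()}
mean_auroc = sum(per_disorder.values()) / len(per_disorder)
print(per_disorder, mean_auroc)
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a mean of 0.69 indicates moderate discrimination averaged over disorders.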
Results
The Microbial Gut-Brain Axis Health Index (MGBA-HI) effectively distinguished neuropsychiatric status in both the high-quality cohort and the platform-extended cohort.
MGBA-HI was developed as a composite index derived from microbial biomarkers
Validated in the high-quality discovery cohort (n=2,734) and the platform-extended cohort (n=400)
The index was designed to reflect gut-brain axis health
Demonstrated cross-cohort applicability for distinguishing neuropsychiatric from healthy status
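The summary does not give the MGBA-HI formula. As a rough illustration of how a composite index can be built from microbial biomarkers, the sketch below uses a log-ratio of the summed abundance of health-associated taxa to that of disorder-associated taxa. This form is an assumption chosen for illustration, not the published definition; the taxon names, the pseudocount `eps`, and the function name are hypothetical.

```python
import math


def log_ratio_index(profile, health_taxa, disorder_taxa, eps=1e-6):
    """Illustrative index (NOT the published MGBA-HI formula):
    log10 of summed health-associated abundance over summed
    disorder-associated abundance, with a pseudocount to avoid log(0)."""
    health_sum = sum(profile.get(t, 0.0) for t in health_taxa)
    disorder_sum = sum(profile.get(t, 0.0) for t in disorder_taxa)
    return math.log10((health_sum + eps) / (disorder_sum + eps))


# Hypothetical biomarker sets and relative-abundance profiles.
health_taxa = {"taxon_A", "taxon_B"}
disorder_taxa = {"taxon_C"}

healthy_sample = {"taxon_A": 0.30, "taxon_B": 0.20, "taxon_C": 0.05}
patient_sample = {"taxon_A": 0.02, "taxon_B": 0.03, "taxon_C": 0.50}

index_healthy = log_ratio_index(healthy_sample, health_taxa, disorder_taxa)
index_patient = log_ratio_index(patient_sample, health_taxa, disorder_taxa)
print(f"healthy: {index_healthy:.2f}, patient: {index_patient:.2f}")
```

Under this form, positive values indicate a profile dominated by health-associated taxa and negative values the opposite, so a single threshold on the index can separate the two groups.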
Results
Nine core neuropsychiatric-protective microbiota were identified through integrative analysis of health-abundant species, index-derived biomarkers, and ecological prevalence.
Identification relied on integrative analysis combining three evidence streams: health-abundant species, MGBA-HI-derived biomarkers, and ecological prevalence
These 9 species predominantly exhibited metabolic capacities linked to glutamate synthesis and acetate production
The species were characterized as 'neuropsychiatric-protective' based on their associations across multiple disorders
These findings hold translational potential for microbiome-based therapeutic strategies
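One simple way to combine three evidence streams is to keep only the species supported by all of them. Whether the authors used a strict intersection is not stated here, so the sketch below is an assumption. The species names, the 75% prevalence cutoff, and the helper `prevalent_taxa` are hypothetical.

```python
def prevalent_taxa(samples, min_prevalence):
    """Taxa detected (abundance > 0) in at least min_prevalence of samples."""
    counts = {}
    for sample in samples:
        for taxon, abundance in sample.items():
            if abundance > 0:
                counts[taxon] = counts.get(taxon, 0) + 1
    return {t for t, c in counts.items() if c / len(samples) >= min_prevalence}


# Hypothetical per-sample relative abundances.
samples = [
    {"species_1": 0.1, "species_2": 0.2, "species_3": 0.1},
    {"species_2": 0.3, "species_3": 0.2, "species_4": 0.1},
    {"species_2": 0.1, "species_3": 0.0, "species_4": 0.2},
    {"species_2": 0.2, "species_3": 0.1, "species_6": 0.1},
]

# Hypothetical evidence streams 1 and 2; stream 3 is computed from the samples.
health_abundant = {"species_1", "species_2", "species_3", "species_4"}
index_biomarkers = {"species_2", "species_3", "species_5"}
ecologically_prevalent = prevalent_taxa(samples, min_prevalence=0.75)

# Keep only species supported by all three streams.
core_species = health_abundant & index_biomarkers & ecologically_prevalent
print(sorted(core_species))
```

A strict intersection is conservative: a species drops out if any one stream fails to support it, which favors a short, high-confidence list.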
Methods
The GutMIND framework was designed to minimize technical heterogeneity and ensure robust cross-cohort comparability in gut microbiome-neuropsychiatry research.
Standardized preprocessing protocol applied across all 31 studies
Rigorous quality control workflow implemented to reduce batch effects and methodological fragmentation
Framework addresses limitations of prior single-cohort studies including restricted sample sizes and confounding heterogeneity
Source code and usage instructions for MetaClassifier are publicly accessible at https://github.com/juyanmei/MetaClassifier
Ju Y, Lin S, Hu S, Jin X, Xiao L, Zhang T, et al. (2026). GutMIND: A multi-cohort machine learning framework for integrative characteristics of the microbiota-gut-brain axis in neuropsychiatric disorders. Gut Microbes. https://doi.org/10.1080/19490976.2026.2630563