Mediation analysis with graph mediator

Table 1.

Performance of estimating target mediation components (⁠|$ \boldsymbol{\theta} $|⁠) and corresponding average indirect effect (⁠|$ \tau_{\text{AIE}} $|⁠) over |$ 200 $| replicates in the simulation study.^a

					\|$ \hat{\boldsymbol{\theta}} $\|	\|$ \hat{\tau}_{\mathrm{AIE}} $\|
Simulation	p	\|$ (n, T) $\|	Method		\|$ \|\langle\hat{\boldsymbol{\theta}},\boldsymbol{\pi}_{j}\rangle\| $\| (SE)	Bias	MSE
				D2	\|$ 0.879 $\| (⁠\|$ 0.113 $\|⁠)	\|$ -0.677 $\|	\|$ 0.460 $\|
			GMed	D4	\|$ 0.929 $\| (⁠\|$ 0.140 $\|⁠)	\|$ -0.653 $\|	\|$ 0.428 $\|
			GMed		D2	\|$ 0.886 $\| (⁠\|$ 0.138 $\|⁠)	\|$ -0.664 $\|	\|$ 0.443 $\|
		\|$ (500,100) $\|	CAP-Med	D4	\|$ 0.905 $\| (⁠\|$ 0.120 $\|⁠)	\|$ -0.635 $\|	\|$ 0.406 $\|
			CAP-Med			D2	\|$ 0.962 $\| (⁠\|$ 0.047 $\|⁠)	\|$ -0.292 $\|	\|$ 0.088 $\|
				GMed	D4	\|$ 0.977 $\| (⁠\|$ 0.096 $\|⁠)	\|$ -0.276 $\|	\|$ 0.080 $\|
				GMed		D2	\|$ 0.911 $\| (⁠\|$ 0.134 $\|⁠)	\|$ -0.243 $\|	\|$ 0.067 $\|
	\|$ 10 $\|	\|$ (500,500) $\|	CAP-Med	D4	\|$ 0.919 $\| (⁠\|$ 0.121 $\|⁠)	\|$ -0.129 $\|	\|$ 0.038 $\|
			CAP-Med				D2	\|$ 0.841 $\| (⁠\|$ 0.101 $\|⁠)	\|$ -0.665 $\|	\|$ 0.444 $\|
					GMed	D4	\|$ 0.910 $\| (⁠\|$ 0.142 $\|⁠)	\|$ -0.659 $\|	\|$ 0.436 $\|
					GMed		D2	\|$ 0.841 $\| (⁠\|$ 0.104 $\|⁠)	\|$ -0.660 $\|	\|$ 0.438 $\|
			\|$ (500,100) $\|	CAP-Med	D4	\|$ 0.898 $\| (⁠\|$ 0.115 $\|⁠)	\|$ -0.632 $\|	\|$ 0.402 $\|
				CAP-Med			D2	\|$ 0.915 $\| (⁠\|$ 0.095 $\|⁠)	\|$ -0.289 $\|	\|$ 0.087 $\|
					GMed	D4	\|$ 0.985 $\| (⁠\|$ 0.073 $\|⁠)	\|$ -0.277 $\|	\|$ 0.080 $\|
					GMed		D2	\|$ 0.905 $\| (⁠\|$ 0.109 $\|⁠)	\|$ -0.253 $\|	\|$ 0.071 $\|
(1)	\|$ 50 $\|	\|$ (500,500) $\|	CAP-Med	D4	\|$ 0.904 $\| (⁠\|$ 0.137 $\|⁠)	\|$ -0.129 $\|	\|$ 0.041 $\|
			CAP-Med					D2	\|$ 0.993 $\| (⁠\|$ 0.005 $\|⁠)	\|$ -0.669 $\|	\|$ 0.449 $\|
						GMed	D4	\|$ 0.999 $\| (⁠\|$ 0.001 $\|⁠)	\|$ -0.674 $\|	\|$ 0.456 $\|
						GMed		D2	\|$ 0.993 $\| (⁠\|$ 0.005 $\|⁠)	\|$ -1.319 $\|	\|$ 1.863 $\|
				\|$ (500,100) $\|	GMed-Mis	D4	\|$ 0.999 $\| (⁠\|$ 0.001 $\|⁠)	\|$ -0.114 $\|	\|$ 0.014 $\|
					GMed-Mis			D2	\|$ 0.999 $\| (⁠\|$ 0.001 $\|⁠)	\|$ -0.291 $\|	\|$ 0.087 $\|
						GMed	D4	\|$ 1.000 $\| (⁠\|$ 0.000 $\|⁠)	\|$ -0.279 $\|	\|$ 0.081 $\|
						GMed		D2	\|$ 0.999 $\| (⁠\|$ 0.001 $\|⁠)	\|$ -1.370 $\|	\|$ 2.076 $\|
	(2)	\|$ 10 $\|	\|$ (500,500) $\|	GMed-Mis	D4	\|$ 1.000 $\| (⁠\|$ 0.000 $\|⁠)	\|$ -0.002 $\|	\|$ 0.001 $\|

^a SE: standard error; MSE: mean squared error.

4. APPLICATION

We apply the proposed approach to data acquired from the National Consortium on Alcohol and NeuroDevelopment in Adolescence (NCANDA). The consortium aims to determine the effect of alcohol use on the developing adolescent brain. A core battery of measurements, including structural and functional brain scans and cognitive testing, has been developed. In adolescence, sex differences in cognitive behaviors have been identified in different functional domains, including spatial, verbal, and math abilities and social recognition (Esnaola et al. 2020). These differences have been shown to be partially mediated by brain functional connectivity captured by resting-state fMRI (Alarcón et al. 2018). In this study, we aim to identify the brain subnetwork within which functional connectivity mediates the sex difference in cognition.

To exclude the impact of alcohol consumption, we analyze |$ n\,=\,621 $| subjects (|$ 309 $| females) aged between |$ 12 $| and |$ 22 $| without excessive drinking. Among these subjects, the median response time for correct responses in the motor speed test is significantly lower in males than in females after adjusting for age (|$ \mathrm{ATE}=-0.324 $|, |$ P\text{-value} \lt 0.001 $|). The motor task measures sensorimotor ability by having the participant use the mouse to click on a shrinking box when it moves to a new position on the screen. We then apply the proposed approach for mediation analysis, where sex is the binary exposure (|$ X $|, with |$ X\,=\,1 $| for male), resting-state fMRI is the mediator (|$ \mathbf{M} $|), the z-score of the median response time is the outcome (|$ Y $|), and age is the confounding factor (|$ W $|).

After preprocessing, fMRI time courses are extracted from |$ p\,=\,75 $| brain regions (|$ 60 $| cortical and |$ 15 $| subcortical regions) spanning the whole brain. These regions are grouped into |$ 10 $| functional modules, which are used in an ad hoc procedure that sparsifies the loading profile using the fused lasso (Tibshirani et al. 2005). To remove the temporal dependence in the time courses, a subsample is taken with an effective sample size of |$ T_{i}=T\,=\,125 $|, and the subsampled data are denoted by |$ \mathbf{M}_{i}\in\mathbb{R}^{T_{i}\times p} $|, for |$ i\,=\,1 , \dots, n $|.
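
As a rough illustration of the data structure described here, the following Python sketch thins a simulated multivariate time course to 125 time points, computes the sample covariance matrix, and evaluates the log of the variance projected onto a loading vector. The regular-thinning rule, the simulated AR(1) series, the helper names, and the use of log(θ'Σθ) as the scalar summary of a component (the form used in covariate assisted principal regression, Zhao et al. 2021) are all assumptions made for illustration; the text does not state how the subsample was drawn.

```python
import math
import random

def thin(rows, n_keep):
    """Keep n_keep evenly spaced time points to weaken temporal dependence."""
    step = len(rows) // n_keep
    return rows[::step][:n_keep]

def sample_cov(rows):
    """Sample covariance (divisor T - 1) of a T x p list of rows."""
    T, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / T for j in range(p)]
    cov = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - means[j] for j in range(p)]
        for j in range(p):
            for k in range(p):
                cov[j][k] += d[j] * d[k] / (T - 1)
    return cov

def log_projected_variance(cov, theta):
    """log(theta' Sigma theta) for a loading vector theta."""
    p = len(theta)
    quad = sum(theta[j] * cov[j][k] * theta[k] for j in range(p) for k in range(p))
    return math.log(quad)

# Hypothetical subject: 500 raw time points, p = 3 regions, AR(1) noise.
random.seed(1)
p, T_raw, phi = 3, 500, 0.6
raw, prev = [], [0.0] * p
for _ in range(T_raw):
    prev = [phi * prev[j] + random.gauss(0.0, 1.0) for j in range(p)]
    raw.append(prev)

M_i = thin(raw, n_keep=125)       # subsampled data, 125 x 3
Sigma_i = sample_cov(M_i)         # subject-level sample covariance
theta = [1 / math.sqrt(p)] * p    # hypothetical unit-norm loading vector
m_i = log_projected_variance(Sigma_i, theta)
print(len(M_i), round(m_i, 3))
```

In the application the same computation would run over |$ p\,=\,75 $| regions per subject, with the loading vector estimated rather than fixed in advance.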

We first examine the assumptions imposed in Section 2.4 and conclude that at least the assumption of partial common eigenstructure is satisfied (see Section S4.2). Using the |$ \mathrm{DfD}\leq 2 $| criterion in (2.15), the proposed approach identifies four components. Table 2 presents the estimated AIE and the |$ \alpha $| and |$ \beta $| coefficients, together with the confidence intervals obtained from |$ 500 $| bootstrap samples. The components are identified in order by model fitting, so not all of them are expected to have a significant mediation effect. Among these, the third component (|$ C_{3} $|) shows a significantly positive AIE with both |$ \alpha $| and |$ \beta $| negative. Figure 3 presents the sparsified loading profile of |$ \boldsymbol{\theta}_{3} $| and the corresponding brain map. A sensitivity analysis (Section S4.6) shows that the conclusion about the AIE still holds when the sensitivity parameter is positive, that is, when the unmeasured confounding effects on the mediator and the outcome are in the same direction. Section S4.7 compares these results with those from the CAP-Med approach introduced in Section 3; the component identified there with a significant AIE is highly similar to |$ C_{3} $|. In Section S4.8, the sparse principal component based mediation analysis approach of Zhao et al. (2020) is implemented, and its findings are consistent with those of the proposed approach.

In |$ C_{3} $|, the four regions with a nonzero loading are all from the limbic-system network: the temporal pole (left and right) and the medial orbitofrontal cortex (left and right). The (weighted) functional connectivity within this network is lower in males than in females, and this lower functional connectivity increases the reaction time. The temporal pole has been found to be associated with high-level cognitive functions (Herlin et al. 2021). Although no direct relation to reaction time has been established, an indirect influence has been hypothesized through its contribution to processes involving decision-making, response selection, and emotion evaluation (Pessoa 2010). One of the primary functions of the medial orbitofrontal cortex is to integrate emotional reactions with sensory and/or contextual stimuli; it plays a role in reward processing and value-based decision making, which allows individuals to make adaptive responses to stimuli based on their emotional significance (Rudebeck and Murray 2014). Thus, activation in this area may indirectly prolong the reaction time. Regional sex differences in the temporal and frontal cortices have been observed in the developing brain using multiple imaging modalities. Sex differences in the brain have also been suggested to be relevant to the symptomatic sex differences in psychiatric disorders (Kaczkurkin et al. 2019). Through mediation analysis, the proposed approach offers a way of articulating the underlying mechanism.
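
To make the bootstrap step concrete, the sketch below estimates an indirect effect as the product of two regression slopes on simulated data and resamples subjects 500 times to obtain a standard error and a percentile interval. It is only a schematic: the scalar mediator stands in for a single component score, the age adjustment is omitted, the true coefficients are invented, and the percentile interval is one common choice rather than the procedure confirmed by the text.

```python
import random
import statistics

def fit_alpha_beta(x, m, y):
    """OLS slopes: alpha from m ~ x, beta = coefficient of m in y ~ x + m."""
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    smm = sum((b - mm) ** 2 for b in m)
    sxm = sum((a - mx) * (b - mm) for a, b in zip(x, m))
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    smy = sum((b - mm) * (c - my) for b, c in zip(m, y))
    alpha = sxm / sxx
    beta = (smy * sxx - sxm * sxy) / (smm * sxx - sxm ** 2)
    return alpha, beta

# Simulated data with hypothetical true values alpha = -0.2, beta = -0.3.
random.seed(2024)
n, alpha0, beta0, gamma0 = 600, -0.2, -0.3, -0.3
x = [random.randint(0, 1) for _ in range(n)]
m = [alpha0 * xi + random.gauss(0.0, 1.0) for xi in x]
y = [gamma0 * xi + beta0 * mi + random.gauss(0.0, 1.0) for xi, mi in zip(x, m)]

a_hat, b_hat = fit_alpha_beta(x, m, y)
aie_hat = a_hat * b_hat

# Nonparametric bootstrap: resample subjects with replacement.
boot = []
for _ in range(500):
    idx = [random.randrange(n) for _ in range(n)]
    a_b, b_b = fit_alpha_beta([x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx])
    boot.append(a_b * b_b)

boot.sort()
se = statistics.stdev(boot)
ci = (boot[12], boot[486])  # approximate 2.5% and 97.5% quantiles of 500 draws
print(round(aie_hat, 3), round(se, 3), tuple(round(v, 3) for v in ci))
```

With both slopes negative, the product is positive, which mirrors the sign pattern reported for |$ C_{3} $|.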

Fig. 3.

The sparsified loading profile and brain map of the component with a significant AIE (⁠|$ C_{3} $|⁠).

Table 2.

Estimated average indirect effect (AIE) and |$ \alpha $| and |$ \beta $| coefficient of the identified mediator components in the NCANDA dataset.^a

Component	AIE			\|$ \alpha $\|			\|$ \beta $\|
	Est. (SE)	P-value	\|$ 95\% $\| CI	Est. (SE)	P-value	\|$ 95\% $\| CI	Est. (SE)	P-value	\|$ 95\% $\| CI
\|$ C_{1} $\|	\|$ -0.018 $\| (\|$ 0.031 $\|)	\|$ 0.574 $\|	\|$ (-0.079,0.044) $\|	\|$ -0.325 $\| (\|$ 0.069 $\|)	\|$ \lt 0.001 $\|	\|$ (-0.460,-0.190) $\|	\|$ 0.053 $\| (\|$ 0.094 $\|)	\|$ 0.572 $\|	\|$ (-0.131,0.237) $\|
\|$ C_{2} $\|	\|$ -0.014 $\| (\|$ 0.032 $\|)	\|$ 0.676 $\|	\|$ (-0.077,0.050) $\|	\|$ -0.286 $\| (\|$ 0.065 $\|)	\|$ \lt 0.001 $\|	\|$ (-0.414,-0.157) $\|	\|$ 0.051 $\| (\|$ 0.110 $\|)	\|$ 0.646 $\|	\|$ (-0.166,0.267) $\|
\|$ C_{3} $\|	\|$ 0.066 $\| (\|$ 0.027 $\|)	\|$ 0.014 $\|	\|$ (0.013,0.119) $\|	\|$ -0.211 $\| (\|$ 0.064 $\|)	\|$ \lt 0.001 $\|	\|$ (-0.336,-0.086) $\|	\|$ -0.318 $\| (\|$ 0.094 $\|)	\|$ 0.001 $\|	\|$ (-0.503,-0.134) $\|
\|$ C_{4} $\|	\|$ -0.035 $\| (\|$ 0.033 $\|)	\|$ 0.287 $\|	\|$ (-0.100,0.030) $\|	\|$ 0.256 $\| (\|$ 0.042 $\|)	\|$ \lt 0.001 $\|	\|$ (0.173,0.338) $\|	\|$ -0.138 $\| (\|$ 0.127 $\|)	\|$ 0.279 $\|	\|$ (-0.388,0.112) $\|

^a Confidence intervals are constructed from |$ 500 $| bootstrap samples. Est.: estimate; SE: standard error; CI: confidence interval.
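
A quick arithmetic check of Table 2 is sketched below in Python. It compares each reported AIE with the product of the reported |$ \alpha $| and |$ \beta $| estimates, and each reported interval with the estimate plus or minus 1.96 standard errors. Both comparisons rest on assumptions not stated in this section: that the AIE takes the product-of-coefficients form, and that the intervals are normal-approximation intervals built from a bootstrap standard error. The numbers are transcribed from the table; the variable names are illustrative.

```python
# (AIE estimate, AIE SE, reported 95% CI, alpha estimate, beta estimate)
table2 = {
    "C1": (-0.018, 0.031, (-0.079, 0.044), -0.325, 0.053),
    "C2": (-0.014, 0.032, (-0.077, 0.050), -0.286, 0.051),
    "C3": (0.066, 0.027, (0.013, 0.119), -0.211, -0.318),
    "C4": (-0.035, 0.033, (-0.100, 0.030), 0.256, -0.138),
}

checks = {}
for name, (aie, se, ci, alpha, beta) in table2.items():
    product = alpha * beta                       # product-of-coefficients value
    wald = (aie - 1.96 * se, aie + 1.96 * se)    # normal-approximation interval
    checks[name] = {
        "product_gap": abs(aie - product),
        "ci_gap": max(abs(wald[0] - ci[0]), abs(wald[1] - ci[1])),
    }
    print(f"{name}: AIE={aie:+.3f}  alpha*beta={product:+.3f}  "
          f"Wald=({wald[0]:+.3f}, {wald[1]:+.3f})  reported={ci}")
```

Under these assumptions every gap is below 0.002, which is within what rounding of the published three-decimal values would produce. Only |$ C_{3} $| has an interval that excludes zero.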

5. DISCUSSION

This study introduces a mediation analysis framework for settings in which the mediator is a graph. A Gaussian covariance graph model is assumed for graph representation. Causal estimands and assumptions are discussed. With a covariance matrix as the mediator, parametric mediation models are introduced based on matrix decomposition. Assuming Gaussian random errors, likelihood-based estimators are proposed to simultaneously identify the decomposition and the causal parameters. An efficient computational algorithm is proposed and the asymptotic properties are investigated. The performance of the proposed approach is evaluated via simulation studies. Applied to a resting-state fMRI study, the approach identifies a brain network within which functional connectivity mediates the sex difference in the performance of a motor task.

In causal mediation analysis, an essential but untestable assumption is that there is no unmeasured mediator-outcome confounding. A sensitivity analysis is usually conducted to assess how robust the conclusion is to violations of this assumption. For parametric approaches, a commonly used strategy is to parametrize the confounding effect and evaluate the causal effects under various values of the parameter, as in Imai et al. (2010). Simulation studies demonstrate that the proposed approach is robust to unmeasured mediator-outcome confounding in identifying the mediation components, |$ \boldsymbol{\theta} $|. Given |$ \boldsymbol{\theta} $|, one can employ the approach of Imai et al. (2010) for sensitivity analysis.
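
The idea of parametrizing the confounding effect can be sketched for the simplest case of a scalar mediator with linear models, which is the setting of Imai et al. (2010). Let |$ \rho $| be the correlation between the mediator-model error and the outcome-model error, and let |$ \tilde{\rho} $| be the correlation between the mediator-model error and the error of the outcome model that omits the mediator. Solving the two linear models for the mediator coefficient gives the indirect effect as a function of |$ \rho $|. The Python sketch below evaluates that expression at population values; all numerical values and function names are hypothetical, and the scalar mediator stands in for a component score computed at a given |$ \boldsymbol{\theta} $|.

```python
import math

def aie_given_rho(alpha, sigma_total, sigma_m, rho_tilde, rho):
    """Indirect effect implied by a value of the sensitivity parameter rho.

    alpha        exposure coefficient in the mediator model
    sigma_total  error SD of the outcome model that omits the mediator
    sigma_m      error SD of the mediator model
    rho_tilde    correlation between those two errors
    rho          assumed correlation between mediator and outcome errors
    """
    scale = math.sqrt((1.0 - rho_tilde ** 2) / (1.0 - rho ** 2))
    return alpha * sigma_total / sigma_m * (rho_tilde - rho * scale)

# Hypothetical population: true alpha, beta, error SDs, and confounding rho.
alpha, beta = -0.2, -0.3
sigma_m, sigma_y, rho_true = 1.0, 1.0, 0.2

# Error of the model without the mediator is beta * e_m + e_y.
sigma_total = math.sqrt(beta ** 2 * sigma_m ** 2 + sigma_y ** 2
                        + 2 * beta * rho_true * sigma_m * sigma_y)
rho_tilde = (beta * sigma_m ** 2 + rho_true * sigma_m * sigma_y) / (sigma_total * sigma_m)

aie_naive = aie_given_rho(alpha, sigma_total, sigma_m, rho_tilde, 0.0)
aie_true = aie_given_rho(alpha, sigma_total, sigma_m, rho_tilde, rho_true)

for rho in (-0.4, -0.2, 0.0, 0.2, 0.4):
    value = aie_given_rho(alpha, sigma_total, sigma_m, rho_tilde, rho)
    print(f"rho={rho:+.1f}  AIE={value:+.4f}")
```

Setting |$ \rho $| to its true value recovers the product |$ \alpha\beta $|, while |$ \rho = 0 $| gives the value an analysis ignoring the confounding would report. In this toy configuration with negative |$ \alpha $|, the implied indirect effect increases with |$ \rho $|, so a positive finding obtained at |$ \rho = 0 $| stays positive for positive |$ \rho $|.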

The consistency of the proposed estimator requires the common diagonalization assumption on the covariance matrices. Via simulation studies, Zhao et al. (2021) pointed out that this assumption can be relaxed to partial common diagonalization. The robustness of the proposed approach to this relaxation is also verified by the simulation studies presented in Section S3.2. The proposed framework assumes that the identified components capture the mediating role of the graph mediator. This low-rank representation may underfit the data. A procedure for evaluating model fit is not yet available for covariance regression, nor for mediation analysis. We leave the investigation of goodness of fit to future research.

Considering a graph mediator under the Gaussian covariance graph model, this study assumes that the number of nodes in the graph is fixed and low dimensional. The sample covariance matrices are thus well-conditioned, and a likelihood-based approach is introduced to estimate the model parameters. In many practical settings, for example, in voxel-level fMRI analysis, the data dimension can be even higher than the number of fMRI data points. A well-conditioned estimator of the covariance matrix is then required, and we leave the introduction of such an estimator and the study of its theoretical properties as a future direction. As discussed in Section 2.3, inference on projection vectors is not straightforward and requires rigorous theoretical and numerical investigations, which we also leave to future research.

ACKNOWLEDGMENTS

Data presented in this study were downloaded from the ongoing National Consortium on Alcohol and NeuroDevelopment in Adolescence (NCANDA) study.

SUPPLEMENTARY MATERIAL

Supplementary material is available at Biostatistics Journal online.

FUNDING

This work was partially supported by NIH grants [R01MH126970, P30AG072976, and U54AG065181 to Y.Z.].

CONFLICT OF INTEREST

None declared.

DATA AVAILABILITY

Software in the form of R code, together with a sample input dataset and complete documentation, is available on GitHub at https://github.com/zhaoyi1026/GMed.

References

Alarcón G, Pfeifer JH, Fair DA, Nagel BJ. 2018. Adolescent gender differences in cognitive control performance and functional connectivity between default mode and fronto-parietal networks within a self-referential context. Front Behav Neurosci. 12:73.

Anderson TW. 1973. Asymptotically efficient estimation of covariance matrices with linear structure. Ann Statist. 1:135–141.

Andrews RM, Didelez V. 2021. Insights into the cross-world independence assumption of causal mediation analysis. Epidemiology. 32:209–219.

Baron RM, Kenny DA. 1986. The moderator–mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J Pers Soc Psychol. 51:1173–1182.

Boik RJ. 2002. Spectral models for covariance matrices. Biometrika. 89:159–182.

Chaudhuri S, Drton M, Richardson TS. 2007. Estimation of a covariance matrix with zeros. Biometrika. 94:199–216.

Che C, Jin IH, Zhang Z. 2021. Network mediation analysis using model-based eigenvalue decomposition. Struct Eqn Model Multidiscip J. 28:148–161.

Chen M, Zhou Y. 2024. Causal mediation analysis with a three-dimensional image mediator. Stat Med. 43:2869–2893.

Chén OY et al. 2018. High-dimensional multivariate mediation with application to neuroimaging data. Biostatistics. 19:121–136.

Chiu TY, Leonard T, Tsui K-W. 1996. The matrix-logarithmic covariance model. J Am Stat Assoc. 91:198–210.

Derkach A, Pfeiffer RM, Chen T-H, Sampson JN. 2019. High dimensional mediation analysis with latent variables. Biometrics. 75:745–756.

Edwards D. 2012. Introduction to graphical modelling. Springer Science & Business Media.

Esnaola I, Sesé A, Antonio-Agirre I, Azpiazu L. 2020. The development of multiple self-concept dimensions during adolescence. J Res Adolesc. 30:100–114.

Flury BN. 1984. Common principal components in |$ k $| groups. J Am Stat Assoc. 79:892–898.

Franks AM, Hoff P. 2019. Shared subspace models for multi-group covariance estimation. J Mach Learn Res. 20:1–37.

Friston KJ. 2011. Functional and effective connectivity: a review. Brain Connect. 1:13–36.

Gu F, Preacher KJ, Ferrer E. 2014. A state space modeling approach to mediation analysis. J Educ Behav Stat. 39:117–143.

Herlin B, Navarro V, Dupont S. 2021. The temporal pole: from anatomy to function—a literature appraisal. J Chem Neuroanat. 113:101925.

Hoff PD. 2009. A hierarchical eigenmodel for pooled covariance estimation. J R Stat Soc Ser B (Stat Methodol). 71:971–992.

Hoff PD, Niu X. 2012. A covariance regression model. Stat Sinica. 22:729–753.

Huang Y-T, Pan W-C. 2016. Hypothesis test of mediation effect in causal mediation model with high-dimensional continuous mediators. Biometrics. 72:402–413.

Imai K, Keele L, Yamamoto T. 2010. Identification, inference and sensitivity analysis for causal mediation effects. Statist Sci. 25:51–71.

Jiang S, Colditz GA. 2023. Causal mediation analysis using high-dimensional image mediator bounded in irregular domain with an application to breast cancer. Biometrics. 79:3728–3738.

Kaczkurkin AN, Raznahan A, Satterthwaite TD. 2019. Sex differences in the developing brain: insights from multimodal neuroimaging. Neuropsychopharmacology. 44:71–85.

Krzanowski WJ. 1984. Principal component analysis in the presence of group structure. J R Stat Soc Ser C (Appl Stat). 33:164–168.

Lee Y, Nelder JA. 1996. Hierarchical generalized linear models. J R Stat Soc Ser B (Methodological). 58:619–656.

Lindquist MA. 2008. The statistical analysis of fMRI data. Statist Sci. 23:439–464.

Lindquist MA. 2012. Functional causal mediation analysis with an application to brain connectivity. J Am Stat Assoc. 107:1297–1309.

Liu H, Jin IH, Zhang Z, Yuan Y. 2021. Social network mediation analysis: a latent space approach. Psychometrika. 86:272–298.

Pearl J. 2001. Direct and indirect effects. Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers Inc. p. 411–420.

Pervaiz U, Vidaurre D, Woolrich MW, Smith SM. 2020. Optimising network modelling methods for fMRI. Neuroimage. 211:116604.

Pessoa L. 2010. Emotion and cognition and the amygdala: from “what is it?” to “what’s to be done?”. Neuropsychologia. 48:3416–3429.

Pourahmadi M, Daniels MJ, Park T. 2007. Simultaneous modelling of the Cholesky decomposition of several covariance matrices. J Multivariate Anal. 98:568–587.

Richardson T, Spirtes P. 2002. Ancestral graph Markov models. Ann Stat. 962–1030.

Richardson TS, Robins JM. 2013. Single world intervention graphs (swigs): a unification of the counterfactual and graphical approaches to causality. Center for the Statistics and the Social Sciences, University of Washington Series. Working Paper. p. 128.

Robins JM, Greenland S. 1992. Identifiability and exchangeability for direct and indirect effects. Epidemiology. 3:143–155.

Robins JM, Richardson TS. 2010. Alternative graphical causal models and the identification of direct effects. Causal Psychopathol Finding Determinants Disorders Cures. 103–158.

Rubin DB. 1980. Randomization analysis of experimental data: the Fisher randomization test comment. J Am Stat Assoc. 75:591–593.

Rudebeck PH, Murray EA. 2014. The orbitofrontal oracle: cortical mechanisms for the prediction and evaluation of specific behavioral outcomes. Neuron. 84:1143–1156.

Seiler C, Holmes S. 2017. Multivariate heteroscedasticity models for functional brain connectivity. Front Neurosci. 11:696.

Tibshirani R, Saunders M, Rosset S, Zhu J, Knight K. 2005. Sparsity and smoothness via the fused lasso. J R Stat Soc Ser B (Stat Methodol). 67:91–108.

VanderWeele TJ. 2016. Mediation analysis: a practitioner’s guide. Annu Rev Public Health. 37:17–32.

Zeng S, Rosenbaum S, Alberts SC, Archie EA, Li F. 2021. Causal mediation analysis for sparse and irregular longitudinal data. Ann Appl Stat. 15:747–767.

Zhao Y et al. 2022. Bayesian network mediation analysis with application to the brain functional connectome. Stat Med. 41:3991–4005.

Zhao Y, Lindquist MA, Caffo BS. 2020. Sparse principal component based high-dimensional mediation analysis. Comput Stat Data Anal. 142:106835.

Zhao Y, Luo X. 2019. Granger mediation analysis of multiple time series with an application to functional magnetic resonance imaging. Biometrics. 75:788–798.

Zhao Y, Luo X. 2022. Pathway lasso: pathway estimation and selection with high dimensional mediators. Stat Interface. 15:39–50.

Zhao Y, Luo X, Lindquist M, Caffo B. 2018. Functional mediation analysis with an application to functional magnetic resonance imaging data. arXiv preprint arXiv:1805.06923.

Zhao Y, Wang B, Mostofsky SH, Caffo BS, Luo XI. 2021. Covariate assisted principal regression for covariance matrix outcomes. Biostatistics. 22:629–645.

Zou T, Lan W, Wang H, Tsai C-L. 2017. Covariance regression analysis. J Am Stat Assoc. 112:266–281.