Group Variable Selection for the Cox Model with Interval-Censored Failure Time Data

Table 1

Simulation results based on current status data under the first covariate setting with formula ⁠, formula ⁠, and formula ⁠.


Penalty function	RMSE	TP individual	FP individual	TP group	FP group
Group BAR	0.831	3.744	1.01	1.872	0.062
Group LASSO	0.884	3.316	1.358	1.658	0.282
Group ALASSO	0.671	3.852	1.592	1.926	0.256
Group MCP	0.911	3.656	1.102	1.828	0.124
Group SCAD	0.901	3.644	1.292	1.822	0.206
Group SELO	0.709	3.708	0.986	1.854	0.054
Group SICA	0.617	3.744	1.138	1.872	0.106


Penalty function	RMSE	TP individual	FP individual	TP group	FP group
Group BAR	0.667	3.784	1.012	1.892	0.054
Group LASSO	0.805	3.588	1.35	1.794	0.236
Group ALASSO	0.606	3.88	1.488	1.94	0.202
Group MCP	0.796	3.692	1.11	1.846	0.116
Group SCAD	0.836	3.612	1.13	1.806	0.14
Group SELO	0.599	3.78	1.046	1.89	0.07
Group SICA	0.588	3.748	1.062	1.874	0.082


Table 2

Simulation results based on general interval-censored data under the first covariate setting with formula ⁠, formula ⁠, and formula ⁠.


Penalty function	RMSE	TP individual	FP individual	TP group	FP group
Group BAR	0.471	3.976	1.088	1.988	0.042
Group LASSO	0.666	3.94	1.836	1.97	0.37
Group ALASSO	0.463	3.992	1.342	1.996	0.132
Group MCP	0.53	3.956	1.15	1.978	0.082
Group SCAD	0.512	3.944	1.216	1.972	0.108
Group SELO	0.444	3.968	1.102	1.984	0.052
Group SICA	0.446	3.984	1.114	1.992	0.054


Penalty function	RMSE	TP individual	FP individual	TP group	FP group
Group BAR	0.452	3.974	1.086	1.988	0.046
Group LASSO	0.68	3.872	1.688	1.936	0.32
Group ALASSO	0.461	3.976	1.2	1.988	0.09
Group MCP	0.489	3.984	1.172	1.992	0.086
Group SCAD	0.498	3.932	1.218	1.966	0.112
Group SELO	0.437	3.972	1.088	1.986	0.042
Group SICA	0.448	3.952	1.126	1.976	0.066


Tables 3 and 4 give the results obtained by the proposed variable selection procedure based on interval-censored data under the second and third covariate settings, respectively, with the other setups being the same as in Table 2. The results are similar to those given in Table 2 and again suggest that the proposed method performed well. In addition, the group BAR performed clearly better than the other group penalty functions in terms of RMSE, FP individual, and FP group.
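As an illustration of how selection counts of this kind can be computed for a single replication, the following is a minimal sketch under one common reading of TP and FP at the covariate and group levels. The function name, the `groups` encoding, and the toy coefficients are assumptions made for this example; the definitions behind the tabulated values are those given earlier in the paper and may differ in detail.

```python
import numpy as np

def selection_counts(beta_hat, beta_true, groups, tol=1e-8):
    """Count true/false positives at the covariate and group level.

    groups[j] is the group label of covariate j (hypothetical encoding).
    A group counts as selected when any of its estimated coefficients is
    nonzero, and as truly active when any true coefficient is nonzero.
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    beta_true = np.asarray(beta_true, dtype=float)
    groups = np.asarray(groups)

    selected = np.abs(beta_hat) > tol   # covariates kept by the procedure
    active = beta_true != 0             # covariates with a true effect

    tp_ind = int(np.sum(selected & active))
    fp_ind = int(np.sum(selected & ~active))

    tp_grp = 0
    fp_grp = 0
    for g in np.unique(groups):
        members = groups == g
        grp_selected = bool(selected[members].any())
        grp_active = bool(active[members].any())
        tp_grp += int(grp_selected and grp_active)
        fp_grp += int(grp_selected and not grp_active)

    return {"TP individual": tp_ind, "FP individual": fp_ind,
            "TP group": tp_grp, "FP group": fp_grp}

# Toy example: three groups of two covariates; groups 0 and 1 are truly active.
beta_true = [0.5, 0.5, -0.5, -0.5, 0.0, 0.0]
beta_hat = [0.4, 0.6, 0.0, 0.0, 0.1, 0.0]
groups = [0, 0, 1, 1, 2, 2]
counts = selection_counts(beta_hat, beta_true, groups)
print(counts)
```

Averaging such counts over the simulation replications would give table entries of the form reported here.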

Table 3

Simulation results based on general interval-censored data under the second covariate setting with formula ⁠, formula ⁠, and formula ⁠.


Penalty function	RMSE	TP individual	FP individual	TP group	FP group
Group BAR	1.293	4	1.108	2	0.052
Group LASSO	4.598	3.992	6.632	1.996	2.222
Group ALASSO	3.287	4	6.902	2	2.338
Group MCP	2.004	4	2.078	2	0.416
Group SCAD	2.105	4	2.29	2	0.496
Group SELO	2.237	4	4.194	2	1.192
Group SICA	2.542	4	4.63	2	1.372


Penalty function	RMSE	TP individual	FP individual	TP group	FP group
Group BAR	1.474	3.998	1.288	2	0.124
Group LASSO	3.504	4	7.276	2	2.478
Group ALASSO	2.108	4	2.074	2	0.386
Group MCP	2.247	4	2.456	2	0.56
Group SCAD	2.447	4	2.854	2	0.718
Group SELO	2.482	4	4.184	2	1.178
Group SICA	2.821	4	4.572	2	1.352


Table 4

Simulation results based on general interval-censored data under the third covariate setting with formula ⁠, formula ⁠, and formula ⁠.


Penalty function	RMSE	TP individual	FP individual	TP group	FP group
Group BAR	1.465	3.996	1.162	2	0.068
Group LASSO	2.879	3.992	6.63	1.996	2.106
Group ALASSO	1.867	4	2.078	2	0.388
Group MCP	1.705	4	2.286	2	0.444
Group SCAD	1.645	4	2.824	2	0.624
Group SELO	2.064	4	3.75	2	0.94
Group SICA	2.082	4	4.118	2	1.076


Penalty function	RMSE	TP individual	FP individual	TP group	FP group
Group BAR	1.339	4	1.12	2	0.052
Group LASSO	2.853	3.996	6.592	1.998	2.09
Group ALASSO	2.071	4	2.242	2	0.456
Group MCP	1.779	4	2.746	2	0.6
Group SCAD	1.848	4	3.082	2	0.716
Group SELO	2.024	4	3.726	2	0.94
Group SICA	2.247	4	4.024	2	1.04


In the above, about 25% of the observations are right-censored. As suggested by a reviewer, we also investigated the situation with about 50% right-censored observations, and the results are given in Table 5. Here the other setups are the same as in Table 2. One can see that the results lead to basically the same conclusions as above and again indicate that the proposed procedure performed well. In particular, the relative performance of the different penalty functions is the same as before, and the group BAR was the best choice among those considered from the TP and FP points of view.

Table 5

Simulation results based on general interval-censored data under the first covariate setting with formula ⁠, formula ⁠, and formula and 50% right censoring rate.


Penalty function	RMSE	TP individual	FP individual	TP group	FP group
Group BAR	0.468	3.98	1.07	1.99	0.038
Group LASSO	0.643	3.856	1.582	1.928	0.278
Group ALASSO	0.468	3.96	1.272	1.98	0.112
Group MCP	0.511	3.924	1.198	1.962	0.104
Group SCAD	0.543	3.896	1.264	1.948	0.144
Group SELO	0.434	3.98	1.08	1.99	0.04
Group SICA	0.446	3.948	1.114	1.974	0.06


Penalty function	RMSE	TP individual	FP individual	TP group	FP group
Group BAR	0.478	3.944	1.062	1.972	0.042
Group LASSO	0.662	3.812	1.642	1.906	0.314
Group ALASSO	0.498	3.916	1.276	1.958	0.124
Group MCP	0.494	3.908	1.158	1.954	0.092
Group SCAD	0.558	3.832	1.34	1.916	0.19
Group SELO	0.468	3.928	1.09	1.964	0.056
Group SICA	0.466	3.888	1.07	1.944	0.058


To investigate the performance of the proposed method in high-dimensional situations, we repeated the study giving the results in Table 4 by setting with the first 20 covariates being continuous and the other being discrete and both types of covariates being generated in the same way as with Table 4. The obtained variable selection results are provided in Table 6 with the true value of being and ⁠. That is, we have that with eight nonzero covariates. Although the overall conclusions are similar to those given in Table 4, it seems that the proposed method with the group BAR gave much more stable results in terms of both TP individual and TP group than with the other group penalty functions. We also considered other setups such as and obtained similar results.

Table 6

Simulation results based on general interval-censored data under the third covariate setting with formula ⁠, formula ⁠, and formula ⁠.


Penalty function	RMSE	TP individual	FP individual	TP group	FP group
Group BAR	1.077	7.986	2.322	4	0.184
Group LASSO	3.862	7.4	12.28	3.7	4.144
Group ALASSO	2.376	7.992	3.524	3.996	0.534
Group MCP	1.528	7.728	5.828	3.864	1.51
Group SCAD	1.383	7.688	6.986	3.844	1.958
Group SELO	2.47	7.784	6.546	3.892	1.732
Group SICA	2.804	7.676	7.936	3.838	2.318


Penalty function	RMSE	TP individual	FP individual	TP group	FP group
Group BAR	1.109	7.99	2.436	3.996	0.238
Group LASSO	3.994	7.216	11.646	3.608	3.918
Group ALASSO	2.531	7.964	3.738	3.982	0.624
Group MCP	1.72	7.568	5.994	3.784	1.62
Group SCAD	1.667	7.528	7.468	3.764	2.192
Group SELO	2.72	7.644	6.312	3.822	1.672
Group SICA	2.806	7.604	8.304	3.802	2.478


6 An Application

Now we apply the group variable selection procedure proposed in the previous sections to the ADNI data. The ADNI is an ongoing, prospective, longitudinal multicenter study designed to investigate clinical, imaging, genetic, and biochemical biomarkers for the early detection of AD and the tracking of its progression. In the study, the participants were examined intermittently and, among other measurements, their cognitive conditions, including cognitively normal (CN), mild cognitive impairment (MCI), and AD, were recorded. The study subjects were divided into three groups based on their baseline cognitive conditions, the CN, MCI, and AD groups, and one variable of interest is the time from the baseline visit date to the AD conversion date, the failure time of interest here. Due to the nature of the study, only interval-censored data are available for the AD conversion time.

For the analysis below, following Li et al. (2020), we focus on the 310 participants in the MCI group with complete information on 24 covariates or risk factors. For the AD conversion time, there are 19 left-censored and 173 right-censored observations, and the remaining observations are interval-censored. The 24 risk factors are Gender (1 for male and 0 for female), Marital status (1 for married and 0 otherwise), baseline Age, years of receiving education (PTEDUCAT), mini-mental state examination score (MMSE), apolipoprotein E ε4 (APOEε4), Alzheimer's Disease Assessment Scales scores of 11 and 13 items (ADAS11 and ADAS13), delayed word recall score in ADAS (ADASQ4), Rey auditory verbal learning test score of immediate recall (RAVLT.i), learning ability (RAVLT.l), the total number of words that were forgotten in the RAVLT delayed memory test (RAVLT.f), the percentage of words that were forgotten in the RAVLT delayed memory test (RAVLT.perc.f), the participant's digit symbol substitution test score (DIGITSCOR), trails B score (TRABSCOR), clinical dementia rating scale-sum of boxes score (CDRSB), functional assessment questionnaire score (FAQ), and different types of volumetric data including Hippocampus, Entorhinal, fusiform gyrus (Fusiform), middle temporal gyrus (MidTemp), whole brain (WholeBrain), Ventricles, and intracerebral volume (ICV). In the analysis, we regarded gender and marital status as discrete covariates and the others as continuous covariates, and the continuous covariates were normalized.

It is apparent that among the 24 risk factors, there exist some grouping structures such as groups 1, 5, and 6 defined below. This suggests that it is more appropriate to perform a group analysis, that is, an analysis that takes the grouping structure into account, than the individual analysis as was done in the literature. To apply the proposed group approach, we assigned the 24 risk factors to 10 groups based on the literature and the meanings of the factors. The first group includes Gender and Marital status, related to the lifestyle of the subject, and the second group has only one factor, Age. The third group consists of PTEDUCAT and MMSE, concerning the maturity of patients, and the fourth group also has only one factor, APOEε4. The fifth group includes ADAS11, ADAS13, and ADASQ4, the ADAS group, and the sixth group includes RAVLT.i, RAVLT.l, RAVLT.f, and RAVLT.perc.f, the group on the Rey auditory verbal learning test score. The seventh group consists of DIGITSCOR and TRABSCOR, indicating the ability of a subject in terms of digit identification, and the eighth group has the risk factors CDRSB and FAQ, giving a general summary of the patient's disease condition. The ninth group includes Hippocampus, Entorhinal, Fusiform, and MidTemp, concerning some specific functions like recognition or feeling, and the last group includes the last three risk factors, WholeBrain, Ventricles, and ICV, related to some specific brain functions.
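The group assignment described above can be written down as a simple mapping, which is how grouped-penalty software typically expects it (one group label per covariate column). The following is only a sketch: the variable names are illustrative, and the factor labels follow those used in Table 7.

```python
# Group labels follow the text; the Python names are illustrative only.
RISK_FACTOR_GROUPS = {
    1: ["Gender", "MaritalStatus"],
    2: ["Age"],
    3: ["PTEDUCAT", "MMSE"],
    4: ["APOEε4"],
    5: ["ADAS11", "ADAS13", "ADASQ4"],
    6: ["RAVLT.i", "RAVLT.l", "RAVLT.f", "RAVLT.perc.f"],
    7: ["DIGITSCOR", "TRABSCOR"],
    8: ["CDRSB", "FAQ"],
    9: ["Hippocampus", "Entorhinal", "Fusiform", "MidTemp"],
    10: ["WholeBrain", "Ventricles", "ICV"],
}

# Flatten to one name and one group label per covariate, in column order.
factor_names = [name for members in RISK_FACTOR_GROUPS.values() for name in members]
group_index = [g for g, members in RISK_FACTOR_GROUPS.items() for _ in members]

# Group sizes, useful for size-dependent penalty weights if a method uses them.
group_sizes = {g: len(members) for g, members in RISK_FACTOR_GROUPS.items()}

print(len(factor_names), len(RISK_FACTOR_GROUPS))
```

The flattened `group_index` makes it easy to check that every one of the 24 factors belongs to exactly one of the 10 groups before fitting.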

Table 7 presents the results of the group covariate selection given by the proposed sieve penalized maximum likelihood procedure. For each selected risk factor, the results include the estimated effect along with the estimated standard error determined by the simple bootstrap procedure based on 100 bootstrap samples. For comparison, we also obtained and include in the table the results, referred to as individual BAR, given by the individual variable selection method of Zhao et al. (2020) based on the BAR penalty function. One can see from the table that on the risk factor selection, the proposed method yielded basically the same results under all penalty functions, except that the LASSO and ALASSO penalties selected more groups and factors, as expected. On the other hand, it seems that these additional groups or factors did not have any effect on the AD conversion. Comparing the results given by the group and individual variable selection methods, the latter selected a smaller number of risk factors, as expected, since it treats all risk factors independently. In other words, the results indicate that in the presence of group structures, the group variable selection can give more reasonable results.

Table 7

Selected factors and estimated covariate effects for the ADNI study based on 10 groups.

Risk factors	Groups	Individual BAR	Group BAR	Group LASSO	Group ALASSO	Group MCP	Group SCAD	Group SELO	Group SICA
Gender	1	-	-	-	-	-	-	-	-
MaritalStatus	1	-	-	-	-	-	-	-	-
Age	2	−0.259(0.191)	−0.295(0.205)	−0.247(0.117)	−0.218(0.152)	−0.325(0.211)	−0.325(0.205)	−0.307(0.206)	−0.304(0.21)
PTEDUCAT	3	-	-	0.011(0.076)	-	-	-	-	-
MMSE	3	-	-	−0.07(0.093)	-	-	-	-	-
APOEε4	4	0.245(0.162)	0.236(0.164)	0.21(0.088)	0.178(0.117)	0.257(0.141)	0.268(0.142)	0.248(0.148)	0.245(0.15)
ADAS11	5	-	-	0.068(0.065)	0.06(0.236)	-	-	-	-
ADAS13	5	0.208(0.201)	-	0.085(0.054)	0.157(0.308)	-	-	-	-
ADASQ4	5	-	-	0.052(0.061)	0.048(0.138)	-	-	-	-
RAVLT.i	6	−0.603(0.184)	−0.639(0.182)	−0.459(0.115)	−0.511(0.158)	−0.656(0.27)	−0.661(0.226)	−0.647(0.182)	−0.645(0.183)
RAVLT.l	6	-	0.207(0.166)	0.119(0.098)	0.189(0.152)	0.232(0.162)	0.229(0.162)	0.226(0.178)	0.225(0.178)
RAVLT.f	6	-	−0.208(0.262)	−0.111(0.099)	−0.195(0.207)	−0.224(0.219)	−0.217(0.236)	−0.225(0.288)	−0.226(0.285)
RAVLT.perc.f	6	-	0.303(0.286)	0.201(0.095)	0.29(0.214)	0.306(0.247)	0.297(0.252)	0.315(0.3)	0.317(0.3)
DIGITSCOR	7	-	-	−0.066(0.098)	-	-	-	-	-
TRABSCOR	7	-	-	0.041(0.064)	-	-	-	-	-
CDRSB	8	-	0.138(0.130)	0.104(0.081)	0.12(0.094)	0.137(0.123)	0.137(0.127)	0.139(0.139)	0.139(0.134)
FAQ	8	0.298(0.158)	0.232(0.144)	0.198(0.1)	0.183(0.131)	0.248(0.156)	0.248(0.155)	0.24(0.167)	0.238(0.169)
Hippocampus	9	-	−0.194(0.177)	−0.129(0.092)	−0.125(0.137)	−0.218(0.179)	−0.215(0.179)	−0.209(0.18)	−0.207(0.182)
Entorhinal	9	−0.268(0.235)	−0.259(0.176)	−0.192(0.108)	−0.197(0.147)	−0.261(0.161)	−0.26(0.168)	−0.26(0.175)	−0.26(0.174)
Fusiform	9	-	−0.056(0.163)	−0.056(0.095)	−0.074(0.135)	−0.072(0.17)	−0.073(0.168)	−0.065(0.179)	−0.062(0.168)
MidTemp	9	−0.625(0.343)	−0.559(0.225)	−0.381(0.138)	−0.478(0.191)	−0.591(0.257)	−0.591(0.258)	−0.584(0.229)	−0.582(0.237)
WholeBrain	10	-	0.126(0.246)	0.044(0.058)	0.088(0.134)	0.137(0.185)	0.124(0.19)	0.139(0.236)	0.138(0.224)
Ventricles	10	-	0.047(0.117)	0.063(0.055)	0.065(0.082)	0.021(0.083)	0.018(0.095)	0.031(0.106)	0.033(0.095)
ICV	10	0.308(0.234)	0.178(0.211)	0.111(0.078)	0.138(0.134)	0.24(0.195)	0.253(0.209)	0.213(0.229)	0.206(0.22)
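
The standard errors in parentheses come from a simple bootstrap with 100 resamples. As a rough sketch of that idea, the code below resamples subjects with replacement and takes the standard deviation of the re-estimated quantities. The function `bootstrap_se`, its arguments, and the stand-in estimator (column means of simulated data) are assumptions for illustration only; they are not the penalized Cox fit, and how zero estimates in a resample are treated is not specified here.

```python
import numpy as np

def bootstrap_se(data, estimator, n_boot=100, seed=0):
    """Nonparametric bootstrap standard error of estimator(data).

    data: array with one row per subject.
    estimator: callable returning a scalar or 1-D array; a placeholder
    for whatever fitting routine is actually used.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample subjects with replacement
        estimates.append(np.atleast_1d(estimator(data[idx])))
    # Sample standard deviation across bootstrap replicates, per component.
    return np.std(np.stack(estimates), axis=0, ddof=1)

# Stand-in example: 310 simulated subjects, estimator = column means.
rng = np.random.default_rng(1)
toy = rng.normal(size=(310, 2))
se = bootstrap_se(toy, lambda d: d.mean(axis=0), n_boot=100)
print(se)
```

For unit-variance data and 310 subjects, the bootstrap standard error of a mean should be close to 1/sqrt(310), which gives a quick sanity check of the routine.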

To assess how the choice of grouping affects the results and conclusions, we also considered a few other groupings. For example, for the results given in Table 8, we kept groups 1, 5, and 6 as defined above (relabeled as groups 1, 4, and 5 in Table 8) and regrouped the other risk factors into two groups based on the individual variable selection results. More specifically, we put all important factors into one group and the remaining factors into the other group. One can see from Table 8 that, although there are some differences as expected, the results are overall consistent with those given in Table 7, especially for the important factors or groups. The same is true for the other groupings considered, which suggests that the proposed group variable selection procedure is valid and works well.
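
The regrouping described above can be sketched as follows. This is only an illustration: it assumes that "important" means selected by the individual BAR fit (a nonzero entry in the BAR column of Table 8). The variable names and group labels are taken from Table 8, while the helper `regroup` is hypothetical and is not code from the study.

```python
# Groups kept from the original grouping (labels as in the Groups column of Table 8).
KEPT_GROUPS = {
    1: ["Gender", "MaritalStatus"],
    4: ["ADAS11", "ADAS13", "ADASQ4"],
    5: ["RAVLT.i", "RAVLT.l", "RAVLT.f", "RAVLT.perc.f"],
}

# Remaining risk factors, in the order they appear in Table 8.
OTHER_FACTORS = [
    "Age", "PTEDUCAT", "MMSE", "APOEε4", "DIGITSCOR", "TRABSCOR", "CDRSB",
    "FAQ", "Hippocampus", "Entorhinal", "Fusiform", "MidTemp",
    "WholeBrain", "Ventricles", "ICV",
]

# Factors with a nonzero estimate in the BAR column of Table 8
# (assumed here to define the "important" factors).
BAR_SELECTED = {
    "Age", "APOEε4", "ADAS13", "RAVLT.i", "FAQ", "Entorhinal", "MidTemp", "ICV",
}


def regroup(kept_groups, other_factors, selected):
    """Map each risk factor to a group label.

    Factors in kept_groups keep their label; every other factor goes to
    group 2 if it was individually selected and to group 3 otherwise.
    """
    groups = {}
    for label, members in kept_groups.items():
        for name in members:
            groups[name] = label
    for name in other_factors:
        groups[name] = 2 if name in selected else 3
    return groups


groups = regroup(KEPT_GROUPS, OTHER_FACTORS, BAR_SELECTED)
for name, label in groups.items():
    print(f"{name}: group {label}")
```

Under this assumption the resulting labels agree with the Groups column of Table 8. Note that ADAS13 and RAVLT.i stay in their original groups even though the individual BAR fit selects them.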

Table 8

Selected factors and estimated covariate effects for the ADNI study based on five groups.

Risk factors	Groups	BAR	Group BAR	Group LASSO	Group ALASSO	Group MCP	Group SCAD	Group SELO	Group SICA
Gender	1	-	-	-	-	-	-	-	-
MaritalStatus	1	-	-	-	-	-	-	-	-
Age	2	−0.259(0.191)	−0.343(0.085)	−0.24(0.085)	−0.295(0.115)	−0.3(0.118)	−0.3(0.117)	−0.352(0.125)	−0.352(0.124)
PTEDUCAT	3	-	−0.025(0.056)	0.008(0.056)	−0.005(0.087)	-	-	−0.031(0.108)	−0.031(0.107)
MMSE	3	-	−0.149(0.07)	−0.071(0.07)	−0.091(0.115)	-	-	−0.159(0.142)	−0.16(0.145)
APOEε4	2	0.245(0.162)	0.262(0.083)	0.223(0.083)	0.252(0.098)	0.277(0.102)	0.277(0.101)	0.262(0.107)	0.262(0.106)
ADAS11	4	-	-	0.068(0.065)	0.046(0.279)	-	-	-	-
ADAS13	4	0.208(0.201)	-	0.086(0.059)	0.097(0.4)	-	-	-	-
ADASQ4	4	-	-	0.054(0.053)	0.042(0.144)	-	-	-	-
RAVLT.i	5	−0.603(0.184)	−0.6(0.109)	−0.444(0.109)	−0.526(0.15)	−0.619(0.232)	−0.619(0.206)	−0.602(0.174)	−0.602(0.17)
RAVLT.l	5	-	0.234(0.087)	0.113(0.087)	0.206(0.134)	0.206(0.133)	0.207(0.13)	0.254(0.151)	0.255(0.153)
RAVLT.f	5	-	−0.199(0.076)	−0.106(0.076)	−0.172(0.16)	−0.201(0.196)	−0.201(0.201)	−0.221(0.228)	−0.223(0.229)
RAVLT.perc.f	5	-	0.302(0.097)	0.201(0.097)	0.267(0.199)	0.32(0.238)	0.32(0.238)	0.32(0.274)	0.321(0.273)
DIGITSCOR	3	-	−0.128(0.068)	−0.064(0.068)	−0.067(0.126)	-	-	−0.142(0.165)	−0.143(0.163)
TRABSCOR	3	-	0.005(0.057)	0.039(0.057)	0.02(0.099)	-	-	−0.008(0.121)	−0.009(0.121)
CDRSB	3	-	0.077(0.055)	0.058(0.055)	0.061(0.101)	-	-	0.078(0.135)	0.078(0.133)
FAQ	2	0.298(0.158)	0.251(0.093)	0.237(0.093)	0.25(0.122)	0.331(0.101)	0.331(0.105)	0.25(0.135)	0.25(0.136)
Hippocampus	3	-	−0.193(0.067)	−0.065(0.067)	−0.088(0.134)	-	-	−0.224(0.185)	−0.227(0.185)
Entorhinal	2	−0.268(0.235)	−0.275(0.123)	−0.226(0.123)	−0.252(0.179)	−0.352(0.175)	−0.352(0.179)	−0.267(0.202)	−0.266(0.201)
Fusiform	3	-	−0.018(0.076)	−0.035(0.076)	−0.031(0.156)	-	-	−0.018(0.198)	−0.018(0.197)
MidTemp	2	−0.625(0.343)	−0.537(0.127)	−0.433(0.127)	−0.515(0.182)	−0.652(0.158)	−0.652(0.164)	−0.544(0.215)	−0.544(0.21)
WholeBrain	3	-	0.014(0.044)	−0.001(0.044)	−0.006(0.153)	-	-	0.024(0.306)	0.025(0.304)
Ventricles	3	-	−0.00007(0.051)	0.04(0.051)	0.031(0.103)	-	-	−0.009(0.149)	−0.009(0.15)
ICV	2	0.308(0.234)	0.32(0.092)	0.199(0.092)	0.278(0.172)	0.326(0.126)	0.326(0.134)	0.331(0.277)	0.331(0.275)

7 Discussion and Concluding Remarks

In this paper, we discussed group variable selection for interval-censored data, a general type of incomplete failure time data. For this problem, a sieve-penalized maximum likelihood procedure was developed under the Cox or proportional hazards model, and the proposed method can simultaneously select active or important groups and estimate covariate effects. The method allows for the use of any penalty function, although the oracle property was established only for the BAR penalty, and it can be regarded as a generalization of the method given in Zhao et al. (2020) for individual variable selection. An extensive simulation study was carried out and indicated that the proposed procedure works well in practical situations. An application to an Alzheimer's disease (AD) study was also provided.

Note that in the proposed method, Bernstein polynomials were used to approximate the unknown cumulative baseline hazard function in order to simplify the optimization problem involved. As mentioned above, one may instead use other approximations such as piecewise constant functions or spline functions. The main reason for choosing Bernstein polynomials is that they have some nice properties, including continuity and differentiability, that result in a simpler estimation procedure. For the selection of the tuning parameter, we suggested using the BIC, and one may instead apply other criteria such as C-fold cross-validation or generalized C-fold cross-validation. However, these criteria tend to be conservative for group selection, that is, they tend to select many unimportant groups. Also, the algorithm based on the BIC is more efficient because it does not need to partition the dataset into different parts.
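
As an illustration of the sieve approximation mentioned above, the following sketch evaluates a Bernstein polynomial with nondecreasing coefficients. Such a polynomial is itself nondecreasing, which makes it a natural candidate for a cumulative hazard. This is only a sketch under stated assumptions: the interval [u, v], the degree m, the function names, and the cumulative-sum-of-exponentials reparameterization used to enforce monotonicity are illustrative choices and are not taken from the implementation used in this paper.

```python
import math

import numpy as np


def bernstein_basis(t, m, u, v):
    """Return the (len(t), m + 1) matrix of degree-m Bernstein basis
    polynomials on [u, v], evaluated at the points t."""
    t = np.asarray(t, dtype=float)
    s = (t - u) / (v - u)                      # rescale [u, v] to [0, 1]
    k = np.arange(m + 1)
    binom = np.array([math.comb(m, j) for j in k], dtype=float)
    return binom * s[:, None] ** k * (1.0 - s[:, None]) ** (m - k)


def cumulative_hazard(t, raw, u, v):
    """Evaluate a monotone Bernstein approximation at t.

    raw holds m + 1 unconstrained parameters; the coefficients
    phi_k = sum_{j <= k} exp(raw_j) are positive and nondecreasing,
    which makes the resulting polynomial nondecreasing on [u, v].
    """
    phi = np.cumsum(np.exp(raw))
    basis = bernstein_basis(t, len(raw) - 1, u, v)
    return basis @ phi


# Illustrative values only (not estimates from the paper).
grid = np.linspace(0.0, 3.0, 50)
raw_params = np.array([-1.0, 0.2, -0.5, 0.3, 0.0])
hazard = cumulative_hazard(grid, raw_params, 0.0, 3.0)
print(hazard[:3])
```

Working with the unconstrained parameters `raw` rather than the ordered coefficients lets a generic optimizer be applied without inequality constraints, which is one way such a sieve can simplify the optimization.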

It is worth pointing out that the preceding sections focused on the Cox or proportional hazards model, and it is well known that this model may sometimes not fit the data well, or that one may prefer other models such as the additive hazards model or the linear transformation model. In particular, the latter is more flexible and includes the Cox model as a special case. Although the idea discussed above still applies to these situations, considerably more work is needed to generalize the proposed method to other models. The proposed method also assumes that the interval censoring is independent or noninformative, meaning that the observation process contains no relevant information about the failure time of interest. This may not always hold and, as discussed in the literature (Sun, 2006), an analysis that ignores informative censoring could lead to biased results.
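
To make the remark about the linear transformation model concrete, the sketch below uses the logarithmic transformation family G_r(x) = log(1 + r x)/r, with G_0(x) = x, which is a common choice in the transformation-model literature. This family is an assumption made here for illustration and is not specified in this paper; the function names and the baseline Λ0(t) = t are likewise hypothetical. Under the model Λ(t | Z) = G_r(Λ0(t) exp(β'Z)), taking r = 0 gives the Cox model and r = 1 gives the proportional odds model.

```python
import numpy as np


def transform(x, r):
    """Logarithmic transformation G_r(x) = log(1 + r * x) / r, with G_0(x) = x."""
    x = np.asarray(x, dtype=float)
    if r == 0:
        return x
    return np.log1p(r * x) / r


def survival(t, z, beta, r, baseline=lambda t: np.asarray(t, dtype=float)):
    """Survival function S(t | z) = exp(-G_r(Lambda0(t) * exp(beta'z))).

    baseline is the cumulative baseline hazard Lambda0; the default
    Lambda0(t) = t is a placeholder chosen only for this example.
    """
    risk = np.exp(np.dot(z, beta))
    return np.exp(-transform(baseline(t) * risk, r))


# Illustrative values only.
t = np.array([0.5, 1.0, 2.0])
z = np.array([1.0, 0.0])
beta = np.array([0.3, -0.6])

cox = survival(t, z, beta, r=0)          # Cox / proportional hazards
prop_odds = survival(t, z, beta, r=1)    # proportional odds
print(cox, prop_odds)
```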

Data Availability Statement

The data (ADNI, 2004) that support the findings in this paper are available at the website of The Alzheimer's Disease Neuroimaging Initiative (https://adni.loni.usc.edu/data-samples/access-data/).

Acknowledgments

The authors wish to thank the co-editor, the associate editor, and two anonymous reviewers for their many insightful comments and suggestions that greatly improved the paper. The research of Dr. Zhao was partially supported by the National Natural Science Foundation of China (grant number 12171483).

References

ADNI. (2004) The Alzheimer's Disease Neuroimaging Initiative. Available at: https://adni.loni.usc.edu/data-samples/access-data/.

Dai, L., Chen, K., Sun, Z., Liu, Z. & Li, G. (2018) Broken adaptive ridge regression and its asymptotic properties. Journal of Multivariate Analysis, 168, 334–351.

Dicker, L., Huang, B. & Lin, X. (2013) Variable selection and estimation with the seamless-L0 penalty. Statistica Sinica, 23(2), 929–962.

Fan, J. & Li, R. (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360.

Finkelstein, D.M. (1986) A proportional hazards model for interval-censored failure time data. Biometrics, 42(4), 845–854.

Huang, J., Breheny, P. & Ma, S. (2012) A selective review of group selection in high-dimensional models. Statistical Science, 27(4), 481–499.

Huang, J., Liu, L., Liu, Y. & Zhao, X. (2014) Group selection in the Cox model with a diverging number of covariates. Statistica Sinica, 24(4), 1787–1810.

Huang, J. & Rossini, A.J. (1997) Sieve estimation for the proportional odds failure-time regression model with interval censoring. Journal of the American Statistical Association, 92(4), 960–967.

Jewell, N.P. & Laan, M.V.D. (2004) Case control current status data. Biometrika, 91(3), 529–541.

Kalbfleisch, J.D. & Prentice, R.L. (2002) The statistical analysis of failure time data. Wiley.

Kim, J., Sohn, I., Jung, S.-H., Kim, S. & Park, C. (2012) Analysis of survival data with group lasso. Communications in Statistics - Simulation and Computation, 41(9), 1593–1605.

Li, S., Wu, Q. & Sun, J. (2020) Penalized estimation of semiparametric transformation models with interval-censored data and application to Alzheimer's disease. Statistical Methods in Medical Research, 29(8), 2151–2166.

Lv, J. & Fan, Y. (2009) A unified approach to model selection and sparse recovery using regularized least squares. The Annals of Statistics, 37(6A), 3498–3528.

Sun, J. (2006) The statistical analysis of interval-censored failure time data. Springer.

Tibshirani, R. (1996) Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267–288.

Yuan, M. & Lin, Y. (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(1), 49–67.

Zeng, D., Mao, L. & Lin, D. (2016) Maximum likelihood estimation for semi-parametric transformation models with interval-censored data. Biometrika, 103(2), 253–271.

Zhang, Y., Hua, L. & Huang, J. (2010) A spline-based semiparametric maximum likelihood estimation method for the Cox model with interval-censored data. Scandinavian Journal of Statistics, 37(4), 338–354.

Zhao, H., Wu, Q., Li, G. & Sun, J. (2020) Simultaneous estimation and variable selection for interval-censored data with broken adaptive ridge regression. Journal of the American Statistical Association, 115(529), 204–216.

Zhou, Q., Hu, T. & Sun, J. (2017) A sieve semiparametric maximum likelihood approach for regression analysis of bivariate interval-censored failure time data. Journal of the American Statistical Association, 112(518), 664–672.

Zou, H. (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429.