Patch-based convolutional neural networks for automatic landmark detection of 3D facial images in clinical settings

Eur J Orthod. 2024 Dec 1;46(6):cjae056. doi: 10.1093/ejo/cjae056.

Abstract

Background: Landmark annotation of 3D facial images is crucial in clinical orthodontics and orthognathic surgery for accurate diagnosis and treatment planning. While manual landmarking has traditionally been the gold standard, it is labour-intensive and prone to variability.

Objective: This study presents a framework for automated landmark detection in 3D facial images within a clinical context, using convolutional neural networks (CNNs), and assesses the framework's accuracy against ground-truth data.

Material and methods: Initially, an in-house dataset of 408 3D facial images, each annotated with 37 landmarks by an expert, was constructed. Subsequently, a 2.5D patch-based CNN architecture was trained using this dataset to detect the same set of landmarks automatically.

Results: The developed CNN model demonstrated high accuracy, with an overall mean localization error of 0.83 ± 0.49 mm. The majority of the landmarks had low localization errors, with 95% of landmarks exhibiting a mean error of less than 1 mm across all axes. Moreover, the method achieved a high successful detection rate, with 88% of detections having an error below 1.5 mm and 94% below 2 mm.
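The metrics reported above can be illustrated with a minimal sketch. It assumes that the localization error is the Euclidean distance in millimetres between a predicted and an annotated landmark, and that the successful detection rate is the fraction of detections whose error falls below a threshold; the abstract does not define either metric explicitly. The function names and toy coordinates are hypothetical and are not taken from the study.

```python
import numpy as np


def localization_errors(pred, truth):
    """Per-landmark Euclidean distance (assumed error definition).

    pred, truth: arrays of shape (n_images, n_landmarks, 3), in mm.
    Returns an array of shape (n_images, n_landmarks).
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return np.linalg.norm(pred - truth, axis=-1)


def success_detection_rate(errors, threshold_mm):
    """Fraction of detections with error strictly below threshold_mm."""
    return float(np.mean(np.asarray(errors) < threshold_mm))


# Toy data (hypothetical): 2 images, 3 landmarks each, ground truth at origin.
truth = np.zeros((2, 3, 3))
pred = truth.copy()
pred[0, 0] = [0.3, 0.4, 0.0]  # 0.5 mm from ground truth
pred[0, 1] = [1.0, 0.0, 0.0]  # 1.0 mm
pred[0, 2] = [0.0, 0.0, 1.8]  # 1.8 mm
pred[1, 0] = [0.0, 2.5, 0.0]  # 2.5 mm; the remaining two are exact

errors = localization_errors(pred, truth)
mean_error = float(errors.mean())            # mean over all detections
std_error = float(errors.std())              # spread over all detections
sdr_1_5 = success_detection_rate(errors, 1.5)
sdr_2_0 = success_detection_rate(errors, 2.0)

# Mean absolute error per axis (x, y, z), averaged over images and landmarks.
per_axis_error = np.abs(pred - truth).mean(axis=(0, 1))
```

With real data, `pred` would hold the network's predicted coordinates and `truth` the expert annotations, and the thresholds 1.5 mm and 2 mm would correspond to those reported above.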

Conclusion: The automated method used in this study demonstrated accuracy comparable to that achieved with manual annotations within clinical settings. In addition, the proposed framework for automatic landmark localization exhibited improved accuracy over existing models in the literature. Despite these advancements, the research has limitations: it was conducted at a single centre and relied on a single annotator. Future work should address computational time challenges to achieve further enhancements. This approach has significant potential to improve the efficiency and accuracy of orthodontic and orthognathic procedures.

Keywords: 3D facial images; convolutional neural networks; landmark annotation; mean localization error; orthodontics; orthognathic surgery.

MeSH terms

  • Anatomic Landmarks* / diagnostic imaging
  • Face* / anatomy & histology
  • Face* / diagnostic imaging
  • Female
  • Humans
  • Image Processing, Computer-Assisted / methods
  • Imaging, Three-Dimensional* / methods
  • Male
  • Neural Networks, Computer*