Height prediction of individuals with osteogenesis imperfecta by machine learning

Orphanet J Rare Dis. 2024 Nov 9;19(1):420. doi: 10.1186/s13023-024-03433-1.

Abstract

Background: Osteogenesis imperfecta (OI) is a genetic disorder characterized by low bone mass, bone fragility and short stature. Growth patterns across the different types of OI remain poorly characterized, and height prediction in individuals with OI has not been adequately addressed. In this study, we described the growth patterns and predicted the height of individuals with OI using multiple machine learning (ML) models. Accurate height prediction enables effective monitoring and facilitates the development of personalized intervention plans for managing OI.

Method: This study included cross-sectional data for 323 participants with OI. The median height Z-scores for OI types I, III and IV were -0.62 (-5.93 to 3.24), -3.97 (-10.44 to -0.02) and -1.64 (-6.67 to 2.44), respectively. Based on these cross-sectional data, height curves were plotted and compared across genders and OI types. Subsequently, feature selection techniques, specifically the filter and wrapper methods, were employed to identify predictive factors for participants' height. Finally, multiple ML models were constructed for height prediction, and the performance of each model was systematically evaluated.
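As an illustration of the filter step described above, the following is a minimal sketch, not the authors' pipeline. The abstract does not state which filter statistic was used; this sketch assumes absolute Pearson correlation with height as the score. The function names (pearson, filter_rank), the feature names and all values are hypothetical toy data, not study data. A wrapper method would instead refit a predictive model on candidate feature subsets and is not shown here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def filter_rank(features, target):
    """Rank candidate features by |r| with the target, strongest first."""
    scores = {name: abs(pearson(values, target))
              for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Made-up toy records (NOT study data): six hypothetical participants.
toy_features = {
    "age_years": [3, 6, 9, 12, 15, 18],
    "weight_z": [-1.0, -0.5, -1.5, 0.0, -0.8, 0.2],
    "noise": [5, 1, 4, 2, 6, 3],  # deliberately uninformative column
}
toy_height_cm = [88, 105, 118, 138, 150, 160]

ranking = filter_rank(toy_features, toy_height_cm)
for name, score in ranking:
    print(f"{name}: |r| = {score:.3f}")
```

In this toy example the age column ranks first and the uninformative column last. A filter score of this kind evaluates each feature independently of any model, which is what distinguishes it from a wrapper approach.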

Results: The analysis of height curves revealed that males with OI are significantly taller than females with OI from the age of 14 (p = 0.045), individuals with OI type III are significantly shorter than those with OI types I and IV from the age of 3 (p = 0.006), and those with OI type IV are significantly shorter than those with OI type I from the age of 10 (p = 0.028). The filter and wrapper methods identified gender (p = 0.001), age (p < 0.001), Sillence type (p = 0.007), weight Z-score (p < 0.001) and aBMD Z-score (p = 0.021) as significant predictive factors for height. The best-performing predictive models were the gradient boosting classifier (GB) (bias = 5.783, accuracy = 92.59%, R² = 0.828), random forest (RF) (bias = 6.155, accuracy = 90.12%, R² = 0.788), ensemble machine learning (EML) (bias = 6.250, accuracy = 91.36%, R² = 0.825) and deep neural networks (DNNs) (bias = 6.223, accuracy = 90.12%, R² = 0.821).
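For readers unfamiliar with the coefficient of determination reported above, the following is a minimal sketch of its standard definition, 1 - SS_res / SS_tot, applied to made-up heights. The abstract does not define how "bias" or "accuracy" were computed, so neither is reproduced here; the mean absolute error is shown only as a generic error measure and is an assumption, not the authors' metric. All names and values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

# Made-up heights in cm (NOT study data).
observed_cm = [100, 110, 120, 130, 140]
predicted_cm = [102, 108, 121, 129, 143]

r2 = r_squared(observed_cm, predicted_cm)
mae = mean_absolute_error(observed_cm, predicted_cm)
print(f"R^2 = {r2:.3f}, MAE = {mae:.1f} cm")
```

An R² near 1 means the predictions account for most of the variance in observed height. The value depends on the spread of heights in the evaluation sample, so it is best compared across models evaluated on the same data.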

Conclusion: This study analyzed a large cohort of individuals with OI and provided detailed height patterns across genders and OI types, which are crucial for assessing overall growth. Gender, age, Sillence type, weight Z-score and aBMD Z-score were identified as predictive factors for height. The GB, RF, EML and DNN models showed higher accuracy in predicting the height of individuals with OI. These results allow guardians and physicians to monitor height in a timely manner and facilitate the creation of personalized intervention schedules tailored to the needs of individuals with OI.

Keywords: Deep neural networks; Ensemble learning; Growth curves; Machine learning; Osteogenesis imperfecta; Prediction.

MeSH terms

  • Adolescent
  • Adult
  • Body Height* / physiology
  • Child
  • Child, Preschool
  • Cross-Sectional Studies
  • Female
  • Humans
  • Infant
  • Machine Learning*
  • Male
  • Osteogenesis Imperfecta* / pathology
  • Young Adult