Decreasing speed and stride length over successive races have been shown to be associated with musculoskeletal injury (MSI) in racehorses, demonstrating the potential for early detection of MSI through longitudinal monitoring of changes in stride characteristics. This study investigated a machine learning (ML) approach for early detection of MSI, enforced rest, and retirement events using the same horse-level, race-level, and stride characteristic data across all race sectionals. A CatBoost model using features from the two races prior to an event had the highest classification performance (sensitivity of 0.00, 0.58, and 0.76 for MSI, enforced rest, and retirement, respectively, and a balanced accuracy of 0.44), with scores decreasing for models whose windows included additional races further from the event. Feature importance analysis of the ML models showed that retirement was predicted by older age, poor performance, and a longer racing career; enforced rest was predicted by younger age and better performance, but was less likely to occur when stride length was increasing; and MSI was predicted by an increased number of starters, greater variation in speed, and a lower percentage of career time at rest. The relatively low classification performance highlights the difficulty of discerning MSI from alternative events using ML. Improved data recording through more thorough assessment and annotation of adverse events is expected to improve the predictability of MSI.
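For readers less familiar with the reported metrics, the following is a minimal sketch of how per-class sensitivity and balanced accuracy relate. It assumes balanced accuracy is the unweighted mean of per-class sensitivity (the common multiclass definition) and that the three events named above are the only classes; the abstract confirms neither. The function names and the toy labels are hypothetical and are not the study's data.

```python
from collections import Counter


def per_class_sensitivity(y_true, y_pred, labels):
    """Sensitivity (recall) per class: correct predictions / actual members."""
    actual = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / actual[c] for c in labels if actual[c] > 0}


def balanced_accuracy(sensitivities):
    """Unweighted mean of per-class sensitivities (assumed definition)."""
    return sum(sensitivities) / len(sensitivities)


# Hypothetical toy labels, for illustration only (not the study's data).
labels = ["MSI", "rest", "retirement"]
y_true = ["MSI", "MSI", "rest", "rest", "rest", "rest", "retirement", "retirement"]
y_pred = ["rest", "retirement", "rest", "rest", "MSI", "retirement", "retirement", "retirement"]

toy = per_class_sensitivity(y_true, y_pred, labels)
toy_ba = balanced_accuracy(list(toy.values()))

# Sensitivities as reported in the abstract (rounded to two decimals).
reported = {"MSI": 0.00, "enforced rest": 0.58, "retirement": 0.76}
reported_mean = balanced_accuracy(list(reported.values()))
```

Under these assumptions, the mean of the three rounded sensitivities is about 0.447, which agrees with the reported balanced accuracy of 0.44 to within the rounding of the individual sensitivities. A sensitivity of 0.00 for MSI means that no MSI event was classified correctly, so the balanced accuracy is carried entirely by the other two classes.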
© 2024. The Author(s).