Identifying animal behaviours from accelerometers: Improving predictive accuracy of machine learning by refining the variables selected, data frequency, and sample duration

Ecol Evol. 2024 May 16;14(5):e11380. doi: 10.1002/ece3.11380. eCollection 2024 May.

Abstract

Observing animals in the wild often poses extreme challenges, but animal-borne accelerometers are increasingly revealing unobservable behaviours. Automated machine learning streamlines behaviour identification from the substantial datasets generated during multi-animal, long-term studies; however, the accuracy of such models depends on the qualities of the training data. We examined how data processing influenced the predictive accuracy of random forest (RF) models, leveraging the easily observed domestic cat (Felis catus) as a model organism for terrestrial mammalian behaviours. Nine indoor domestic cats were equipped with collar-mounted tri-axial accelerometers, and behaviours were recorded alongside video footage. From this calibrated data, eight datasets were derived with (i) additional descriptive variables, (ii) altered frequencies of acceleration data (40 Hz vs. a mean over 1 s) and (iii) standardised durations of different behaviours. These training datasets were used to generate RF models that were validated against calibrated cat behaviours before identifying the behaviours of five free-ranging tag-equipped cats. These predictions were compared to those identified manually to validate the accuracy of the RF models for free-ranging animal behaviours. RF models accurately predicted the behaviours of indoor domestic cats (F-measure up to 0.96) with discernible improvements observed with post-data-collection processing. Additional variables, standardised durations of behaviours and higher recording frequencies improved model accuracy. However, prediction accuracy varied with different behaviours, where high-frequency models excelled in identifying fast-paced behaviours (e.g. locomotion), whereas lower-frequency models (1 Hz) more accurately identified slower, aperiodic behaviours such as grooming and feeding, particularly when examining free-ranging cat behaviours. 
While RF modelling offered a robust means of behaviour identification from accelerometer data, field validation was important for confirming model accuracy in free-ranging individuals. Future studies may benefit from employing similar data processing methods that enhance RF behaviour identification accuracy, with extensive advantages for investigations into the ecology, welfare and management of wild animals.
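
To make two of the quantities named in the abstract concrete, the following is a minimal sketch, not the authors' pipeline. It assumes tri-axial samples stored as (x, y, z) tuples recorded at 40 Hz, and it assumes the reported F-measure is the standard harmonic mean of precision and recall for a single behaviour. The function names `mean_per_second` and `f_measure` and the example labels are hypothetical.

```python
from statistics import mean

SAMPLE_RATE_HZ = 40  # recording frequency stated in the abstract


def mean_per_second(samples, rate_hz=SAMPLE_RATE_HZ):
    """Collapse (x, y, z) samples into one mean (x, y, z) per full second."""
    n_seconds = len(samples) // rate_hz  # any trailing partial second is dropped
    out = []
    for s in range(n_seconds):
        window = samples[s * rate_hz:(s + 1) * rate_hz]
        # zip(*window) regroups the window into three per-axis sequences
        out.append(tuple(mean(axis) for axis in zip(*window)))
    return out


def f_measure(true_labels, predicted_labels, behaviour):
    """F-measure for one behaviour (assumed: harmonic mean of precision and recall)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == behaviour and p == behaviour for t, p in pairs)
    fp = sum(t != behaviour and p == behaviour for t, p in pairs)
    fn = sum(t == behaviour and p != behaviour for t, p in pairs)
    if tp == 0:
        return 0.0  # no correct predictions for this behaviour
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For example, `mean_per_second` applied to 85 samples at 40 Hz returns two averaged rows and ignores the last five samples. `f_measure` would be called once per behaviour, comparing video-annotated labels with model predictions.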

Keywords: activity budget; biologging; domestic cat; machine learning; predictive accuracy; random forest model.

Associated data

  • Dryad/10.5061/dryad.q2bvq83sx