Diving into a pool of data: Using principal component analysis to optimize performance prediction in women's short-course swimming

J Sports Sci. 2024 Mar;42(6):519-526. doi: 10.1080/02640414.2024.2346670. Epub 2024 May 5.

Abstract

This study aimed to optimise performance prediction in short-course swimming through Principal Component Analyses (PCA) and multiple regression. All women's freestyle races at the European Short-Course Swimming Championships were analysed. Established performance metrics were obtained, including start, free-swimming, and turn performance metrics. PCA were conducted to reduce redundant variables, and a multiple linear regression was performed where the criterion was swimming time. A practical tool, the Potential Predictor, was developed from regression equations to facilitate performance prediction. Bland and Altman analyses with 95% limits of agreement (95% LOA) were used to assess agreement between predicted and actual swimming performance. There was very strong agreement between predicted and actual swimming performance. The mean bias for all race distances was less than 0.1 s, with wider LOAs for the 800 m (95% LOA -7.6 to +7.7 s) but tighter LOAs for the other races (95% LOAs -0.6 to +0.6 s). Free-Swimming Speed (FSS) and turn performance were identified as Key Performance Indicators (KPIs) in the longer distance races (200 m, 400 m, 800 m). Start performance emerged as a KPI in sprint races (50 m and 100 m). The successful implementation of PCA and multiple regression provides coaches with a valuable tool to uncover individual potential and empowers data-driven decision-making in athlete training.
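
The agreement statistics reported above (mean bias and 95% LOA) can be illustrated with a minimal sketch of the standard Bland and Altman calculation, in which the limits are the mean difference plus or minus 1.96 standard deviations of the differences. The function name `bland_altman` and the race times below are hypothetical and chosen only for illustration; they are not data or code from the study, and the abstract does not state which exact variant of the method the authors used.

```python
from statistics import mean, stdev


def bland_altman(predicted, actual, z=1.96):
    """Return (bias, lower LOA, upper LOA) for paired measurements.

    Assumes the conventional definition: bias is the mean of the paired
    differences, and the limits of agreement are bias +/- z * SD of the
    differences (z = 1.96 for 95% limits).
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(predicted) < 2:
        raise ValueError("at least two pairs are needed to estimate the SD")
    diffs = [p - a for p, a in zip(predicted, actual)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


# Hypothetical race times in seconds (illustrative values only).
predicted = [24.10, 24.55, 25.02, 25.40, 25.91]
actual = [24.00, 24.60, 25.00, 25.50, 25.85]

bias, lower, upper = bland_altman(predicted, actual)
print(f"bias = {bias:.3f} s, 95% LOA = {lower:.3f} to {upper:.3f} s")
```

A bias close to zero with narrow limits indicates that predictions track actual times closely; wider limits, as reported for the 800 m, indicate larger individual prediction errors even when the average error is small.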

Keywords: Athlete training; data-driven insights; key performance indicators; performance metrics.

MeSH terms

  • Athletic Performance* / physiology
  • Competitive Behavior / physiology
  • Female
  • Humans
  • Linear Models
  • Principal Component Analysis*
  • Swimming* / physiology