Applying Unsupervised Machine Learning Models to Identify Serve Performance Related Indicators in Women's Volleyball.

Miguel Á Casimiro-Artés, Raúl Hileno, Antonio Garcia de Alcaraz Serrano

Published in: Research Quarterly for Exercise and Sport (2023)

In volleyball, the effect of different factors on serve performance has usually been analyzed with traditional statistical techniques such as logistic regression or discriminant analysis. Purpose: In this study, two of the main models used in unsupervised machine learning (cluster analysis and principal component analysis) were applied with two objectives: (a) to group players according to their serve coefficient, age, height, and team ranking, and (b) to identify which variables related to the serve (type and performance), the players (role, age, and height), and the teams (ranking, match location, and quality of opposition) explained the most variance in the data over an entire women's volleyball season. Method: A total of 20,936 serves were analyzed from the 132 matches played in the 2017-2018 season of the Liga Iberdrola (the Spanish women's first division). The variables were related to the serving action (type of serve and performance), the players' traits (player role, age, and height), and the teams' characteristics (final ranking, match location, quality of opposition, and tournament). Results: Cluster analysis identified five groups of players that differed in age, serve coefficient, team ranking, and height. Principal component analysis showed that the first five components explained 72.12% of the total variance. Within these components, serve coefficient, team ranking, match location, quality of opposition, and player role each contributed more than 10%. Conclusions: These findings can help coaches improve talent selection and player development according to competition demands.
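As an illustration of the two techniques named in the abstract, the following is a minimal sketch using scikit-learn on synthetic data. It is not the authors' pipeline: the abstract does not state which clustering algorithm was used, so k-means is an assumption here, and the player count, feature distributions, and the ranking range of 1 to 12 are hypothetical placeholders. Only the number of clusters (five) and the four clustering variables are taken from the abstract.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_players = 120  # hypothetical sample size, not from the study

# Synthetic stand-ins for the four clustering variables in the abstract:
# serve coefficient, age (years), height (cm), team ranking.
# All distributions below are invented for illustration only.
X = np.column_stack([
    rng.normal(2.0, 0.4, n_players),   # serve coefficient (assumed scale)
    rng.normal(25.0, 4.0, n_players),  # age
    rng.normal(180.0, 7.0, n_players), # height
    rng.integers(1, 13, n_players),    # team ranking, assumed 1..12
])
feature_names = ["serve_coefficient", "age", "height", "team_ranking"]

# Standardize so that no variable dominates because of its units.
X_std = StandardScaler().fit_transform(X)

# Cluster analysis: k-means with five groups (algorithm choice is an assumption).
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_std)

# Principal component analysis on the same standardized variables.
pca = PCA()
pca.fit(X_std)

# Cumulative share of total variance explained by the leading components.
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Percentage contribution of each variable to each component:
# each row of components_ is a unit vector, so squared loadings sum to 1.
contributions = pca.components_ ** 2 * 100

for k, row in enumerate(contributions, start=1):
    summary = ", ".join(f"{name}: {value:.1f}%" for name, value in zip(feature_names, row))
    print(f"PC{k} (cumulative variance {cumulative[k - 1]:.2%}): {summary}")
```

With real data, the per-variable contributions computed this way are the kind of quantity that could be compared against a threshold such as the 10% mentioned in the results, although how the authors defined contribution is not specified in the abstract.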

Keyphrases