Prediction of football match results by using artificial intelligence-based methods and proposal of hybrid methods

Document Type : Research Paper


1 Department of Actuarial Science, Selcuk University, 42250 Konya, Turkey

2 Department of Statistics, Selcuk University, 42250 Konya, Turkey


In this study, hybrid classification methods that combine clustering and classification algorithms are proposed and used to predict the results of future football matches. Data on 6396 football matches played in European leagues were collected with a web scraping tool developed for this purpose. Unlike similar studies, the data include fans’ opinions gathered from social media platforms in addition to statistical information about the teams and players. The raw data are transformed into suitable datasets by software developed by the authors, and the processed data are used in the classification analysis. The match result (the dependent variable) is defined in three ways, denoted MR-1, MR-2 and MR-3: MR-1 has three classes (Home, Draw and Away), MR-2 has two classes (Home and Draw-Away), and MR-3 has two classes (Home-Draw and Away). The performance of the proposed hybrid methods is compared with that of classification algorithms frequently used in the literature. The hybrid methods outperform the classical classification algorithms, with prediction success rates of 65.46% for MR-1, 81.76% for MR-2 and 77.8% for MR-3.
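The three definitions of the match result variable can be illustrated with a short sketch. It assumes that the result is derived from the final score of a match; the function and dictionary names are hypothetical and are not taken from the authors' software. Only the class groupings (Home, Draw, Away, Draw-Away, Home-Draw) follow the abstract.

```python
# Illustrative sketch only: names below are hypothetical, not the authors' code.
# The class groupings for MR-1, MR-2 and MR-3 follow the abstract.

# MR-1: three classes; MR-2: Draw merged with Away; MR-3: Draw merged with Home.
MR1 = {"H": "Home", "D": "Draw", "A": "Away"}
MR2 = {"H": "Home", "D": "Draw-Away", "A": "Draw-Away"}
MR3 = {"H": "Home-Draw", "D": "Home-Draw", "A": "Away"}


def outcome(home_goals: int, away_goals: int) -> str:
    """Return 'H', 'D' or 'A' from the final score (assumed input format)."""
    if home_goals > away_goals:
        return "H"
    if home_goals < away_goals:
        return "A"
    return "D"


def encode_targets(home_goals: int, away_goals: int) -> dict:
    """Build the three dependent-variable encodings for one match."""
    o = outcome(home_goals, away_goals)
    return {"MR-1": MR1[o], "MR-2": MR2[o], "MR-3": MR3[o]}


if __name__ == "__main__":
    # A home win, a draw and an away win, encoded under all three schemes.
    for score in [(2, 1), (1, 1), (0, 3)]:
        print(score, encode_targets(*score))
```

Under MR-2 and MR-3 the draw is merged with one of the other outcomes, so each becomes a binary classification problem; this may partly explain why the reported success rates for MR-2 and MR-3 are higher than for the three-class MR-1.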


Volume 14, Issue 1
January 2023
Pages 2939-2969
  • Receive Date: 07 April 2022
  • Revise Date: 17 November 2022
  • Accept Date: 21 January 2023