Document Type : Research Paper
Authors
1 Department of Actuarial Science, Selcuk University, 42250 Konya, Turkey
2 Department of Statistics, Selcuk University, 42250 Konya, Turkey
Abstract
In this study, we propose hybrid classification methods and use them to predict the results of future football matches. The hybrid methods combine clustering and classification algorithms. Using a web scraping tool developed for this purpose, we collected data on 6396 football matches played in European leagues. Unlike similar studies, the data include fans’ opinions gathered from social media platforms in addition to statistical information about the teams and players. The raw data are transformed into suitable datasets by software developed by the authors, and the processed data are used in the classification analysis. The match result (the dependent variable) is encoded in three ways, denoted MR-1, MR-2 and MR-3: MR-1 has three classes (Home, Draw and Away), MR-2 has two classes (Home and Draw-Away), and MR-3 has two classes (Home-Draw and Away). The performance of the proposed hybrid methods is compared with that of classification algorithms frequently used in the literature. The results show that the hybrid methods are more successful than the classical classification algorithms, with prediction success rates of 65.46% for MR-1, 81.76% for MR-2, and 77.8% for MR-3.
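The three match-result encodings can be illustrated with a minimal Python sketch. It assumes that each match already carries a three-class MR-1 label and derives the two binary variants from it. The function names and the string labels are hypothetical and chosen only for illustration; the abstract does not specify how the authors' software encodes these variables.

```python
# Illustrative sketch only: names and label strings are assumptions,
# not the encoding used by the authors' software.

MR1_CLASSES = ("Home", "Draw", "Away")


def to_mr2(mr1_label: str) -> str:
    """MR-2: two classes, Home versus Draw-Away."""
    if mr1_label not in MR1_CLASSES:
        raise ValueError(f"unknown MR-1 label: {mr1_label!r}")
    return "Home" if mr1_label == "Home" else "Draw-Away"


def to_mr3(mr1_label: str) -> str:
    """MR-3: two classes, Home-Draw versus Away."""
    if mr1_label not in MR1_CLASSES:
        raise ValueError(f"unknown MR-1 label: {mr1_label!r}")
    return "Away" if mr1_label == "Away" else "Home-Draw"


def encode_match_result(mr1_label: str) -> dict:
    """Return all three dependent-variable encodings for one match."""
    return {
        "MR-1": mr1_label,
        "MR-2": to_mr2(mr1_label),
        "MR-3": to_mr3(mr1_label),
    }
```

Under these assumptions, a drawn match falls in the Draw-Away class of MR-2 and in the Home-Draw class of MR-3, so each binary variant merges the draw outcome with a different side.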
Keywords