Detecting financial fraud using machine learning techniques

Document Type: Research Paper


1 Department of Accounting, Bonab Branch, Islamic Azad University, Bonab, Iran

2 Department of Accounting, Sofian Branch, Islamic Azad University, Sofian, Iran


Financial fraud detection is a challenging problem for four primary reasons: fraudulent behavior changes constantly, there is no mechanism for tracking fraud data, the available detection techniques (such as machine learning algorithms) have specific limitations, and financial fraud datasets are highly dispersed. Training detection algorithms is therefore difficult. The current study used machine learning techniques, including support vector machine regression and boosted regression trees, to detect financial fraud in the Iranian stock market. The findings indicated that the boosted regression tree model had the lowest RMSE. With regard to sensitivity, the boosted regression tree model also had the highest value, in the sense that it correctly detected the absence of financial fraud in the Tehran Stock Exchange market. The boosted regression tree model also had the highest kappa coefficient, indicating that it performed better than the other models used in the research.
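The three evaluation measures named in the abstract (RMSE, sensitivity, and the kappa coefficient) can be illustrated with a minimal sketch. It uses standard textbook definitions and is not the authors' implementation. The function names, the toy labels and scores, the 0.5 cut-off, and the choice of which class counts as "positive" are all assumptions made for illustration. Because the abstract describes sensitivity in terms of correctly detecting the absence of fraud, the positive class is left as a parameter.

```python
import math


def rmse(y_true, y_score):
    """Root mean squared error between true labels and model scores."""
    n = len(y_true)
    return math.sqrt(sum((t - s) ** 2 for t, s in zip(y_true, y_score)) / n)


def sensitivity(y_true, y_pred, positive=1):
    """Share of cases of the `positive` class that the model labels correctly."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    p_observed = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    p_chance = sum((y_true.count(l) / n) * (y_pred.count(l) / n) for l in labels)
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical toy data: 1 = fraud, 0 = no fraud (not taken from the study).
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.4, 0.3, 0.1, 0.6, 0.8, 0.0]
# Assumed 0.5 cut-off to turn regression scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

print("RMSE:", rmse(y_true, y_score))
print("Sensitivity (fraud as positive):", sensitivity(y_true, y_pred, positive=1))
print("Sensitivity (no fraud as positive):", sensitivity(y_true, y_pred, positive=0))
print("Kappa:", cohen_kappa(y_true, y_pred))
```

A lower RMSE and a higher sensitivity and kappa indicate a better model, which is the basis for the comparison reported above.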


Volume 15, Issue 1
January 2024
Pages 199-214
  • Receive Date: 19 April 2022
  • Revise Date: 30 May 2022
  • Accept Date: 22 July 2022