Recognizing phishing websites based on a Bayesian combiner

Document Type : Research Paper

Authors

1 Department of Electrical Engineering, Shams Higher Education Institute, Iran.

2 Department of Computer Engineering, West Tehran Branch, Islamic Azad University, Tehran, Iran.

3 Department of Electrical Engineering, Islamic Azad University, Garmsar Branch, Semnan, Iran.

4 Department of Electrical Engineering, Technical and Vocational University (TVU), Tehran, Iran.

Abstract

Phishing is a social engineering technique that deceives users into revealing confidential information such as usernames, passwords, or bank account details. Phishing attacks and online scams are among the most important challenges on the Internet today, and they cost the United States billions of dollars a year. Researchers have therefore made great efforts to identify and combat such attacks. Accordingly, the present study evaluates methods for identifying phishing websites. The research is applied in terms of its objectives and descriptive-analytical in nature. In this article, phishing detection is treated as a classification problem. From a machine learning point of view, combining the votes of different classifiers under a suitable strategy can increase classification accuracy. The proposed method employs three inherently different ensemble classifiers: bagging, AdaBoost, and rotation forest. Their outputs are combined using the stacked generalization strategy. A relatively new dataset is used to evaluate the performance of the proposed method. The dataset was added to the UCI Machine Learning Repository in 2015 and contains 30 features that appear appropriate for distinguishing phishing from non-phishing websites. The evaluation uses 10-fold cross-validation. The numerical results indicate that the proposed method is promising for detecting phishing websites: it achieves an F-score of 96.3.
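
As an illustration of the pipeline outlined in the abstract, the following Python sketch stacks three ensemble classifiers and scores the result with 10-fold cross-validation. It assumes scikit-learn, which the paper does not name as its toolkit. Because scikit-learn has no rotation forest, a random forest stands in for it; the logistic-regression meta-learner, the hyperparameters, and the synthetic 30-feature data are likewise assumptions, so the scores it prints say nothing about the reported result.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def build_stack(seed=0):
    """Stack three ensembles under a logistic-regression meta-learner (assumed)."""
    base_learners = [
        ("bagging", BaggingClassifier(n_estimators=20, random_state=seed)),
        ("adaboost", AdaBoostClassifier(n_estimators=20, random_state=seed)),
        # Stand-in: scikit-learn ships no rotation forest implementation.
        ("forest", RandomForestClassifier(n_estimators=20, random_state=seed)),
    ]
    # cv=5 trains the meta-learner on out-of-fold base predictions.
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(),
        cv=5,
    )


# Synthetic placeholder with 30 features; NOT the UCI phishing data.
X, y = make_classification(
    n_samples=500, n_features=30, n_informative=10, random_state=0
)

# 10-fold cross-validation, scored with the F-score of the positive class.
scores = cross_val_score(build_stack(), X, y, cv=10, scoring="f1")
print(f"mean F-score over {len(scores)} folds: {np.mean(scores):.3f}")
```

Training the meta-learner on out-of-fold predictions is what separates stacked generalization from simple voting: the combiner never sees base-classifier outputs for samples those classifiers were fitted on.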

Volume 12, Special Issue
December 2021
Pages 809-823
  • Receive Date: 07 July 2021
  • Accept Date: 22 September 2021