Computer-based plagiarism detection techniques: A comparative study

Document Type : Research Paper

Authors

1 Research and Development Department, Ministry of Higher Education and Scientific Research, Iraq

2 Department of Computer Science, College of Science, University of Baghdad, Baghdad, Iraq

Abstract

Plagiarism is a growing problem in academia. It is made worse by the ease with which a wide range of resources can be found on the internet and then copied and pasted. It is academic theft, since the perpetrator has "taken" the work of others and presented it as his or her own. Manual detection of plagiarism is difficult, imprecise, and time-consuming, because it is hard for any person to compare a given work against all existing material. Plagiarism is a major problem in higher education and can occur in any subject area. Plagiarism detection has been studied in many scientific articles, and detection methods have been developed using the Plagiarism analysis, Authorship identification, and Near-duplicate detection (PAN) datasets from 2009 to 2011. Researchers first addressed verbatim plagiarism, which is simply copying and pasting. They then moved on to smart plagiarism, which is harder to detect because it may involve altering the text, taking ideas from other academics, or translating into a language that is more difficult to handle. Other studies have found that plagiarized scientific content can be obscured by swapping words, removing or adding material, or reordering or otherwise changing the original articles. This article presents a comparative study of plagiarism detection techniques.

Volume 13, Issue 1
March 2022
Pages 3599-3611
  • Receive Date: 07 November 2021
  • Revise Date: 02 December 2021
  • Accept Date: 05 January 2022