AHP based feature ranking model using string similarity for resolving name ambiguity

Document Type : Research Paper

Authors

Department of Computer Applications, PSG College of Technology, Coimbatore, Tamilnadu, India

Abstract

In Natural Language Processing research, the name ambiguity problem remains unresolved when retrieving author-name information from bibliographic citations in a digital library system. This paper investigates a feature ranking model that resolves the ambiguity problem using the Analytical Hierarchy Process (AHP). The AHP procedure prioritizes certain criteria and assigns weights to them, forming a judgment matrix called the pairwise comparison matrix. The AHP analysis is used to determine the preprocessing level, with string similarity measured by the Levenshtein distance. Finally, AHP identifies the co-author criterion as having the highest priority among the criteria taken from the digital library data set.
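The two building blocks named in the abstract can be illustrated with a minimal sketch. It assumes three hypothetical criteria (co-author, title, venue) and an illustrative pairwise comparison matrix; neither the matrix values nor the criteria other than co-author come from the paper. The weights are derived with the row geometric-mean approximation, which is one common way to obtain AHP priorities and may differ from the procedure the authors used. The function names `levenshtein`, `ahp_weights`, and `consistency_ratio` are likewise assumptions made for this example.

```python
from math import prod


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def ahp_weights(matrix):
    """Priority weights via the row geometric-mean approximation."""
    n = len(matrix)
    geo_means = [prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Saaty's consistency ratio; values below 0.1 are usually accepted."""
    n = len(matrix)
    # Estimate the principal eigenvalue from A*w / w.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    random_index = {3: 0.58, 4: 0.90, 5: 1.12}  # Saaty's RI for small n
    return ci / random_index[n]


# Hypothetical criteria and illustrative judgments (not from the paper).
criteria = ["co-author", "title", "venue"]
pairwise = [
    [1.0,     3.0,     5.0],
    [1 / 3.0, 1.0,     3.0],
    [1 / 5.0, 1 / 3.0, 1.0],
]

weights = ahp_weights(pairwise)
cr = consistency_ratio(pairwise, weights)
for name, w in zip(criteria, weights):
    print(f"{name}: {w:.3f}")
print(f"consistency ratio: {cr:.3f}")
print("distance:", levenshtein("J. Smith", "John Smith"))
```

With these illustrative judgments the co-author criterion receives the largest weight, which mirrors the ranking reported in the abstract but does not reproduce the paper's actual matrix or results.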

Volume 12, Special Issue
December 2021
Pages 1745-1751
  • Received: 10 October 2021
  • Revised: 02 November 2021
  • Accepted: 24 November 2021