Recommendation engines-neural embedding to graph-based: Techniques and evaluations

Document Type : Research Paper


1 Department of Computer Science and Engineering, Jamia Hamdard, New Delhi, India

2 Faculty of Computer Science and Mathematics, University of Kufa, Iraq


The goal of any for-profit organization is to increase its revenue, and providing useful suggestions to its customer base is one way to do so. To this end, companies such as Netflix and Amazon invest heavily in research on their recommendation systems, which present users with the choices they are most likely to click on. The purpose of this paper is to provide a holistic view of the types of recommendation engines, how they are implemented and scaled, and how they can provide a basis for revenue generation. The focus is on implementing a recommendation engine on PySpark using the ALS (Alternating Least Squares) method. In addition, Neo4j and the Cypher query language are explored for implementing recommendations on a graph database and for analyzing how heterogeneous information can be leveraged to tackle the well-known cold-start problem in recommender engines. The dataset used for analysis is the GroupLens MovieLens 100K dataset, and the algorithm is implemented to best fit this dataset. Further, an in-depth comparison of several techniques is carried out on the basis of different metrics, hyper-parameter selection and the number of epochs used. The claims are supported by evaluating the performance of the model across different use cases, thus aiding predictive analytics of movies according to customer interest, with the help of visualization tools.
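As a rough illustration of the ALS idea named above, the following is a minimal sketch in plain NumPy, not the PySpark implementation the paper refers to. The function names `als` and `rmse`, the hyper-parameter values, and the toy rating matrix are all assumptions made for this example; none of them come from the paper or from the MovieLens data. The sketch alternates between solving a small regularized least-squares problem for each user with the item factors held fixed, and for each item with the user factors held fixed.

```python
import numpy as np

def als(ratings, mask, rank=2, reg=0.1, iters=20, seed=0):
    """Approximate ratings by U @ V.T, fitting only entries where mask is True."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    U = rng.normal(scale=0.1, size=(n_users, rank))  # user factors
    V = rng.normal(scale=0.1, size=(n_items, rank))  # item factors
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        # Hold V fixed: each user's factors solve a small ridge regression.
        for u in range(n_users):
            obs = mask[u]
            if obs.any():
                Vo = V[obs]
                U[u] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ ratings[u, obs])
        # Hold U fixed: each item's factors solve the mirrored problem.
        for i in range(n_items):
            obs = mask[:, i]
            if obs.any():
                Uo = U[obs]
                V[i] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ ratings[obs, i])
    return U, V

def rmse(ratings, mask, U, V):
    """Root-mean-square error over the observed entries only."""
    err = (ratings - U @ V.T)[mask]
    return float(np.sqrt(np.mean(err ** 2)))

# Hypothetical toy data: 4 users x 5 items, 0 marks a missing rating.
R = np.array([[5, 4, 0, 1, 0],
              [4, 0, 0, 1, 1],
              [1, 1, 0, 5, 4],
              [0, 1, 5, 4, 0]], dtype=float)
M = R > 0

U, V = als(R, M)
train_rmse = rmse(R, M, U, V)
predicted = U @ V.T  # filled-in matrix, including the missing cells
```

The error here is measured on the same entries used for fitting, so it only shows that the alternating updates reduce the reconstruction error; a held-out split would be needed for any meaningful evaluation.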


Volume 13, Issue 1
March 2022
Pages 2411-2423
  • Receive Date: 18 September 2021
  • Revise Date: 09 October 2021
  • Accept Date: 19 November 2021
  • First Publish Date: 11 December 2021