A new gated multi-scale convolutional neural network architecture for recognition of Persian handwritten texts

Khosravi, Sara; Chalechale, Abdolah

doi:10.22075/ijnaa.2023.31634.4694

Document Type : Research Paper

Authors

Department of Computer Engineering and Information Technology, Razi University, Kermanshah, Iran

Abstract

Handwriting remains popular because it is easy and many people simply enjoy it. With today's digitization and the large volume of information still held on paper, systems are needed to convert handwriting into digital form, which speeds up access to information and reduces storage space. Prior research shows that recognizing Persian handwritten text remains difficult because of the complex, irregular nature of the writing and the diversity of individual handwriting. This research introduces a novel method for recognizing handwritten text at the sentence level. Applying word recognition methods to sentences requires segmentation techniques to separate the words, but segmentation of handwritten text is unreliable because words overlap. Since Recurrent Neural Networks (RNNs) were a turning point in accurate handwriting recognition, this article removes the segmentation step and introduces a new architecture that combines an RNN with a Gated Multi-scale Convolutional Neural Network (GMCNN) to recognize handwritten sentences. On the Sadri dataset of Persian handwritten sentences, the proposed architecture achieves a character error rate of 2.99%, a word error rate of 6.67%, and a sentence error rate of 36.87%. The method was also evaluated on the IAM and Washington datasets. The results show that the proposed method outperforms other known algorithms.
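The abstract reports character, word, and sentence error rates but does not state how they were computed. The sketch below shows one common convention, offered only as an assumption: character and word error rates are the Levenshtein edit distance divided by the reference length, and the sentence error rate is the share of sentences that are not an exact match. The function names `edit_distance` and `error_rates` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from the reference
                            curr[j - 1] + 1,      # insert h into the reference
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def error_rates(references: List[str], hypotheses: List[str]) -> Dict[str, float]:
    """Return CER, WER, and SER as percentages over a set of sentences.

    Assumed convention (not taken from the paper): errors are summed over
    all sentences and divided by the total reference length.
    """
    if len(references) != len(hypotheses) or not references:
        raise ValueError("need equally sized, non-empty lists")
    char_errs = char_total = word_errs = word_total = sent_errs = 0
    for ref, hyp in zip(references, hypotheses):
        char_errs += edit_distance(ref, hyp)
        char_total += len(ref)
        ref_words, hyp_words = ref.split(), hyp.split()
        word_errs += edit_distance(ref_words, hyp_words)
        word_total += len(ref_words)
        sent_errs += int(ref != hyp)
    if char_total == 0 or word_total == 0:
        raise ValueError("references must contain text")
    return {
        "cer": 100.0 * char_errs / char_total,
        "wer": 100.0 * word_errs / word_total,
        "ser": 100.0 * sent_errs / len(references),
    }


if __name__ == "__main__":
    refs = ["the cat sat", "hello world"]
    hyps = ["the cat sit", "hello world"]
    print(error_rates(refs, hyps))
```

Under this convention a single wrong character makes the whole sentence count as an error, which is why a sentence error rate can be far higher than the character and word error rates measured on the same output.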

International Journal of Nonlinear Analysis and Applications

Volume 15, Issue 10
October 2024
Pages 143-155

History

Receive Date: 04 August 2023
Revise Date: 31 August 2023
Accept Date: 20 September 2023
