Designing a Persian question answering system based on rhetorical structure theory

Document Type: Research Paper

Authors

1 Department of Information Technology Management, Faculty of Management, Science and Research Branch, Islamic Azad University, Tehran, Iran

2 Faculty of Management, Central Tehran Branch, Islamic Azad University, Tehran, Iran

3 Department of Computer Engineering, Faculty of Computer, Saveh Branch, Islamic Azad University, Tehran, Iran

Abstract

A question answering system uses natural language processing over a database or a document set to return an accurate answer to a user’s question. Many such systems have been designed. However, few studies have addressed extracting answers to “why” and “how” questions in Persian. This scarcity is attributed to the complexity and time cost of analyzing and processing text structure beyond the boundaries of a single sentence.
The primary purpose of the present study was to analyze Persian text in order to create a set of linguistic patterns that can retrieve the relevant information from causal and explanatory sentences in a general domain. Information retrieval algorithms and a text structure recognition approach, rhetorical structure theory, were used for data and text analysis. To evaluate system performance, 70 “why” questions and 20 “how” questions were selected. Finally, the software system was built using the .NET platform, a relational database, and Persian language interpreters.
As a result, a system was designed and published that answers “why” and “how” questions in a general data domain.
The system answered 61 of the 90 evaluation questions, a recall rate of 68%. About 55% of the questions were answered correctly on the basis of inter-sentence relation markers, while the correct answers to the remaining 13% relied on rhetorical relations among the sentences.
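
The reported recall can be checked from the question counts given above. The following is a minimal sketch, not the authors' evaluation code; the function and variable names are illustrative assumptions, and recall is assumed here to mean answered questions divided by all evaluation questions.

```python
def recall(answered: int, total: int) -> float:
    """Share of evaluation questions the system answered (assumed definition)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return answered / total


# Counts taken from the abstract.
WHY_QUESTIONS = 70
HOW_QUESTIONS = 20
ANSWERED = 61

total_questions = WHY_QUESTIONS + HOW_QUESTIONS
overall_recall = recall(ANSWERED, total_questions)

# Shares reported in the abstract, in percentage points:
# inter-sentence relation markers and rhetorical relations.
MARKER_SHARE = 55
RHETORICAL_SHARE = 13

print(f"Questions: {total_questions}")
print(f"Recall: {overall_recall:.1%}")
print(f"Sum of reported shares: {MARKER_SHARE + RHETORICAL_SHARE}%")
```

The two reported shares sum to the same 68% that the overall count gives after rounding, which suggests that the percentages are expressed relative to all 90 questions.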

Keywords