Document Type: Research Paper
Authors
1 Department of Information Technology, Kakatiya Institute of Technology and Science, Warangal-506015, Telangana
2 Department of Computer Science and Engineering, Siddhartha Institute of Technology and Sciences, Narapally, Hyderabad, Telangana 500088
3 Department of Computer Science and Engineering, PSNA College of Engineering and Technology, Kothandaraman Nagar, Dindigul-624622, Tamil Nadu, India.
Abstract
Education plays an important role in shaping students' values in society. Many kinds of features, including school-related, student-related, parent-related and teacher-related features, influence students' success in education. Identifying the best features from this large set for analyzing a student's success or failure is an important challenge for researchers and academicians. Collecting the feature information needed to prepare a student dataset is also a difficult task in student academic performance prediction. We collected a dataset from different schools that contains information on 4965 students. The dataset contains 45 features of different categories, such as school-related, student-related, parent-related and teacher-related features. Not all features are useful for predicting a student's academic performance. Data mining methods are applied in many research domains, including education, to extract hidden information from datasets. Feature selection algorithms identify the most informative features by eliminating irrelevant and redundant ones. In this work, the Relief-F Budget Tree Random Forest (RFBTRF) feature selection algorithm is used to identify the relevant features in the collected school dataset. Five machine learning models are used to evaluate the effectiveness of the feature selection algorithm. Among these models, the decision tree achieves the best accuracy for student academic performance prediction. The experimental results show that the RFBTRF algorithm identifies the most informative features, which improves the accuracy of student academic performance prediction and reduces over-fitting. The experiments began with individual features and then continued with combinations of different feature categories. We observed that prediction accuracy decreases when some feature categories are added to others.
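The idea of scoring features by relevance before training a classifier can be illustrated with a minimal sketch. The code below is not the RFBTRF algorithm used in this work; it implements the basic two-class Relief weighting on which Relief-F builds, and it runs on synthetic data rather than the collected school dataset. The function name `relief_weights`, the three-feature layout and the pass/fail labelling rule are assumptions made purely for illustration.

```python
import random


def relief_weights(X, y, seed=0):
    """Basic two-class Relief (illustrative stand-in, not RFBTRF).

    A feature gains weight when it differs at the nearest row of the other
    class (miss) and loses weight when it differs at the nearest row of the
    same class (hit). Assumes numeric features already scaled to [0, 1].
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    weights = [0.0] * d

    def dist(a, b):
        # Manhattan distance over all features
        return sum(abs(p - q) for p, q in zip(a, b))

    for _ in range(n):
        i = rng.randrange(n)  # sample a row with replacement
        hit = min((j for j in range(n) if j != i and y[j] == y[i]),
                  key=lambda j: dist(X[i], X[j]))
        miss = min((j for j in range(n) if y[j] != y[i]),
                   key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            weights[f] += (abs(X[i][f] - X[miss][f])
                           - abs(X[i][f] - X[hit][f])) / n
    return weights


# Synthetic stand-in data (hypothetical, not the paper's dataset):
# feature 0 decides the pass/fail label, features 1 and 2 are pure noise.
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(3)] for _ in range(200)]
y = [1 if row[0] > 0.5 else 0 for row in X]

weights = relief_weights(X, y)
# Feature indices ordered from most to least relevant
ranking = sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)
print(ranking, [round(w, 3) for w in weights])
```

In a pipeline of the kind described here, features whose weights fall below a chosen threshold would be dropped before the classifiers are trained; the threshold itself is a tuning choice that this sketch does not fix.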
Keywords