Document Type : Research Paper
Authors
1 Department of Information and Communication Technology Management, Qeshm Branch, Islamic Azad University, Qeshm, Iran
2 Department of Industrial Management, Tehran Branch, Islamic Azad University, Tehran, Iran
3 Department of Industrial Management, Science and Research Branch, Islamic Azad University, Tehran, Iran
Abstract
The spread and pervasiveness of the Internet mean that "big data" is created daily, and processing this volume of data requires a system with high processing power. Producing and collecting data from a wide range of equipment and tools leads to large-scale databases, and managing large, unstructured databases always poses challenges. This study presents a model for increasing the clustering accuracy of big data using a fuzzy clustering system based on data mining, implemented in the MATLAB programming environment. First, the importance of each variable is determined with decision tree models in SPSS Modeler; these results are then used to define fuzzy rules and to build a fuzzy inference system in MATLAB. The data mining techniques C&R Tree, CHAID and C5.0 are used to study how the fuzzy c-means (FCM) method can be developed to increase clustering accuracy on high-volume data. The related factors (data preparation indicators, data type, data quality, data dimensions, data volume and number of clusters) were evaluated as inputs, and the clustering accuracy index was evaluated as the output. With the help of these results, the rules of the fuzzy inference system were determined, and the membership functions of the decision model show what effect each input index has on the output index.
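For readers unfamiliar with FCM, the following is a minimal sketch of the standard fuzzy c-means iteration in plain Python. It is not the model developed in this study, which is built in MATLAB and SPSS Modeler. The function name `fcm`, its parameters, and the defaults (fuzzifier `m = 2`, random initial memberships) are assumptions chosen for illustration only.

```python
import random


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Illustrative fuzzy c-means on a list of equal-length numeric tuples.

    Returns (centers, u), where u[i][j] is the membership of point i in
    cluster j. All names and defaults here are hypothetical.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Cluster centres: means of the points weighted by membership**m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Membership update: u_ij is proportional to d_ij^(-2/(m-1)).
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][d] - centers[j][d]) ** 2 for d in range(dim))
                  for j in range(c)]
            if min(d2) == 0.0:
                # The point coincides with a centre: full membership there.
                new = [1.0 if x == 0.0 else 0.0 for x in d2]
            else:
                new = [x ** (-1.0 / (m - 1.0)) for x in d2]
            total = sum(new)
            new = [v / total for v in new]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break

    return centers, u
```

Calling `fcm(points, 2)` on a small list of 2-D tuples returns the two cluster centres and, for each point, a pair of memberships that sum to 1. Unlike hard clustering, each point belongs to every cluster to some degree, which is what makes the result usable as input to fuzzy rules.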
Keywords