Clustering ensemble selection: A systematic mapping study

Document Type : Review articles

Authors

1 Department of Computer Engineering, Sari Branch, Islamic Azad University, Sari, Iran

2 Department of Applied Mathematics, Sari Branch, Islamic Azad University, Sari, Iran

Abstract

Clustering has emerged as an important tool for data analysis. It can be used to produce high-quality data partitions, and combining base clusterings can yield a more robust and accurate consensus clustering. Unlike classification problems, in which the labels of data items are known in advance, unsupervised clustering produces unlabeled clusters, which may cause uncertainty in large libraries of clusterings. Therefore, not all of the clusters produced are useful for the final clustering solution. To address this challenge, clustering ensemble selection (CES), proposed in 2006 by Hadjitodorov, combines only a selected subset of the library, rather than all of its members, to obtain the final result. The goal is to select a subset of a large library to produce a smaller ensemble offering higher-quality performance. CES has been found effective in improving the quality of clustering solutions. This paper conducts a systematic mapping study (SMS) to analyze and synthesize previous studies on CES techniques. To this end, 42 prominent publications from the existing literature, published from 2006 to August 2022, were selected and examined. The analysis showed that most of the articles used the normalized mutual information (NMI) measure to evaluate cluster quality, and that varying the initial parameter values was the more commonly used method for generating diversity. CES has not yet been applied to text data; in addition, the trade-off between diversity and quality (considering both at the same time) can be studied and evaluated in future work.
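
As an illustration of the ideas summarized above, the following minimal Python sketch scores each member of a small library of base clusterings by its mean NMI with the other members and keeps the top k. It is a sketch under stated assumptions, not the method of any surveyed paper: the helper names (`entropy`, `nmi`, `select_by_quality`), the toy `library`, the square-root normalization of NMI, and the use of mean pairwise NMI as a quality proxy are all choices made here for illustration.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    """Shannon entropy (natural log) of a label vector."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(a, b):
    """Normalized mutual information of two partitions (sqrt normalization)."""
    n = len(a)
    ca, cb, joint = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum((c / n) * log(c * n / (ca[x] * cb[y]))
             for (x, y), c in joint.items())
    denom = sqrt(entropy(a) * entropy(b))
    if denom == 0:
        # At least one partition is a single cluster; treat two trivial
        # partitions as identical and everything else as unrelated.
        return 1.0 if len(ca) == 1 and len(cb) == 1 else 0.0
    return mi / denom


def select_by_quality(library, k):
    """Indices of the k members with the highest mean NMI to the others.

    Assumption: mean pairwise NMI stands in for 'quality'; real CES
    methods often balance this against diversity (e.g. 1 - NMI).
    """
    def quality(i):
        scores = [nmi(library[i], library[j])
                  for j in range(len(library)) if j != i]
        return sum(scores) / len(scores)

    ranked = sorted(range(len(library)), key=quality, reverse=True)
    return sorted(ranked[:k])


# Hypothetical library of four base clusterings over six data items.
library = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition as the first, labels swapped
    [0, 0, 1, 1, 2, 2],
    [0, 1, 0, 1, 0, 1],
]
selected = select_by_quality(library, k=2)
```

In this toy library, ranking by quality alone selects the first two members, which encode the same partition under different labels. That redundancy is one way to see why the trade-off between quality and diversity noted in the abstract matters.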

Keywords


Articles in Press, Corrected Proof
Available Online from 11 February 2023
  • Receive Date: 02 October 2022
  • Revise Date: 09 December 2022
  • Accept Date: 19 December 2022
  • First Publish Date: 11 February 2023