Document clustering analysis based on hybrid cuckoo search and K-means algorithm
Date
2021
Publisher
IEEE
Abstract
Clustering is a useful technique for unsupervised organization of
documents on the World Wide Web (WWW). The most widely used
partitioning clustering algorithm is K-means. However, its random
initialization can cause it to converge to a local optimum.
Metaheuristic-based clustering has proven effective at reaching a
global solution rather than a local one, and Cuckoo Search (CS) has
been widely applied to the clustering problem. However, the number of
iterations grows dramatically when the dataset is high-dimensional,
as document collections are.
This study analyzes the hybridization of the cuckoo search and K-means
algorithms for document clustering. Three hybrid algorithms are
investigated and compared. The performance and efficiency of the
proposed algorithms are evaluated on the Reuters-21578 Text
Categorization Benchmark Dataset. The results show that the new
approaches produce more compact clusters and improve clustering
quality as measured by purity and F-measure.
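The abstract names purity and F-measure as its quality measures but does not define them. The sketch below uses the definitions commonly applied in document clustering, which may differ in detail from the paper's: purity is the fraction of documents that belong to the majority class of their cluster, and the clustering F-measure takes, for each class, the best F1 score over all clusters and averages these weighted by class size. The function names and the toy labels are assumptions made for illustration only.

```python
from collections import Counter


def purity(labels_true, labels_pred):
    """Fraction of documents that fall in the majority class of their cluster."""
    per_cluster = {}
    for cls, clu in zip(labels_true, labels_pred):
        per_cluster.setdefault(clu, Counter())[cls] += 1
    majority_total = sum(max(c.values()) for c in per_cluster.values())
    return majority_total / len(labels_true)


def f_measure(labels_true, labels_pred):
    """Class-size-weighted average of the best per-class F1 over all clusters."""
    n = len(labels_true)
    class_sizes = Counter(labels_true)
    cluster_sizes = Counter(labels_pred)
    joint = Counter(zip(labels_true, labels_pred))
    total = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = joint[(cls, clu)]
            if n_ij == 0:
                continue
            precision = n_ij / n_j
            recall = n_ij / n_i
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_i / n) * best
    return total


# Toy example (assumed data): six documents, two classes, two clusters.
truth = ["a", "a", "a", "b", "b", "b"]
pred = [0, 0, 1, 1, 1, 1]
print(purity(truth, pred), f_measure(truth, pred))
```

Both measures lie in [0, 1] and equal 1 only when every cluster contains documents of a single class and every class is kept in a single cluster, so higher values indicate clusters that agree more closely with the reference categories.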
Keywords
Cuckoo Search, K-means, Document Clustering, Optimization, Metaheuristic, F-measure, Purity, Vector Space
