Optimizing Gene Selection for Cancer Classification Using Mutual Information and the Social Spider Algorithm
Date
2025
Authors
Cherif, Chahira
Maiza, Mohammed
Chouraqui, Samira
Taleb, A.
Publisher
IEEE
Abstract
Cancer remains a major global health challenge, underscoring the need for advanced methods that enable early and accurate diagnosis. Microarray technology facilitates high-throughput gene expression profiling, but the high dimensionality of the resulting data makes effective cancer classification difficult. To address this limitation, we propose a novel hybrid approach that combines the Social Spider Optimization (SSO) algorithm with Mutual Information (MI)-based feature selection techniques, namely Mutual Information Maximization (MIM), Joint Mutual Information (JMI), and Max-Relevance Min-Redundancy (MRMR), to identify the most discriminative genes. We evaluate four machine learning classifiers: Random Forest (RF), XGBoost (XGB), Neural Networks (NN), and Support Vector Machines (SVM), each with and without feature selection. Our results demonstrate that SSO-enhanced feature selection significantly improves classification accuracy, with SVM paired with MRMR achieving near-perfect performance on leukemia and lymphoma datasets. Moreover, MIM and JMI exhibit competitive performance in reducing data redundancy and enhancing computational efficiency. The proposed method effectively optimizes feature selection, providing a robust framework for improved cancer diagnosis.
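
The difference between two of the MI-based criteria named in the abstract, MIM and MRMR, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes gene expression values have already been discretized (here to 0/1), uses a plain plug-in MI estimate, omits the SSO search and the JMI criterion entirely, and runs on made-up toy data. The function names (`mutual_information`, `mim_rank`, `mrmr_select`) and the gene names are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum p(x,y) * log2(p(x,y) / (p(x) p(y))) over observed pairs, using raw counts.
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def mim_rank(features, labels, k):
    """MIM: keep the k features with the highest MI with the class label."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    return sorted(relevance, key=relevance.get, reverse=True)[:k]


def mrmr_select(features, labels, k):
    """MRMR (greedy): maximize relevance minus mean MI with already selected features."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected, candidates = [], list(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data (illustrative only): 12 samples, two balanced classes.
labels = [0] * 6 + [1] * 6
genes = {
    "gene_a": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1],  # strongly tracks the label
    "gene_b": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1],  # exact duplicate of gene_a
    "gene_c": [1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0],  # weaker, but errs on other samples
    "gene_d": [0, 1] * 6,                            # unrelated to the label
}

mim_top2 = mim_rank(genes, labels, k=2)
mrmr_top2 = mrmr_select(genes, labels, k=2)
print("MIM :", mim_top2)
print("MRMR:", mrmr_top2)
```

On this toy data MIM keeps both copies of the duplicated gene, because it scores each gene in isolation, while MRMR replaces the duplicate with `gene_c`, which is less relevant on its own but adds information the first pick lacks. Real microarray data are continuous, so a discretization step or a continuous MI estimator would be needed before applying anything like this.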
Keywords
Support vector machines, Visualization, Classification algorithms
