Optimizing Gene Selection for Cancer Classification Using Mutual Information and the Social Spider Algorithm
Date
2025
Publisher
IEEE
Abstract
Cancer remains a major global health challenge, underscoring the need for advanced methods that enable early and accurate diagnosis. While microarray technology facilitates high-throughput gene expression profiling, the high dimensionality of the resulting data complicates effective cancer classification. To address this limitation, we propose a novel hybrid approach that combines the Social Spider Optimization (SSO) algorithm with Mutual Information (MI)-based feature selection techniques, including Mutual Information Maximization (MIM), Joint Mutual Information (JMI), and Max-Relevance Min-Redundancy (MRMR), to identify the most discriminative genes. We evaluate four machine learning classifiers, Random Forest (RF), XGBoost (XGB), Neural Networks (NN), and Support Vector Machines (SVM), with and without feature selection. Our results demonstrate that SSO-enhanced feature selection significantly improves classification accuracy, with SVM paired with MRMR achieving near-perfect performance on leukemia and lymphoma datasets. Moreover, MIM and JMI exhibit competitive performance in reducing data redundancy and enhancing computational efficiency. The proposed method effectively optimizes feature selection, providing a robust framework for improved cancer diagnosis.
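To illustrate how two of the MI criteria named in the abstract differ, the following is a minimal sketch, not the authors' implementation. It assumes gene expression values have already been discretized, uses a tiny hand-built toy dataset, and omits the SSO search and the classifiers entirely. The function names (mutual_information, mim_select, mrmr_select) and the toy genes are hypothetical, introduced only for this example.

```python
import math
from collections import Counter
from itertools import product

def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )

def mim_select(columns, labels, k):
    """MIM: rank genes by relevance MI(gene; label) alone and keep the top k."""
    relevance = [mutual_information(col, labels) for col in columns]
    return sorted(range(len(columns)), key=lambda j: -relevance[j])[:k]

def mrmr_select(columns, labels, k):
    """Greedy mRMR: maximize relevance minus mean MI with already-selected genes."""
    relevance = [mutual_information(col, labels) for col in columns]
    selected = [max(range(len(columns)), key=relevance.__getitem__)]
    while len(selected) < k:
        def score(j):
            redundancy = sum(
                mutual_information(columns[j], columns[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy
        candidates = [j for j in range(len(columns)) if j not in selected]
        selected.append(max(candidates, key=score))
    return selected

# Toy data (assumption): 8 samples covering every combination of three bits.
samples = list(product([0, 1], repeat=3))
labels = [2 * a + b for a, b, c in samples]   # class depends on bits a and b
columns = [
    [a for a, b, c in samples],  # gene 0: informative (bit a)
    [a for a, b, c in samples],  # gene 1: exact duplicate of gene 0
    [b for a, b, c in samples],  # gene 2: informative and complementary (bit b)
    [c for a, b, c in samples],  # gene 3: unrelated to the class
]

mim_top2 = mim_select(columns, labels, 2)    # keeps the redundant duplicate
mrmr_top2 = mrmr_select(columns, labels, 2)  # prefers the complementary gene
print(mim_top2, mrmr_top2)
```

On this toy data, MIM keeps gene 0 and its duplicate because both are equally relevant, while mRMR penalizes the duplicate and selects the complementary gene 2 instead. Real microarray data is continuous, so a discretization step or a continuous MI estimator would be needed before applying either criterion.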
Keywords
Support vector machines, Visualization, Classification algorithms
