Optimizing Gene Selection for Cancer Classification Using Mutual Information and the Social Spider Algorithm

Date

2025

Publisher

IEEE

Abstract

Cancer remains a major global health challenge, underscoring the need for advanced methods that enable early and accurate diagnosis. Microarray technology facilitates high-throughput gene expression profiling, but the high dimensionality of the resulting data makes effective cancer classification difficult. To address this limitation, we propose a novel hybrid approach that combines the Social Spider Optimization (SSO) algorithm with Mutual Information (MI)-based feature selection techniques, namely Mutual Information Maximization (MIM), Joint Mutual Information (JMI), and Max-Relevance Min-Redundancy (MRMR), to identify the most discriminative genes. We evaluate four machine learning classifiers, Random Forest (RF), XGBoost (XGB), Neural Networks (NN), and Support Vector Machines (SVM), with and without feature selection. Our results demonstrate that SSO-enhanced feature selection significantly improves classification accuracy, with SVM paired with MRMR achieving near-perfect performance on leukemia and lymphoma datasets. Moreover, MIM and JMI exhibit competitive performance in reducing data redundancy and enhancing computational efficiency. The proposed method effectively optimizes feature selection, providing a robust framework for improved cancer diagnosis.
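
To illustrate how two of the MI-based criteria named in the abstract differ, the following is a minimal sketch and not the authors' implementation. It assumes gene expression values have already been discretized, uses only the Python standard library, and omits the SSO search and the JMI criterion entirely. The gene names (`g1` to `g4`), the toy data, and the function names are all hypothetical. MIM ranks genes by relevance to the class label alone, while the greedy MRMR variant shown here subtracts the mean mutual information with genes already selected.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def mim_rank(features, labels):
    """MIM: rank features by I(feature; label), highest first."""
    scores = {name: mutual_information(col, labels)
              for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


def mrmr_select(features, labels, k):
    """Greedy MRMR: relevance minus mean redundancy with selected features."""
    relevance = {name: mutual_information(col, labels)
                 for name, col in features.items()}
    selected, remaining = [], set(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                mutual_information(features[name], features[s])
                for s in selected
            ) / len(selected)
            return relevance[name] - redundancy
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 8 samples, binary class label, 4 discretized genes.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "g1": [0, 0, 0, 1, 1, 1, 1, 1],  # strongly related to the label
    "g2": [0, 0, 0, 1, 1, 1, 1, 1],  # exact copy of g1 (fully redundant)
    "g3": [0, 1, 0, 0, 1, 1, 1, 0],  # weaker, but not a copy of g1
    "g4": [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the label
}

print("MIM top 2: ", mim_rank(features, labels)[:2])
print("MRMR top 2:", mrmr_select(features, labels, 2))
```

On this toy data MIM keeps both copies of the same gene, whereas the MRMR sketch replaces the duplicate with a less relevant but non-redundant gene. In a pipeline like the one the abstract describes, a subset chosen this way would then be passed to a classifier such as an SVM.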

Keywords

Support vector machines, Visualization, Classification algorithms
