En-MinWhale: An Ensemble Approach Based on MRMR and Whale Optimization for Cancer Diagnosis

Bibliographic Details
Published in: IEEE Access, Vol. 11, pp. 113526-113542
Main Authors: Panigrahi, Amrutanshu, Pati, Abhilash, Sahu, Bibhuprasad, Das, Manmath Nath, Nayak, Debasish Swapnesh Kumar, Sahoo, Ghanashyam, Kant, Shashi
Format: Journal Article
Language: English
Published: Piscataway IEEE 2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Description
Summary: According to the WHO, cancer is a leading cause of mortality worldwide, accounting for ~10 million deaths by the end of 2020. The most common types include lung, breast, CNS, leukemia, colon, and cervical cancer. Early detection can decrease the death toll. According to the study, if cancer is identified at an early stage, the death rate can be reduced to ~85%. Machine learning (ML) has emerged as a significant tool for reducing the death toll. In ML-based cancer research, biopsy and microarray data are the main data sources. Biopsy data is less useful because it excludes a patient's genetic information, whereas microarray data includes it and is therefore better suited to detecting cancer. Microarray data brings its own problems, however, and high dimensionality is one of them. This article reports an ML-based ensemble model that tackles these microarray data issues and provides an effective model for cancer detection. The reported model uses Minimum Redundancy Maximum Relevance (MRMR) as the feature selection algorithm. The Whale Optimization Algorithm (WOA) is then applied to the MRMR-reduced dataset to select an optimal number of features without affecting microarray data relevance. Next, four ML classification models (Support Vector Machine, Decision Tree, Multilayer Perceptron, and Random Forest) are applied as base learners to make initial predictions. A voting ensemble technique then combines these initial predictions into an ensemble model for cancer prediction. The proposed En-MinWhale model is evaluated on six cancer microarray datasets: lung, leukemia, ovarian, CNS, breast, and colon cancer.
Finally, the performance of the proposed model is assessed using 11 evaluation metrics, including accuracy, precision, specificity, sensitivity, and F-β score. The En-MinWhale model achieves 94.09%, 95.83%, 94.86%, 95.00%, 94.85%, and 96.77% accuracy on the lung, leukemia, ovarian, CNS, breast, and colon cancer datasets, respectively, outperforming the other hybrid models considered, and can assist physicians in cancer diagnosis.
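
The last two stages named in the summary, voting over the base learners' predictions and scoring the result, can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not say whether hard or soft voting is used, so hard (majority) voting is assumed here, with ties going to the label seen first in model order. The function names `majority_vote` and `binary_metrics`, the binary 0/1 labels, and the toy prediction lists are all hypothetical and exist only to make the example runnable.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Hard voting: for each sample, pick the label most base learners chose.

    Assumption: ties go to the label encountered first in model order,
    which is how Counter.most_common orders equal counts.
    """
    n_samples = len(predictions_per_model[0])
    voted = []
    for i in range(n_samples):
        votes = [preds[i] for preds in predictions_per_model]
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


def binary_metrics(y_true, y_pred, positive=1, beta=1.0):
    """Compute a few of the metrics named in the summary from confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    sensitivity = safe_div(tp, tp + fn)  # also called recall
    specificity = safe_div(tn, tn + fp)
    accuracy = safe_div(tp + tn, len(pairs))
    b2 = beta * beta
    f_beta = safe_div((1 + b2) * precision * sensitivity,
                      b2 * precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_beta": f_beta,
    }


# Toy data (made up): true labels and one prediction list per base learner.
y_true = [1, 1, 1, 0, 0, 0]
svm_pred = [1, 1, 0, 0, 0, 1]
tree_pred = [1, 0, 1, 0, 1, 0]
mlp_pred = [1, 1, 1, 0, 0, 1]
rf_pred = [0, 1, 1, 0, 0, 0]

ensemble_pred = majority_vote([svm_pred, tree_pred, mlp_pred, rf_pred])
metrics = binary_metrics(y_true, ensemble_pred, positive=1, beta=1.0)
print(ensemble_pred)
print(metrics)
```

With four base learners a 2-2 split is possible, so any hard-voting scheme needs some tie-breaking rule; the one used above is only one option. Setting `beta` above 1 weights sensitivity more heavily than precision in the F-β score, and below 1 does the opposite.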
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3318261