Evaluation of deep learning techniques for identification of sarcoma-causing carcinogenic mutations

Bibliographic Details
Published in:	Digital health Vol. 8; p. 20552076221133703
Main Authors:	Shah, Asghar Ali; Alturise, Fahad; Alkhalifah, Tamim; Khan, Yaser Daanial
Format:	Journal Article
Language:	English
Published:	London, England: SAGE Publications, 2022
Subjects:	Algorithms; Cancer; Deep learning; Genes; Mutation; Original Research; Sarcoma; gated recurrent units and bi-directional LSTM; self-consistency test; receiver operating characteristic (ROC) curve; 10-fold cross-validation test; independent set test; Long and short-term memory (LSTM) network; sarcoma cancer

Description
Summary:	Cancer is the abnormal growth of otherwise healthy human cells. Sarcoma is one of its major types; it arises mostly in bone and soft tissue and commonly occurs in children. According to a survey conducted in the United States of America, more than 17,000 sarcoma patients are registered each year, which is 15% of all cancer cases. Recognizing cancer at an early stage saves many lives. The proposed study developed a framework for the early detection of human sarcoma using deep learning Recurrent Neural Network (RNN) algorithms. The DNA of a human cell contains 25,000 to 30,000 genes, each represented by a sequence of nucleotides. The nucleotides in the sequence of a driver gene can change; such changes are termed mutations, and some mutations can cause cancer. There are seven types of gene whose mutation causes sarcoma. The dataset used in the study was taken from more than 134 samples and includes 141 mutations in 8 driver genes. Three RNN algorithms are trained on these gene sequences: Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), and Bi-directional LSTM (Bi-LSTM). The results are validated with rigorous testing techniques: self-consistency testing, independent set testing, and 10-fold cross-validation. These validation techniques yield several metrics, including Area Under the Curve (AUC), sensitivity, specificity, Matthews correlation coefficient, loss, and accuracy. The proposed algorithm exhibits an accuracy of 99.6% with an AUC value of 1.00.
ISSN:	2055-2076
DOI:	10.1177/20552076221133703
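
The summary names 10-fold cross-validation and several evaluation metrics (sensitivity, specificity, accuracy, Matthews correlation coefficient) without showing how they are computed. The following is a minimal sketch of the standard definitions, not the authors' code. The function names `k_fold_indices` and `binary_metrics` and the confusion-matrix counts in the usage lines are hypothetical and chosen only for illustration; they are not results from the study.

```python
import math


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for a plain k-fold split.

    Hypothetical helper: samples are assigned to folds round-robin,
    with no shuffling or class stratification.
    """
    indices = list(range(n_samples))
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Illustrative counts only (assumed, not taken from the paper).
metrics = binary_metrics(tp=50, fp=10, tn=30, fn=10)
splits = list(k_fold_indices(25, k=10))
```

In a cross-validation run, each of the ten splits would train a model on the `train` indices and score it on the `test` indices, and the per-fold metrics would then be averaged.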