Multiomics-Based Feature Extraction and Selection for the Prediction of Lung Cancer Survival

Bibliographic Details
Published in: International Journal of Molecular Sciences, Vol. 25, no. 7, p. 3661
Main Authors: Jaksik, Roman; Szumała, Kamila; Dinh, Khanh Ngoc; Śmieja, Jarosław
Format: Journal Article
Language: English
Published: Switzerland, MDPI AG, 01-04-2024
Description
Summary: Lung cancer is a global health challenge, hindered by delayed diagnosis and the disease's complex molecular landscape. Accurate patient survival prediction is critical, motivating the exploration of various -omics datasets using machine learning methods. Leveraging multi-omics data, this study seeks to enhance the accuracy of survival prediction by proposing new feature extraction techniques combined with unbiased feature selection. Two lung adenocarcinoma multi-omics datasets, originating from the TCGA and CPTAC-3 projects, were employed for this purpose, emphasizing gene expression, methylation, and mutations as the most relevant data sources that provide features for the survival prediction models. Additionally, gene set aggregation was shown to be the most effective feature extraction method for mutation and copy number variation data. Using the TCGA dataset, we identified 32 molecular features that allowed the construction of a 2-year survival prediction model with an AUC of 0.839. The selected features were additionally tested on an independent CPTAC-3 dataset, achieving an AUC of 0.815 in nested cross-validation, which confirmed the robustness of the identified features.
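Illustrative note: the summary names gene set aggregation as a feature extraction method for mutation data but does not give the aggregation rule. The sketch below is one possible reading, not the authors' implementation: it assumes each patient is represented by a set of mutated genes and scores each gene set by the fraction of its members that are mutated. The function name `aggregate_by_gene_set`, the patient identifiers, and the gene and set names are all hypothetical placeholders.

```python
from typing import Dict, Set


def aggregate_by_gene_set(
    mutations: Dict[str, Set[str]],
    gene_sets: Dict[str, Set[str]],
) -> Dict[str, Dict[str, float]]:
    """For each patient, return the fraction of genes mutated in each gene set.

    Assumption: "aggregation" means the share of set members that are mutated;
    the paper may use a different rule (for example counts or weighted scores).
    """
    features: Dict[str, Dict[str, float]] = {}
    for patient, mutated in mutations.items():
        features[patient] = {
            # Empty gene sets score 0.0 to avoid division by zero.
            name: (len(mutated & genes) / len(genes)) if genes else 0.0
            for name, genes in gene_sets.items()
        }
    return features


# Toy input: placeholder gene and set names, not real pathway definitions.
mutations = {
    "patient_1": {"GENE_A", "GENE_C"},
    "patient_2": {"GENE_B"},
    "patient_3": set(),
}
gene_sets = {
    "SET_1": {"GENE_A", "GENE_B"},
    "SET_2": {"GENE_C", "GENE_D", "GENE_E", "GENE_F"},
}

features = aggregate_by_gene_set(mutations, gene_sets)
for patient, row in features.items():
    print(patient, row)
```

Under this assumption, many sparse per-gene mutation indicators collapse into a few denser per-set features, which could then be passed to a feature selection step and a survival classifier.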
ISSN: 1422-0067, 1661-6596
DOI: 10.3390/ijms25073661