Performance analysis of four machine learning algorithms for the accurate prediction of metastatic disease in cutaneous squamous cell carcinoma

Bibliographic Details
Published in:Journal of clinical oncology Vol. 41; no. 16_suppl; p. e13579
Main Authors: Andrew, Tom William, Bolnykh, Iakov, Bowes, Amy Louise, Fernando, Suhari Arahliya Serendibsha Grace, Nair, Ashvati, Martin, Sabrina Noor Pia, Maan, Balraj, Sloan, Philip, Lovat, Penny, Rose, Aidan
Format: Journal Article
Language:English
Published: 01-06-2023
Description
Summary: e13579 Background: Cutaneous squamous cell carcinoma (cSCC) is the most common form of metastasising skin cancer. Whilst rates of metastatic cSCC are low, metastatic cases account for a significant proportion of skin cancer-related morbidity and mortality, particularly within elderly cohorts, and pose a significant burden to healthcare services. Established cSCC tumour staging systems perform poorly at predicting metastatic risk. Additionally, clinically validated prognostic biomarkers are lacking, highlighting the unmet need for novel risk stratification tools to guide clinical practice and improve outcomes for patients with advanced disease. We aimed to train four recognised machine learning (ML) algorithms on a large clinicopathological dataset of primary cSCC, with the objective of optimising an ML strategy and developing a reliable and clinically useful risk stratification tool capable of accurately predicting metastatic events following primary cSCC. Methods: A dataset of primary cSCC registrations was derived from the Northern Cancer Registry, UK. This identified 7003 histologically confirmed primary cSCC registered between 2010 and 2020, providing a minimum of 2 years of clinical follow-up. We conducted a retrospective analysis of standardised pathology datasets, recording clinicopathological features. The primary outcome measure was regional and/or distant metastasis. Four machine learning algorithms were trained on these features: a Logistic Regression Trainer, a Decision Tree Classifier, a Random Forest Classifier and a fully connected artificial neural network (ANN). The algorithms were optimised on training data using five-fold cross-validation. Subgroup analysis was performed using mean Shapley additive explanations (SHAP). Results: Accuracy scoring identified the ANN as the optimal predictor of metastasis (0.94), followed by the Logistic Regression Trainer (0.82), Random Forest Classifier (0.80) and Decision Tree Classifier (0.71). Preliminary subgroup analysis identified immunosuppression as the most sensitive risk factor for developing metastatic disease (SHAP = 0.122). Conclusions: Significant heterogeneity in current morbidity and mortality data has limited the capacity of traditional statistical models and tumour staging systems to identify very high-risk cSCC. Our findings demonstrate that ML algorithms can accurately predict metastatic events in cSCC populations. Further development of a model user interface is necessary to support the development of a useful risk stratification tool to guide clinical practice.
ISSN:0732-183X
1527-7755
DOI:10.1200/JCO.2023.41.16_suppl.e13579
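
The comparison described in the Methods (four classifier families scored by accuracy under five-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not name the software, features, hyperparameters or network architecture used, so the choice of scikit-learn, the synthetic data from `make_classification`, the class imbalance, the hidden-layer sizes and the helper name `compare_models` are all assumptions made purely for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, seed=0):
    """Return mean five-fold cross-validated accuracy for four classifier families.

    Hypothetical helper: model settings are illustrative defaults,
    not values reported in the abstract.
    """
    # Stratified folds keep the proportion of (rare) positive cases similar in each fold.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    models = {
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=seed),
        # A small fully connected network; layer sizes are an assumption.
        "ann": make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=300, random_state=seed),
        ),
    }
    return {
        name: float(cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean())
        for name, model in models.items()
    }


# Synthetic stand-in for a clinicopathological dataset (assumed 10% event rate).
X, y = make_classification(
    n_samples=1000,
    n_features=8,
    n_informative=4,
    weights=[0.9, 0.1],
    random_state=0,
)

scores = compare_models(X, y)
for name, accuracy in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {accuracy:.3f}")
```

Scores produced on synthetic data say nothing about the reported results. When the outcome is rare, accuracy alone can look high even for a model that rarely predicts the event, so a sketch like this is often extended with class-sensitive metrics such as recall or ROC AUC via the `scoring` argument.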