Interpretable survival prediction for colorectal cancer using deep learning

Bibliographic Details
Published in:	NPJ digital medicine Vol. 4; no. 1; p. 71
Main Authors:	Wulczyn, Ellery; Steiner, David F.; Moran, Melissa; Plass, Markus; Reihs, Robert; Tan, Fraser; Flament-Auvigne, Isabelle; Brown, Trissia; Regitnig, Peter; Chen, Po-Hsuan Cameron; Hegde, Narayan; Sadhwani, Apaar; MacDonald, Robert; Ayalew, Benny; Corrado, Greg S.; Peng, Lily H.; Tse, Daniel; Müller, Heimo; Xu, Zhaoyang; Liu, Yun; Stumpe, Martin C.; Zatloukal, Kurt; Mermel, Craig H.
Format:	Journal Article
Language:	English
Published:	London; Nature Publishing Group UK; 19-04-2021; Nature Publishing Group; Nature Portfolio
Subjects:	639/705/117; 692/308/53/2422; 692/699/67/1504/1885/1393; Biomedicine; Biotechnology; Colorectal cancer; Deep learning; Digital technology; Health informatics; Medical prognosis; Medicine; Medicine & Public Health

Description
Summary:	Deriving interpretable prognostic features from deep-learning-based prognostic histopathology models remains a challenge. In this study, we developed a deep learning system (DLS) for predicting disease-specific survival for stage II and III colorectal cancer using 3652 cases (27,300 slides). When evaluated on two validation datasets containing 1239 cases (9340 slides) and 738 cases (7140 slides), respectively, the DLS achieved a 5-year disease-specific survival AUC of 0.70 (95% CI: 0.66–0.73) and 0.69 (95% CI: 0.64–0.72), and added significant predictive value to a set of nine clinicopathologic features. To interpret the DLS, we explored the ability of different human-interpretable features to explain the variance in DLS scores. We observed that clinicopathologic features such as T-category, N-category, and grade explained a small fraction of the variance in DLS scores (R² = 18% in both validation sets). Next, we generated human-interpretable histologic features by clustering embeddings from a deep-learning-based image-similarity model and showed that they explained the majority of the variance (R² of 73–80%). Furthermore, the clustering-derived feature most strongly associated with high DLS scores was also highly prognostic in isolation. With a distinct visual appearance (poorly differentiated tumor cell clusters adjacent to adipose tissue), this feature was identified by annotators with 87.0–95.5% accuracy. Our approach can be used to explain predictions from a prognostic deep learning model and uncover potentially novel prognostic features that can be reliably identified by people for future validation studies.
ISSN:	2398-6352
DOI:	10.1038/s41746-021-00427-2
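
Illustrative sketch: The summary describes clustering image embeddings into histologic features and measuring how much of the variance in DLS scores those features explain (R²). The Python sketch below shows the general shape of such an analysis. It is not the authors' implementation. Everything in it is an assumption made for illustration: the embeddings and scores are synthetic, plain k-means stands in for the paper's clustering of image-similarity embeddings, ordinary least squares stands in for whatever regression the authors used, and the names kmeans, cluster_fractions, r_squared, and demo are hypothetical.

```python
import numpy as np


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in clustering method)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points (fancy indexing copies).
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    # Final assignment against the last centroid update.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)


def cluster_fractions(labels, case_ids, n_cases, k):
    """Per-case fraction of patches assigned to each cluster."""
    counts = np.zeros((n_cases, k))
    np.add.at(counts, (case_ids, labels), 1.0)
    return counts / counts.sum(axis=1, keepdims=True)


def r_squared(features, target):
    """In-sample R^2 of a least-squares fit with an intercept."""
    X = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    ss_res = np.sum((target - X @ coef) ** 2)
    ss_tot = np.sum((target - target.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def demo(seed=0):
    """Synthetic example: all sizes and weights below are arbitrary."""
    rng = np.random.default_rng(seed)
    n_cases, patches_per_case, dim, n_types, k = 200, 30, 8, 4, 6
    centers = rng.normal(scale=5.0, size=(n_types, dim))
    # Each synthetic case has its own mixture of latent patch types.
    mix = rng.dirichlet(np.ones(n_types), size=n_cases)
    case_ids = np.repeat(np.arange(n_cases), patches_per_case)
    types = np.array([rng.choice(n_types, p=mix[c]) for c in case_ids])
    embeddings = centers[types] + rng.normal(scale=0.5, size=(len(types), dim))
    # Synthetic stand-in for a per-case model score driven by the mixture.
    weights = np.array([-1.0, 0.0, 1.0, 2.0])
    scores = mix @ weights + rng.normal(scale=0.1, size=n_cases)
    _, labels = kmeans(embeddings, k, seed=seed)
    fractions = cluster_fractions(labels, case_ids, n_cases, k)
    return fractions, r_squared(fractions, scores)


if __name__ == "__main__":
    _, r2 = demo()
    print(f"Variance in synthetic scores explained by cluster fractions: R^2 = {r2:.2f}")
```

The R² printed here reflects only the synthetic setup and says nothing about the values reported in the summary. The design point is that each case is summarised by how its patches distribute across clusters, and that summary is then regressed against the model's score.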