Man against machine reloaded: performance of a market-approved convolutional neural network in classifying a broad spectrum of skin lesions in comparison with 96 dermatologists working under less artificial conditions

Bibliographic Details
Published in: Annals of Oncology, Vol. 31, no. 1, pp. 137-143
Main Authors: Haenssle, H.A., Fink, C., Toberer, F., Winkler, J., Stolz, W., Deinlein, T., Hofmann-Wellenhof, R., Lallas, A., Emmert, S., Buhl, T., Zutt, M., Blum, A., Abassi, M.S., Thomas, L., Tromme, I., Tschandl, P., Enk, A., Rosenberger, A., Alt, Christina, Bachelerie, Marie, Bajaj, Sonali, Balcere, Alise, Baricault, Sophie, Barthaux, Clément, Beckenbauer, Yvonne, Bertlich, Ines, Blum, Andreas, Bouthenet, Marie-France, Brassat, Sophie, Marcel Buck, Philipp, Buder-Bakhaya, Kristina, Cappelletti, Maria-Letizia, Chabbert, Cécile, De Labarthe, Julie, DeCoster, Eveline, Deinlein, Teresa, Dobler, Michèle, Dumon, Daphnée, Emmert, Steffen, Gachon-Buffet, Julie, Gusarov, Mikhail, Hartmann, Franziska, Hartmann, Julia, Herrmann, Anke, Hoorens, Isabelle, Hulstaert, Eva, Karls, Raimonds, Kolonte, Andreea, Kromer, Christian, Lallas, Aimilios, Le Blanc Vasseux, Céline, Levy-Roy, Annabelle, Majenka, Pawel, Marc, Marine, Bourret, Veronique Martin, Michelet-Brunacci, Nadège, Mitteldorf, Christina, Paroissien, Jean, Picard, Camille, Plise, Diana, Reymann, Valérie, Ribeaudeau, Fabrice, Richez, Pauline, Roche Plaine, Hélène, Salik, Deborah, Sattler, Elke, Schäfer, Sarah, Schneiderbauer, Roland, Secchi, Thierry, Talour, Karen, Trennheuser, Lukas, Wald, Alexander, Wölbing, Priscila, Zukervar, Pascale
Format: Journal Article
Language: English
Published: England Elsevier Ltd 01-01-2020
Description
Summary: Convolutional neural networks (CNNs) efficiently differentiate skin lesions by image analysis. Studies comparing a market-approved CNN in a broad range of diagnoses to dermatologists working under less artificial conditions are lacking.

One hundred cases of pigmented/non-pigmented skin cancers and benign lesions were used for a two-level reader study in 96 dermatologists (level I: dermoscopy only; level II: clinical close-up images, dermoscopy, and textual information). Additionally, dermoscopic images were classified by a CNN approved for the European market as a medical device (Moleanalyzer Pro, FotoFinder Systems, Bad Birnbach, Germany). Primary endpoints were the sensitivity and specificity of the CNN’s dichotomous classification in comparison with the dermatologists’ management decisions. Secondary endpoints included the dermatologists’ diagnostic decisions, their performance according to their level of experience, and the CNN’s area under the curve (AUC) of receiver operating characteristics (ROC).

The CNN revealed a sensitivity, specificity, and ROC AUC with corresponding 95% confidence intervals (CI) of 95.0% (95% CI 83.5% to 98.6%), 76.7% (95% CI 64.6% to 85.6%), and 0.918 (95% CI 0.866–0.970), respectively. In level I, the dermatologists’ management decisions showed a mean sensitivity and specificity of 89.0% (95% CI 87.4% to 90.6%) and 80.7% (95% CI 78.8% to 82.6%). With level II information, the sensitivity significantly improved to 94.1% (95% CI 93.1% to 95.1%; P < 0.001), while the specificity remained unchanged at 80.4% (95% CI 78.4% to 82.4%; P = 0.97). When fixing the CNN’s specificity at the mean specificity of the dermatologists’ management decision in level II (80.4%), the CNN’s sensitivity was almost equal to that of human raters, at 95% (95% CI 83.5% to 98.6%) versus 94.1% (95% CI 93.1% to 95.1%); P = 0.1. In contrast, dermatologists were outperformed by the CNN in their level I management decisions and level I and II diagnostic decisions. More experienced dermatologists frequently surpassed the CNN’s performance.

Under less artificial conditions and in a broader spectrum of diagnoses, the CNN and most dermatologists performed on the same level. Dermatologists are trained to integrate information from a range of sources rendering comparative studies that are solely based on one single case image inadequate.

• A market-approved convolutional neural network (CNN) trained on dermoscopic images was tested against 96 dermatologists.
• Test data included a broad range of skin lesions and was compiled from external sources not involved in CNN training.
• Dermatologists indicated their management decisions after reviewing clinical, dermoscopic, and textual case information.
• In this setting dermatologists performed on par with the CNN's classifications based on dermoscopic images alone.
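
As an illustration of how sensitivity and specificity with 95% confidence intervals, like those reported for the CNN in the summary, can be computed from a confusion matrix, here is a minimal Python sketch. Several things in it are assumptions and not taken from the record: the summary does not give the split of the 100 cases or the raw counts, so the counts below (38 of 40 malignant lesions flagged, 46 of 60 benign lesions cleared) are hypothetical values chosen only because they reproduce the reported 95.0% and 76.7%. The summary also does not name the interval method; the Wilson score interval is used here as one common choice, and the function name `wilson_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, total, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    if total <= 0:
        raise ValueError("total must be positive")
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half_width, centre + half_width


# Hypothetical confusion-matrix counts (assumption, not stated in the summary).
TP, FN = 38, 2    # malignant lesions: correctly flagged / missed
TN, FP = 46, 14   # benign lesions: correctly cleared / falsely flagged

sensitivity = TP / (TP + FN)
specificity = TN / (TN + FP)
sens_ci = wilson_interval(TP, TP + FN)
spec_ci = wilson_interval(TN, TN + FP)

print(f"sensitivity {sensitivity:.1%}, 95% CI {sens_ci[0]:.1%} to {sens_ci[1]:.1%}")
print(f"specificity {specificity:.1%}, 95% CI {spec_ci[0]:.1%} to {spec_ci[1]:.1%}")
```

Under these assumed counts the intervals come out close to the ones quoted in the summary, which suggests, but does not confirm, that a score-type interval on a 40/60 split was used.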
ISSN: 0923-7534, 1569-8041
DOI:10.1016/j.annonc.2019.10.013