Optimizing Vision Transformers for Histopathology: Pretraining and Normalization in Breast Cancer Classification

Bibliographic Details
Published in:	Journal of Imaging, Vol. 10, no. 5, p. 108
Main Authors:	Baroni, Giulia Lucrezia; Rasotto, Laura; Roitero, Kevin; Tulisso, Angelica; Di Loreto, Carla; Della Mea, Vincenzo
Format:	Journal Article
Language:	English
Published:	Switzerland: MDPI AG, 01-05-2024
Subjects:	Accuracy; Artificial intelligence; Automation; Breast cancer; Classification; Color; Comparative analysis; Configurations; Data augmentation; Datasets; Deep learning; Effectiveness; Histology; Histology, Pathological; Histopathology; Identification and classification; Image classification; Machine vision; Medical imaging; Neural networks; Normalization; Prostate; Technology application; Transformers; Italy

Description
Summary:	This paper introduces a self-attention Vision Transformer model developed specifically for classifying breast cancer in histology images. We examine a range of training strategies and configurations, including pretraining, dimension resizing, data augmentation, color normalization, patch overlap, and patch size, to evaluate their impact on the effectiveness of histology image classification. We also provide evidence that geometric and color data augmentation techniques increase effectiveness. We primarily use the BACH dataset to train and validate our methods and models, and we test them on two additional datasets, BRACS and AIDPATH, to verify their generalization capabilities. Our model, developed from a transformer pretrained on ImageNet, achieves an accuracy of 0.91 on the BACH dataset, 0.74 on the BRACS dataset, and 0.92 on the AIDPATH dataset. Using models based on the prostate small and prostate medium HistoEncoder models, we achieve accuracies of 0.89 and 0.86, respectively. Our results suggest that pretraining on large-scale general datasets like ImageNet is advantageous. We also show the potential benefits of domain-specific pretraining datasets, such as the extensive histopathological image collections used for HistoEncoder, though these do not yet yield a clear advantage.
ISSN:	2313-433X
DOI:	10.3390/jimaging10050108