Attribute Selection, Outliers Impact Study and Visualization within Breast Cancer Detection


Bibliographic Details
Published in: 2023 IEEE 13th International Conference on Electronics and Information Technologies (ELIT), pp. 1-5
Main Authors: Chuiko, Gennady; Dvornik, Olga; Darnapuk, Yeugen; Honcharov, Denis; Krainyk, Yaroslav; Yaremchuk, Olga
Format: Conference Proceeding
Language: English
Published: IEEE 26-09-2023
Description
Summary: Classification of mammography data into two types of breast tumors, benign or malignant, is an effective screening tool and the primary way of diagnosis and decision-making. This report aims to select the most relevant attributes of the well-known Wisconsin Breast Cancer Diagnostic Data Set in order, first of all, to reduce its size. The reduction was performed first by ranking attributes and then by "decision tree" analysis. The reduced data set had only six attributes, against 31 in the initial one. The five most relevant attributes were the following: "perimeter_worst," "area_worst," "concave points_worst," "texture_mean," and "concave points_mean." Where possible, this reduction should be done without losing classification performance. In addition, our goal was to find classifiers that provide acceptable performance while allowing the results to be visualized in a way accessible to clinicians. Here, we mean various visualization tools in the machine learning framework, such as "decision trees," association rules, and attribute ranking, used to improve breast cancer diagnosis.
DOI: 10.1109/ELIT61488.2023.10310922
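
Illustrative sketch: The two-stage reduction described in the summary (attribute ranking followed by a decision tree) can be tried out roughly as below. This is only a minimal sketch under stated assumptions, not the authors' procedure: the record does not say which toolkit, ranking criterion, or tree settings were used, so scikit-learn, mutual information as the ranking score, `top_k = 5`, `max_depth=3`, and 10-fold cross-validation are all assumptions made here. The copy of the Wisconsin Diagnostic data bundled with scikit-learn also names its columns differently (for example "worst perimeter" rather than "perimeter_worst"), and the attributes it ranks highest need not match the five listed above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier, export_text

# Load scikit-learn's bundled copy of the Wisconsin Diagnostic data set.
data = load_breast_cancer()
X, y = data.data, data.target
names = [str(n) for n in data.feature_names]

# Stage 1 (assumed criterion): rank attributes by mutual information with the class.
scores = mutual_info_classif(X, y, random_state=0)
ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)

# Keep the top-ranked attributes; top_k = 5 is an assumption for illustration.
top_k = 5
selected = [name for name, _ in ranked[:top_k]]
X_small = X[:, [names.index(name) for name in selected]]

# Stage 2: compare a shallow decision tree on the full and the reduced data.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
acc_full = cross_val_score(tree, X, y, cv=10).mean()
acc_small = cross_val_score(tree, X_small, y, cv=10).mean()

# Fit on the reduced data and print the tree as human-readable rules.
tree.fit(X_small, y)
rules = export_text(tree, feature_names=selected)

print("Selected attributes:", selected)
print(f"10-fold accuracy, all attributes:     {acc_full:.3f}")
print(f"10-fold accuracy, reduced attributes: {acc_small:.3f}")
print(rules)
```

Comparing the two accuracy figures gives a rough check on whether the reduction costs classification performance, and the printed rules are one example of the kind of tree visualization the summary has in mind for clinicians.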