Reducing Biases towards Minoritized Populations in Medical Curricular Content via Artificial Intelligence for Fairer Health Outcomes

Bibliographic Details
Main Authors:	Salavati, Chiman, Song, Shannon, Diaz, Willmar Sosa, Hale, Scott A, Montenegro, Roberto E, Murai, Fabricio, Dori-Hacohen, Shiri
Format:	Journal Article
Language:	English
Published:	21-05-2024
Subjects:	Computer Science - Computation and Language; Computer Science - Computers and Society

Description
Summary:	Biased information (recently termed bisinformation) continues to be taught in medical curricula, often long after it has been debunked. In this paper, we introduce BRICC, a first-in-class initiative that seeks to mitigate medical bisinformation by using machine learning to systematically identify and flag text with potential biases for subsequent review in an expert-in-the-loop fashion, greatly accelerating an otherwise labor-intensive process. A gold-standard BRICC dataset was developed over several years and contains over 12K pages of instructional materials. Medical experts meticulously annotated these documents for bias according to comprehensive coding guidelines, emphasizing gender, sex, age, geography, ethnicity, and race. Using this labeled dataset, we trained, validated, and tested medical bias classifiers. We test three classifier approaches: binary classifiers, either specific to one bias type or covering general bias; an ensemble of independently trained type-specific classifiers; and a multitask learning (MTL) model that predicts both general and type-specific biases. While MTL led to some improvement in F1-score on race bias detection, it did not outperform binary classifiers trained specifically on each task. On general bias detection, the binary classifier achieves an AUC of up to 0.923, a 27.8% improvement over the baseline. This work lays the foundations for debiasing medical curricula by exploring a novel dataset and evaluating different model training strategies. Hence, it offers new pathways for more nuanced and effective mitigation of bisinformation.
DOI:	10.48550/arxiv.2407.12680
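
Illustrative sketch (not taken from the paper): the summary mentions an ensemble of type-specific classifiers and reports results as AUC. The Python below shows one way such pieces could fit together. The max-pooling rule in `ensemble_general_score`, the toy scores, and all function names are assumptions made for illustration; the record does not say how the authors combine their classifiers. `relative_improvement` assumes that a figure such as "27.8% improvement" is a relative change over the baseline score, which the summary does not state explicitly.

```python
from typing import Dict, Sequence


def ensemble_general_score(type_scores: Dict[str, float]) -> float:
    """Combine type-specific bias probabilities into one general-bias score.

    Assumption: an excerpt counts as biased if any type-specific
    classifier fires, so the maximum probability is used.
    """
    return max(type_scores.values())


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive excerpt is scored
    above a random negative one (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def relative_improvement(model: float, baseline: float) -> float:
    """Relative change of a model score over a baseline score."""
    return (model - baseline) / baseline


# Toy data: hypothetical type-specific probabilities for four excerpts.
excerpts = [
    {"gender": 0.9, "race": 0.2},
    {"gender": 0.1, "race": 0.3},
    {"gender": 0.2, "race": 0.6},
    {"gender": 0.4, "race": 0.7},
]
labels = [1, 0, 1, 0]  # 1 = annotated as biased (made-up labels)

scores = [ensemble_general_score(e) for e in excerpts]
auc = roc_auc(labels, scores)
```

With real data, a library routine such as scikit-learn's `roc_auc_score` would normally replace the hand-written `roc_auc`; the pairwise version is used here only to keep the sketch dependency-free.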