Directional Pairwise Class Confusion Bias and Its Mitigation

Bibliographic Details
Published in: 2022 IEEE 16th International Conference on Semantic Computing (ICSC), pp. 67-74
Main Authors: Sayenju, Sudhashree, Aygun, Ramazan, Boardman, Jonathan, Don, Duleep Prasanna Rathgamage, Zhang, Yifan, Franks, Bill, Johnston, Sereres, Lee, George, Sullivan, Dan, Modgil, Girish
Format: Conference Proceeding
Language: English
Published: IEEE 01-01-2022
Description
Summary: Recent advances in Natural Language Processing have led to powerful and sophisticated models like BERT (Bidirectional Encoder Representations from Transformers) that nevertheless exhibit bias. These models are mostly trained on text corpora that deviate in important ways from the text encountered by a chatbot in a problem-specific context. While much prior research has focused on measuring and mitigating bias with respect to protected attributes (e.g., stereotyping by gender, race, or ethnicity), there is a lack of research on model bias with respect to classification labels. We investigate whether a classification model strongly favors one class over another. We introduce a bias evaluation method called directional pairwise class confusion bias that highlights a chatbot intent classification model's bias on pairs of classes. Finally, we present two strategies to mitigate this bias using example biased pairs.
DOI: 10.1109/ICSC52841.2022.00017
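
The record above describes the method only at a high level, so the following is a minimal, hypothetical Python sketch of what a directional pairwise class confusion measure could look like: for each ordered pair of intent classes (i, j), compare the rate at which true-i examples are misclassified as j against the reverse direction, and flag pairs with a large asymmetry. The function name, the row-normalized rates, and the asymmetry gap are illustrative assumptions, not the authors' published formulation, which is available via the DOI above.

```python
# Hypothetical sketch of a directional pairwise class confusion bias score,
# computed from a confusion matrix. The asymmetry measure below is an
# illustrative assumption, not the paper's exact definition.
import numpy as np
from itertools import combinations

def directional_confusion_bias(conf_mat):
    """For each class pair (i, j), score how often true class i is
    misclassified as j, normalized by the number of class-i examples,
    minus the rate in the reverse direction. A large gap suggests the
    model favors one class of the pair over the other."""
    n = conf_mat.shape[0]
    row_totals = conf_mat.sum(axis=1, keepdims=True)
    # Per-row misclassification rates; guard against empty classes.
    rates = conf_mat / np.maximum(row_totals, 1)
    biased_pairs = []
    for i, j in combinations(range(n), 2):
        gap = rates[i, j] - rates[j, i]
        biased_pairs.append(((i, j), gap))
    # Most directionally biased pairs first.
    return sorted(biased_pairs, key=lambda p: abs(p[1]), reverse=True)

# Example: true class 0 is confused as class 1 far more often than the reverse.
cm = np.array([[80, 15, 5],
               [ 2, 95, 3],
               [ 4,  6, 90]])
for (i, j), gap in directional_confusion_bias(cm)[:3]:
    print(f"pair ({i} -> {j}): asymmetry {gap:+.3f}")
```

Under this reading, the most asymmetric pairs would correspond to the "example biased pairs" the abstract mentions as input to mitigation; how the paper's two mitigation strategies operate is not specified in this record.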