Efficient Prequential AUC-PR Computation

Bibliographic Details
Published in:	2023 International Conference on Machine Learning and Applications (ICMLA), pp. 2222-2227
Main Authors:	Pereira Gomes, David L., Gregio, Andre, Zanata Alves, Marco A., Lisboa de Almeida, Paulo R.
Format:	Conference Proceeding
Language:	English
Published:	IEEE 15-12-2023
Subjects:	AUC-PR; Binary search trees; Classification algorithms; Costs; Data structures; Focusing; Machine learning algorithms; Measurement; metrics; prequential; stream

Description
Summary:	When dealing with classification problems for data streams, we often need to compute classification metrics in a prequential manner. The Area Under the Precision-Recall Curve (AUC-PR) metric is extensively used in imbalanced classification scenarios, where the negative class outnumbers the positive one. Despite its advantages, recomputing this metric every time a new test instance becomes available can be computationally expensive. In this work, we present an efficient algorithm to compute the AUC-PR prequentially. The proposed algorithm uses a self-balancing binary search tree, which avoids re-sorting the data when the AUC-PR value is updated with the most recent data. Experiments on six well-known, publicly available stream-based datasets show that our approach can be up to 13 times faster and use 12 times less energy than the traditional batch approach when considering a window of size 1,000.
ISSN:	1946-0759
DOI:	10.1109/ICMLA58977.2023.00335
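
Illustrative sketch:	The idea in the summary, keeping the window ordered so that each new instance does not trigger a full re-sort, can be sketched roughly as follows. This is not the authors' algorithm. It assumes a sliding window, binary labels encoded as 0/1, and the step-wise average-precision estimate of AUC-PR, and it swaps in a plain sorted Python list maintained with `bisect` in place of the self-balancing binary search tree, since the standard library does not provide one. The names `average_precision` and `PrequentialAUCPR` are hypothetical.

```python
from bisect import bisect_left, insort
from collections import deque


def average_precision(sorted_pairs):
    """Step-wise AUC-PR from (score, label) pairs sorted by ascending score."""
    total_pos = sum(label for _, label in sorted_pairs)
    if total_pos == 0:
        return 0.0  # undefined without positives; 0.0 is a convention assumed here
    tp = seen = 0
    area = 0.0
    prev_recall = 0.0
    i = len(sorted_pairs) - 1
    while i >= 0:  # walk from the highest score to the lowest
        score = sorted_pairs[i][0]
        # Instances tied at the same score form a single threshold.
        while i >= 0 and sorted_pairs[i][0] == score:
            tp += sorted_pairs[i][1]
            seen += 1
            i -= 1
        recall = tp / total_pos
        precision = tp / seen
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


class PrequentialAUCPR:
    """Sliding-window AUC-PR that keeps the window ordered between updates."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.arrival = deque()  # pairs in arrival order, used to evict the oldest
        self.ordered = []       # the same pairs, kept sorted by (score, label)

    def update(self, score, label):
        pair = (score, label)
        if len(self.arrival) == self.window_size:
            oldest = self.arrival.popleft()
            # Binary search locates an equal pair; no re-sort is required.
            del self.ordered[bisect_left(self.ordered, oldest)]
        self.arrival.append(pair)
        insort(self.ordered, pair)  # ordered insertion of the new instance
        return average_precision(self.ordered)


# Usage: feed (score, true label) pairs as they arrive from the stream.
metric = PrequentialAUCPR(window_size=3)
stream = [(0.9, 1), (0.8, 0), (0.7, 1), (0.6, 0), (0.4, 1), (0.95, 0)]
history = [metric.update(score, label) for score, label in stream]
```

In this sketch, inserting into or deleting from the list still shifts elements, so each update costs linear time in the window size; a balanced tree would make those two steps logarithmic. The sketch also traverses the whole window to recompute the metric on every update, so it only illustrates the avoided re-sort and makes no claim about matching the costs reported in the paper.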