An Evaluation of Four Resampling Methods Used in Machine Learning Classification

Bibliographic Details
Published in: IEEE Intelligent Systems, Vol. 36, No. 3, pp. 51-57
Main Author: Nakatsu, Robbie T.
Format: Journal Article
Language:English
Published: Los Alamitos: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01-05-2021
Description
Summary: This article investigates resampling methods used to evaluate the performance of machine learning classification algorithms. It compares four key resampling methods: 1) Monte Carlo resampling, 2) the Bootstrap Method, 3) k-fold Cross Validation, and 4) Repeated k-fold Cross Validation. Two classification algorithms, Support Vector Machines and Random Forests, are applied to three datasets. Nine variations of the four resampling methods are used to tune parameters of the two classification algorithms on each of the three datasets. Performance is defined by how well the resampling method chooses a parameter value that fits the data well. A main finding is that Repeated k-fold Cross Validation, overall, outperforms the other resampling methods in selecting the best-fit parameter value across the three datasets.
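The winning method in the study, Repeated k-fold Cross Validation, partitions the data into k folds, scores each fold while training on the rest, then reshuffles and repeats the whole procedure several times before averaging. The sketch below illustrates that idea in plain Python on a toy thresholding "classifier"; the classifier, its threshold parameter, and the data are hypothetical stand-ins (the article itself tunes SVM and Random Forest parameters on three real datasets, none of which are reproduced here).

```python
import random

def repeated_kfold_indices(n, k=5, repeats=3, seed=0):
    """Yield (train_idx, test_idx) pairs: k-fold CV repeated `repeats`
    times, reshuffling the example order before each repetition."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
        for i in range(k):
            test = folds[i]
            train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
            yield train, test

# Toy "classifier": predict class 1 if x > threshold. The threshold plays
# the role of a tunable hyperparameter; it needs no fitting, so the train
# indices go unused in this illustration.
def fold_accuracy(threshold, xs, ys, test_idx):
    return sum((xs[i] > threshold) == ys[i] for i in test_idx) / len(test_idx)

xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

# Score each candidate parameter value by its mean accuracy over all
# repeated folds, then pick the best-scoring one.
scores = {}
for thr in (0.25, 0.5, 0.75):
    accs = [fold_accuracy(thr, xs, ys, test)
            for _, test in repeated_kfold_indices(len(xs), k=4, repeats=5)]
    scores[thr] = sum(accs) / len(accs)

best = max(scores, key=scores.get)
print(best)  # 0.5 separates the two classes perfectly on this toy data
```

Averaging over repeats is what distinguishes this from plain k-fold Cross Validation: each repetition uses a different random partition, which reduces the variance of the accuracy estimate that drives the parameter choice.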
ISSN: 1541-1672
EISSN: 1941-1294
DOI: 10.1109/MIS.2020.2978066