An Evaluation of Four Resampling Methods Used in Machine Learning Classification
Published in: IEEE Intelligent Systems, Vol. 36, No. 3, pp. 51-57
Main Author:
Format: Journal Article
Language: English
Published: Los Alamitos: IEEE, 01-05-2021 (The Institute of Electrical and Electronics Engineers, Inc.)
Subjects:
Summary: This article investigates resampling methods used to evaluate the performance of machine learning classification algorithms. It compares four key resampling methods: 1) Monte Carlo resampling, 2) the Bootstrap Method, 3) k-fold Cross Validation, and 4) Repeated k-fold Cross Validation. Two classification algorithms, Support Vector Machines and Random Forests, applied to three datasets, are used in this article. Nine variations of the four resampling methods are used to tune parameters on the two classification algorithms on each of the three datasets. Performance is defined by how well the resampling method chooses a parameter value that fits the data well. A main finding is that Repeated k-fold Cross Validation, overall, outperforms the other resampling methods in selecting the best-fit parameter value across the three different datasets.
ISSN: 1541-1672, 1941-1294
DOI: 10.1109/MIS.2020.2978066
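The Summary describes Repeated k-fold Cross Validation, the method the article finds most reliable for parameter tuning: plain k-fold is run several times with a fresh shuffle each repeat, so every parameter candidate is scored on many different train/test splits. A minimal, self-contained sketch of the index-splitting scheme (illustrative only, not the article's code; the function names `kfold_indices` and `repeated_kfold_indices` are assumptions for this example):

```python
import random

def kfold_indices(n, k, seed=0):
    """Yield (train, test) index lists for one pass of k-fold CV."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    # Deal shuffled indices round-robin into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

def repeated_kfold_indices(n, k, repeats, seed=0):
    """Repeat k-fold CV with a different shuffle each repeat, so a
    tuned parameter is scored on k * repeats distinct splits."""
    for r in range(repeats):
        yield from kfold_indices(n, k, seed=seed + r)

# Example: 12 samples, 3 folds, 2 repeats -> 6 train/test splits,
# and within each repeat every sample appears in exactly one test fold.
splits = list(repeated_kfold_indices(12, 3, 2))
```

Averaging a model's score over all `k * repeats` test folds for each candidate parameter value, then picking the best average, is the tuning scheme being compared against Monte Carlo resampling and the bootstrap.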