Detecting SQL Injection Web Attacks Using Ensemble Learners and Data Sampling

Bibliographic Details
Published in: 2021 IEEE International Conference on Cyber Security and Resilience (CSR), pp. 27-34
Main Authors: Zuech, Richard; Hancock, John; Khoshgoftaar, Taghi M.
Format: Conference Proceeding
Language: English
Published: IEEE, 26-07-2021
Description
Summary: SQL Injection web attacks are a common choice among attackers to exploit web servers. We explore classification performance in detecting SQL Injection web attacks in the recent CSE-CIC-IDS2018 dataset with the Area Under the Receiver Operating Characteristic Curve (AUC) metric for the following seven classifiers: Random Forest (RF), CatBoost (CB), LightGBM (LGB), XGBoost (XGB), Decision Tree (DT), Naive Bayes (NB), and Logistic Regression (LR). The first four are ensemble learners; the last three are single learners included for comparison. Our unique data preparation of CSE-CIC-IDS2018 affords a harsh experimental testbed of class imbalance, as encountered in the real world for cybersecurity attacks. To the best of our knowledge, we are the first to apply random undersampling techniques to web attacks from the CSE-CIC-IDS2018 dataset while exploring various sampling ratios. We find the ensemble learners to be the most effective at detecting SQL Injection web attacks, but only after first applying massive data sampling.
DOI: 10.1109/CSR51186.2021.9527990
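
The summary names two building blocks: random undersampling of the majority class at a chosen ratio, and scoring with AUC. The following is a minimal illustrative sketch of both in plain Python, not the authors' implementation. The function names, the `ratio` parameter (majority rows kept per minority row), the label convention (1 = attack, 0 = normal), and the synthetic labels are all assumptions made for this example; the abstract does not state which ratios were used.

```python
import random


def random_undersample(labels, ratio, seed=0):
    """Return sorted row indices after random undersampling (hypothetical helper).

    Keeps every minority row (label 1) and a random subset of majority rows
    (label 0) so that majority:minority is approximately ratio:1.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    keep = min(len(majority), int(round(ratio * len(minority))))
    return sorted(minority + rng.sample(majority, keep))


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    Tied scores receive the average of the ranks they span.
    Requires at least one positive and one negative label.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        positives_in_group = sum(1 for k in range(i, j) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * positives_in_group
        i = j
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


if __name__ == "__main__":
    # Synthetic, heavily imbalanced labels: 1000 normal rows, 10 attack rows.
    labels = [0] * 1000 + [1] * 10
    for ratio in (1, 3, 9):  # illustrative ratios only
        kept = random_undersample(labels, ratio)
        print(ratio, len(kept))
```

In a sketch like this, undersampling would typically be applied to the training rows only, with AUC computed on held-out rows that keep their original imbalance; whether the paper follows that protocol is not stated in the summary.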