Analysis of Flight Delay Data Using Different Machine Learning Algorithms

Bibliographic Details
Published in: 2022 New Trends in Civil Aviation (NTCA), pp. 57-62
Main Authors: S., Blessy Trencia Lincy S., Al Ali, Hannah, Majid, Ahmad Abdulla Abdulaziz Mohd, Alhammadi, Omeer Arif Abdelbaqi Abdalla, Aljassmy, Aysha Momen Yousuf Mohammed, Mukandavire, Zindoga
Format: Conference Proceeding
Language: English
Published: Czech Technical University in Prague 26-10-2022
Description
Summary: Accurate prediction of flight arrivals remains a challenge due to dynamic environments. On particularly challenging days, unforeseen peaks in flight volumes can stretch operational capacity and adversely affect the service levels provided. Anomaly detection is a growing field of study with a variety of approaches and applications. We performed exploratory data analysis to establish the factors that determine flight delays, using UK aviation operational data on flight volumes from 2015 to 2020. We then applied different machine learning and data analysis approaches to detect anomalies in the data. We adopted both supervised and unsupervised algorithms applicable to the data set because of their strength in analyzing aviation operational data, which is usually unlabeled. We applied the K-means clustering algorithm, the K-Nearest Neighbors (KNN) algorithm, support vector machine (SVM), and XGBoost to the data to determine the most efficient model for predicting flight delays. Our results suggest that average delay times have generally decreased over the years in the UK. The XGBoost and SVM algorithms performed well on the training data but poorly on the test data. The KNN algorithm, on the other hand, gave an accuracy of 75.47% on the training data and 76.31% on the test data. These results suggest that KNN will perform better on new, unseen data and is a more suitable algorithm than SVM and XGBoost for predicting flight delays.
ISSN: 2694-7854
DOI: 10.23919/NTCA55899.2022.9934398
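
Illustrative note: the summary compares models by their accuracy on training data versus held-out test data. The sketch below shows what such a comparison might look like for a KNN classifier. It is not the authors' code and does not use their data: the synthetic generator `make_synthetic_flights`, its two features (normalized flight volume and departure hour), the delay rule, the 80/20 split, and `k=5` are all assumptions made purely for illustration, and the accuracies it prints have no relation to the figures reported in the paper.

```python
import random
from collections import Counter


def make_synthetic_flights(n, seed=0):
    """Generate hypothetical (features, label) rows; NOT the UK data from the paper."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        volume = rng.uniform(0.0, 1.0)  # assumed feature: normalized daily flight volume
        hour = rng.uniform(0.0, 1.0)    # assumed feature: normalized departure hour
        # Assumed rule: busier, later flights are delayed more often, plus noise.
        score = 0.7 * volume + 0.3 * hour + rng.gauss(0.0, 0.15)
        rows.append(((volume, hour), 1 if score > 0.5 else 0))
    return rows


def knn_predict(train, point, k=5):
    """Majority vote among the k training rows nearest to point (squared Euclidean distance)."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], point)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, rows, k=5):
    """Fraction of rows whose KNN prediction matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in rows)
    return correct / len(rows)


data = make_synthetic_flights(400, seed=42)
split = int(0.8 * len(data))  # assumed 80/20 train/test split
train_rows, test_rows = data[:split], data[split:]

train_acc = accuracy(train_rows, train_rows, k=5)
test_acc = accuracy(train_rows, test_rows, k=5)
print(f"train accuracy: {train_acc:.4f}")
print(f"test accuracy:  {test_acc:.4f}")
```

A small gap between the two printed values is the pattern the summary describes for KNN. A model whose training accuracy is much higher than its test accuracy, as reported for SVM and XGBoost, is likely overfitting.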