Analysis of Flight Delay Data Using Different Machine Learning Algorithms
Published in: | 2022 New Trends in Civil Aviation (NTCA), pp. 57-62 |
---|---|
Main Authors: | , , , , , |
Format: | Conference Proceeding |
Language: | English |
Published: | Czech Technical University in Prague, 26-10-2022 |
Summary: | Accurate prediction of flight arrivals remains a challenge because the operating environment is dynamic. On particularly challenging days, unforeseen peaks in flight volumes can stretch operational capacity and adversely affect the service levels provided. Anomaly detection is a growing field of study with a variety of approaches and applications. We performed exploratory data analysis to establish the factors that determine flight delays, using UK aviation operational data on flight volumes from 2015 to 2020. We then applied different machine learning and data analysis approaches to detect anomalies in the data. We adopted both supervised and unsupervised algorithms because of their strength in analyzing aviation operational data, which is usually unlabeled. We applied the K-means clustering algorithm, the K-Nearest Neighbors (KNN) algorithm, the support vector machine (SVM) and the XGBoost algorithm to the data to determine the most efficient model for predicting flight delays. Our results suggest that average delay times have generally decreased over the years in the UK. The XGBoost and SVM algorithms performed well on the training data but poorly on the test data. The KNN algorithm, on the other hand, gave an accuracy of 75.47% on the training data and 76.31% on the test data. These results suggest that the KNN algorithm generalizes better to new, unseen data and is a more suitable algorithm than SVM and XGBoost for predicting flight delay. |
---|---|
ISSN: | 2694-7854 |
DOI: | 10.23919/NTCA55899.2022.9934398 |
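
The summary compares models by their accuracy on training data versus held-out test data. The sketch below illustrates that kind of comparison for a K-Nearest Neighbors classifier. It is not the authors' code or data: the synthetic "flights", the two features (departure hour and hourly flight volume), the delay rule, and the helper names `make_synthetic_flights`, `knn_predict` and `accuracy` are all assumptions made purely for illustration.

```python
import random
from collections import Counter


def make_synthetic_flights(n, seed):
    """Generate hypothetical (features, label) rows; NOT the paper's data set."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        hour = rng.uniform(0, 24)      # assumed feature: scheduled departure hour
        volume = rng.uniform(0, 100)   # assumed feature: flights in the same hour
        # Assumed rule: busier, later slots are delayed more often, plus noise.
        score = 0.04 * volume + 0.1 * hour + rng.gauss(0, 1.0)
        label = int(score > 3.2)       # 1 = delayed, 0 = on time
        rows.append(((hour / 24, volume / 100), label))  # features scaled to [0, 1]
    return rows


def knn_predict(train, point, k=5):
    """Majority vote among the k training rows closest to `point`."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], point)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, rows, k=5):
    """Fraction of `rows` whose label matches the KNN prediction."""
    hits = sum(knn_predict(train, x, k) == y for x, y in rows)
    return hits / len(rows)


train = make_synthetic_flights(300, seed=0)
test = make_synthetic_flights(100, seed=1)

# Training accuracy is optimistic: each training point counts itself as a neighbor.
train_acc = accuracy(train, train, k=5)
test_acc = accuracy(train, test, k=5)
print(f"train accuracy: {train_acc:.2%}, test accuracy: {test_acc:.2%}")
```

A small gap between the two printed figures is the pattern the summary reports for KNN, while a model that scores well on training data but poorly on test data is overfitting. The numbers this sketch prints depend entirely on the invented data and say nothing about the paper's results.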