|dc.description.abstract||This dissertation develops machine-learning models to predict airline on-time performance using different types of variables. The second chapter details a methodology for improving model performance, in particular the prediction of the minority class, when working with an imbalanced dataset. The third chapter develops a machine-learning framework to classify whether a flight is on time or delayed, using factors/variables under the supervision of an airline company’s decision maker. Applying the methodology developed in the second chapter, the proposed framework obtains better results than those reported in the literature. The third chapter uses Federal Aviation Administration data on domestic flights across all US airports from January 2015 to December 2017. Finally, the fourth chapter studies the impact on delay prediction of weather, aircraft, airport, and scheduling variables that were not considered before. The model developed in the fourth chapter also gives an airline company’s decision maker answers to three questions: is the flight delayed; if so, what is the cause of the delay; and by how much time is it delayed? The study uses three types of classification, namely binary, multi-output, and multi-label classification, to answer these three questions, respectively. The dissertation has the following goals:
1. Develop a framework that achieves better performance on imbalanced datasets using different machine-learning algorithms.
2. Predict airline delays using different machine-learning algorithms, to help airline companies.
3. If a flight is delayed, predict the reasons for the delay and the impact of the delay in terms of time.||