The application of machine learning algorithms for flood susceptibility assessment for the state of Kansas
Abstract
Flooding has been a significant problem in the United States (US) over the past century. Since 1996, more than 1,500 flood events have been recorded in Kansas, resulting in billions in losses. This project explored the use of machine learning and publicly available data to assess the factors affecting flooding and to develop a flood susceptibility map for Kansas at multiple resolutions. It aimed to identify the major predictor variables, or flood-controlling factors, and to examine the response of stacked generalization across multiple resolutions and scenarios. Six machine learning (ML) algorithms, Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Adaptive Boosting (AdaBoost), and Extreme Gradient Boosting (XGBoost), were employed to determine the most important factors influencing the susceptibility of an area to flooding. The learning set for the ML algorithms comprised geospatial datasets of thirteen flood-controlling factors: rainfall, elevation, slope, aspect, flow direction, flow accumulation, Topographic Wetness Index (TWI), distance from the nearest stream, evapotranspiration, land cover, impervious surface, land surface temperature, and hydrologic soil type. A total of 1,528 non-flood inventories were created for two different scenarios, which differed only in the inclusion of stream buffers in the overall analysis. The ML algorithms were compared and used to estimate flood susceptibility for each location in the geodatabase, resulting in a flood susceptibility map for each scenario. Overall, testing results showed that the tree-based ensemble models, XGBoost and RF, performed well relative to the other models in both scenarios and across multiple resolutions, with accuracies ranging from 0.82 to 0.97.
In addition, variable importance analysis showed that predictor variables such as distance from streams, hydrologic soil type, rainfall, elevation, and impervious surface significantly affect flood prediction.
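The workflow described in the abstract (comparing several classifiers, combining them through stacked generalization, and ranking predictor variables) can be illustrated with a minimal scikit-learn sketch. This is not the project's implementation: the data below are synthetic rather than the Kansas geodatabase, the sample size, hyperparameters, and train/test split are arbitrary assumptions, and scikit-learn's GradientBoostingClassifier is swapped in for XGBoost so the example needs only one library. The factor names are taken from the abstract but are attached to synthetic columns, so the resulting importance ranking carries no hydrological meaning.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Names of the thirteen flood-controlling factors listed in the abstract.
# They only label synthetic columns here (assumption for illustration).
FACTORS = [
    "rainfall", "elevation", "slope", "aspect", "flow_direction",
    "flow_accumulation", "twi", "distance_to_stream", "evapotranspiration",
    "land_cover", "impervious_surface", "land_surface_temperature",
    "hydrologic_soil_type",
]

# Synthetic stand-in for the flood / non-flood inventory (1 = flood, 0 = non-flood).
X, y = make_classification(n_samples=600, n_features=len(FACTORS),
                           n_informative=6, n_redundant=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Six base learners; scale-sensitive models are wrapped in a scaling pipeline.
base_models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "GB": GradientBoostingClassifier(random_state=0),  # stand-in for XGBoost
}

# Fit each model and record its held-out accuracy.
accuracy = {}
for name, model in base_models.items():
    model.fit(X_train, y_train)
    accuracy[name] = model.score(X_test, y_test)

# Stacked generalization: a logistic-regression meta-learner is trained on
# cross-validated predictions of the base learners.
stack = StackingClassifier(estimators=list(base_models.items()),
                           final_estimator=LogisticRegression(max_iter=1000),
                           cv=5)
stack.fit(X_train, y_train)
accuracy["Stack"] = stack.score(X_test, y_test)

# Impurity-based variable importance from the random forest, highest first.
importance = sorted(zip(FACTORS, base_models["RF"].feature_importances_),
                    key=lambda pair: pair[1], reverse=True)

# Predicted flood probability per location, i.e. a susceptibility score.
susceptibility = stack.predict_proba(X_test)[:, 1]

for name, acc in accuracy.items():
    print(f"{name}: accuracy = {acc:.2f}")
for factor, score in importance[:5]:
    print(f"{factor}: importance = {score:.3f}")
```

With real data, each row would be a location in the geodatabase and the probability from `predict_proba` would be written back to a raster cell to build the susceptibility map; repeating the fit per resolution and per scenario would mirror the comparison described above.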