An Assessment of Negative Samples and Model Structures in Landslide Susceptibility Characterization Based on Bayesian Network Models

Abstract

Landslide susceptibility mapping (LSM) characterizes landslide potential, which is essential for assessing landslide risk and developing mitigation strategies. Despite significant progress in LSM research over the past two decades, several long-standing issues, such as uncertainties related to training samples and model selection, remain inadequately addressed in the literature. In this study, we used a physically based susceptibility model, PISA-m, to generate four different non-landslide data scenarios and combined them with mapped landslides from Magoffin County, Kentucky, for model training. We used two Bayesian network model structures, Naïve Bayes (NB) and Tree-Augmented Naïve Bayes (TAN), to produce LSMs based on regional geomorphic conditions. After internal validation, we evaluated the robustness and reliability of the models using an independent landslide inventory from Owsley County, Kentucky. The results revealed a considerable gap between the performance of the most effective model in internal validation (AUC = 0.969), which used non-landslide samples extracted exclusively from low susceptibility areas predicted by PISA-m, and the models’ unsatisfactory performance in external validation, where only 79.1% of landslide initiation points were identified as high susceptibility areas. Taken together, the internal and external validation results highlight a potential overfitting problem that has largely been overlooked in previous studies. Our findings also indicate that TAN models consistently outperformed NB models trained on the same datasets, because TAN can account for dependencies among variables.
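The two validation measures mentioned above, AUC for internal validation and the share of independent landslide points falling in high susceptibility areas for external validation, can be illustrated with a minimal sketch. The function names, the 0.5 threshold, and all scores below are hypothetical and chosen only for illustration; they are not the study's data, thresholds, or implementation.

```python
from typing import Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Counts the fraction of (landslide, non-landslide) pairs in which the
    landslide sample receives the higher score; ties count as one half.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def high_susceptibility_rate(scores: Sequence[float], threshold: float) -> float:
    """Fraction of landslide points scored at or above the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


# Made-up susceptibility scores (assumption: values in [0, 1]).
internal_landslide = [0.9, 0.8, 0.95, 0.7]
internal_non_landslide = [0.1, 0.2, 0.05, 0.3]   # e.g. drawn only from "easy" low-susceptibility areas
external_landslide = [0.9, 0.4, 0.8, 0.3, 0.7]   # independent inventory

internal_auc = auc(internal_landslide, internal_non_landslide)
external_rate = high_susceptibility_rate(external_landslide, threshold=0.5)

print(f"internal AUC: {internal_auc:.3f}")
print(f"external points in high susceptibility class: {external_rate:.1%}")
```

In this toy setting the non-landslide samples are trivially separable from the landslide samples, so the internal AUC is perfect even though a sizeable share of the independent landslide points falls below the threshold. This is the kind of discrepancy that may signal overfitting to the chosen negative samples.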

Publication
Remote Sensing