Landslide Susceptibility Model

Landslides are the third biggest natural disaster in the world, and India experiences the largest share of them. About 15 percent of the country is prone to landslides, and India has the highest number of landslide deaths in the world. Landslides cause severe damage to lives and infrastructure, block natural drainage routes, and affect the economic and social conditions of local communities.

A landslide is a rare, short-lived, critical, and sudden geological phenomenon. Moreover, it is a regional rather than a site-specific hazard, depending on the regional environment and on geological and topographical conditions. It is a complex phenomenon caused by several factors, including slope geometry, soil quality, soil moisture, rainfall, aspect, the vegetation index, construction in the region, load on the surface area, and distance from roads and rivers. The influence of these factors may vary across regions. This makes landslide susceptibility assessment difficult, because large amounts of diverse spatial data must be acquired and considered in the analysis.

Landslide susceptibility assessment methods can be classified into qualitative (knowledge-driven) and quantitative (data-driven).

Generally, knowledge-driven approaches are based entirely on the judgment of the experts who conduct the susceptibility assessment. They are seldom used for susceptibility assessment over large areas because they lack a concrete physical concept of slope failure.

Data-driven methods evaluate the statistical relationships between the locations of past landslides and landslide-inducing factors, and then make quantitative predictions for landslide-free areas with similar conditions. The underlying working hypothesis is that future landslides will have distribution patterns similar to those of past landslides. These methods are very useful for mapping hazards over large areas, which in turn supports more effective warning systems.
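As a small illustration of such a statistical relationship, the sketch below computes a frequency ratio per factor class. The frequency ratio is one common data-driven statistic and is shown here only as an example; it is not the model described later on this page. The function name, the slope classes, and the cell counts are all made-up assumptions.

```python
def frequency_ratio(landslide_cells, total_cells):
    """Frequency ratio per factor class.

    FR = (share of landslide cells in the class) / (share of all cells in the class).
    FR > 1 means landslides are over-represented in that class.
    """
    total_landslides = sum(landslide_cells.values())
    total_area = sum(total_cells.values())
    ratios = {}
    for cls, n_cells in total_cells.items():
        landslide_share = landslide_cells.get(cls, 0) / total_landslides
        area_share = n_cells / total_area
        ratios[cls] = landslide_share / area_share
    return ratios


# Hypothetical counts of raster cells per slope class (illustrative only).
total_cells = {"0-15 deg": 6000, "15-30 deg": 3000, "30+ deg": 1000}
landslide_cells = {"0-15 deg": 10, "15-30 deg": 40, "30+ deg": 50}

fr = frequency_ratio(landslide_cells, total_cells)
for cls, value in fr.items():
    print(f"{cls}: FR = {value:.2f}")
```

In this toy data the steepest class covers a tenth of the area but holds half of the landslide cells, so its ratio is well above 1, which is the kind of pattern a data-driven method picks up.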

Landslide inventory maps are now available in the public domain, with location information, failure types, geometries, dates of occurrence, triggering factors, possible failure mechanisms, and damage caused. As a result, data-driven methods have become more popular than knowledge-driven approaches. Any data-driven landslide susceptibility map, regardless of the method used to construct it, can be validated against these inventories.

Before geospatial technologies such as remote sensing and geographic information systems (GIS) became available, the main difficulty in using data-driven methods was collecting data on landslide distribution and factor maps over large areas.

A typical landslide analysis requires collecting, storing, and analyzing large amounts of data, all of which can be handled well in a GIS environment. In particular, the ability of GIS to present data and analysis results as maps plays a key role in identifying critical areas. This helps geologists, planners, and local governments to be better prepared for extreme or unanticipated events.

Our Approach

Preparing an inventory map is a significant part of landslide susceptibility assessment. The landslide inventory map helps us understand the relationship between the distribution of historical landslides and the selected conditioning factors. The locations of past landslides are obtained from the Global Landslide Catalog and the Geological Survey of India's landslide inventory. Because the performance of machine learning methods depends on data quality, data preprocessing makes up a large share of the work (a common rule of thumb puts it at 80% or more). We filtered out small landslides based on their affected areas. For landslide susceptibility modeling, landslide inventories can be represented as polygons or as points that mark the location of each landslide occurrence.
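The area-based filtering step could look like the following minimal sketch. The record fields, the sample coordinates, and the 1000 m2 threshold are hypothetical; the actual threshold and inventory schema used for the model are not stated here.

```python
from dataclasses import dataclass


@dataclass
class LandslideRecord:
    """One inventory entry (field names are assumptions for illustration)."""
    latitude: float
    longitude: float
    area_m2: float


# Hypothetical cut-off: drop landslides smaller than this affected area.
MIN_AREA_M2 = 1000.0


def filter_small_landslides(records, min_area_m2=MIN_AREA_M2):
    """Keep only records whose affected area meets the threshold."""
    return [r for r in records if r.area_m2 >= min_area_m2]


# Made-up sample records, not taken from any real inventory.
inventory = [
    LandslideRecord(30.45, 78.07, 250.0),
    LandslideRecord(30.52, 78.11, 4200.0),
    LandslideRecord(27.33, 88.61, 1000.0),
]

kept = filter_small_landslides(inventory)
print(f"kept {len(kept)} of {len(inventory)} records")
```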

Seven landslide predisposing factors were considered:

  • Distance from structural lineaments
  • Forest loss
  • Distance from roads
  • Human modification index
  • Slope
  • Tree canopy height
  • Clay content

For each landslide location point in the inventory, the predisposing factors at that location are computed using GIS tools.
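To make this step concrete, here is a rough sketch of how one factor, distance from roads, might be computed for a single point. It uses the haversine great-circle formula and measures the distance to the nearest road vertex, which only approximates the distance to the road line itself; a GIS would normally do this with proper geometry operations. The function names and coordinates are illustrative assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def distance_to_nearest_vertex_m(point, vertices):
    """Distance from a point to the closest vertex in a list of (lat, lon)."""
    lat, lon = point
    return min(haversine_m(lat, lon, vlat, vlon) for vlat, vlon in vertices)


# Made-up road vertices and landslide point (illustrative only).
road_vertices = [(30.0, 78.0), (30.0, 78.01), (30.0, 78.02)]
landslide_point = (30.001, 78.01)

distance_m = distance_to_nearest_vertex_m(landslide_point, road_vertices)
print(f"distance from road: {distance_m:.1f} m")
```

Repeating this kind of lookup for every factor gives one feature vector per inventory point, which is what the classifier is trained on.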

Based on these data, the predictive model, or classifier, evaluates the combined relationship between the landslide distribution and a set of independent variables (the landslide predisposing factors) and estimates how likely a new location is to be landslide-susceptible.

We experimented with multiple predictive models. Random forest yielded better results than the others when validated against the landslide inventory. Random forest is a classification algorithm that builds multiple decision trees and merges their predictions to get a more accurate and stable result. The model's results are projected onto a geographical map, which is then colored according to each area’s susceptibility. These maps make it easy to identify landslide-susceptible regions visually. Because the maps are geo-referenced, location-based analytics, computations, and warning systems can be integrated into various platforms such as mobile devices, laptops, and the web.
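The idea of building several classifiers and merging their votes can be sketched in plain Python as below. This is a deliberately simplified stand-in, not the published model: it bags one-split decision stumps, whereas a real random forest grows full decision trees and also samples features at each split. The two features, the training values, the class cut-offs, and the color palette are all hypothetical.

```python
import random


def fit_stump(X, y):
    """Find the single-feature threshold split with the fewest training errors."""
    best = None  # (errors, feature index, threshold, label predicted above it)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for label_above in (0, 1):
                errors = sum(
                    (label_above if row[f] > t else 1 - label_above) != target
                    for row, target in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, label_above)
    return best[1:]


def predict_stump(stump, row):
    f, t, label_above = stump
    return label_above if row[f] > t else 1 - label_above


def fit_ensemble(X, y, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def susceptibility(stumps, row):
    """Fraction of stumps that vote 'landslide' (label 1)."""
    return sum(predict_stump(s, row) for s in stumps) / len(stumps)


def susceptibility_class(score):
    """Bucket a score into a map class (cut-offs are arbitrary assumptions)."""
    if score < 0.33:
        return "low"
    if score < 0.66:
        return "moderate"
    return "high"


# Hypothetical display colors for the susceptibility map.
CLASS_COLORS = {"low": "#1a9850", "moderate": "#fee08b", "high": "#d73027"}

# Made-up training data: [slope in degrees, distance from road in metres].
X = [[5, 900], [8, 1200], [12, 700], [10, 1500],
     [35, 100], [40, 50], [30, 200], [38, 150]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = landslide, 0 = no landslide

model = fit_ensemble(X, y)
steep_score = susceptibility(model, [37, 120])
flat_score = susceptibility(model, [3, 1000])
for name, score in (("steep site", steep_score), ("flat site", flat_score)):
    cls = susceptibility_class(score)
    print(name, round(score, 2), cls, CLASS_COLORS[cls])
```

Averaging votes over many bootstrap-trained classifiers is what gives the ensemble its stability: a single unlucky sample affects only one vote out of many.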

Model Source Code

The implementation of the model is available under the GPL v3 license and is published on Google Earth Engine. Feel free to access and modify the code. Write to us at csl@gpsrenewables.com.