Data inspection

Before diving into the machine learning, we first take a look at our data. In addition to inspecting the tabular data frames, it is useful to visualize the data, especially in remote sensing applications.

You can plot any data loaded into an xarray Dataset using xarray's plotting routines. For example, the satellite data can be viewed as follows:
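
The plotting call itself is not preserved in this text. As a minimal sketch, assuming a Dataset with a two-dimensional band variable, the call could look like the following. The dataset `demo_data`, its shape, and its values are synthetic stand-ins, not the case-study data; only the band name `B12` is taken from the code further below.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the satellite data (hypothetical shape and values)
rng = np.random.default_rng(0)
demo_data = xr.Dataset(
    {"B12": (("y", "x"), rng.random((50, 70)))},
    coords={"y": np.arange(50), "x": np.arange(70)},
)

# For a 2D variable, .plot() draws a pcolormesh and returns a QuadMesh.
# robust=True limits the color range to the 2nd-98th percentiles,
# which keeps a few extreme pixels from washing out the image.
mesh = demo_data.B12.plot(robust=True)
```

With the real data, the same pattern would presumably be applied to one band of `satellite_data` at a time.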

[Output: QuadMesh plot of the satellite data]

For more flexibility, for example to create band composites, you can use the matplotlib library directly:

import numpy as np
import matplotlib.pyplot as plt

# Helper function to normalize an array to the range 0-1.
# Values are first clipped at the lower and upper percentiles (1% and 99% by default).
def normArray(x, lp=1, up=99):
    x = np.clip(x, np.nanpercentile(x, lp), np.nanpercentile(x, up))
    x = (x - np.nanmin(x)) / (np.nanmax(x) - np.nanmin(x))
    return x

# Normalize the three bands used for the false color composite
red = normArray(satellite_data.B12)
green = normArray(satellite_data.B9)
blue = normArray(satellite_data.B6)

# Crop an example region (rows 1000-1500, columns 0-700)
# and stack the bands into a (rows, columns, 3) image
rows, cols = slice(1000, 1500), slice(0, 700)
rgb = np.dstack([red[rows, cols], green[rows, cols], blue[rows, cols]])

fig, ax = plt.subplots(1, 1, figsize=(20, 12))
im = ax.imshow(rgb)
ax.set_title("Overview of exemplary region (False Color Composite) after flood event")
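
To see why the helper clips at percentiles before rescaling, the following sketch compares it with a plain min-max normalization on synthetic data. The function `norm_array` is a hypothetical re-implementation of the helper above so that the example is self-contained, and the band values are made up for illustration.

```python
import numpy as np

def norm_array(x, lp=1, up=99):
    """Clip to the lp-th/up-th percentiles, then rescale to the range 0-1."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.nanpercentile(x, [lp, up])
    x = np.clip(x, lo, hi)
    return (x - lo) / (hi - lo)

# Synthetic band (hypothetical values): smooth reflectances plus one extreme outlier
band = np.linspace(0.0, 0.3, 10_000).reshape(100, 100)
band[0, 0] = 50.0

# Plain min-max scaling lets the single outlier compress everything else towards 0
naive = (band - band.min()) / (band.max() - band.min())
# Percentile clipping keeps the bulk of the values spread over the full range
robust = norm_array(band)

print("median, plain min-max:  ", np.median(naive))
print("median, percentile clip:", np.median(robust))
```

With plain min-max scaling, the single bright pixel pushes almost all other values close to zero, so the image would appear nearly black. Clipping first keeps the contrast in the bulk of the data, which is what matters for a composite.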


Visualize all data that is provided with this case study. Get an overview of the whole study domain and take a look at the training and testing regions. Try to find a visualization that helps to spot flooded areas.