GFM data discovery and download

EODC catalogs several datasets using the STAC (SpatioTemporal Asset Catalog) specification. Through a STAC API endpoint, users can search these datasets by space, time, and further filter criteria that depend on the individual dataset.

In this notebook, we demonstrate how to query the Global Flood Monitoring (GFM) STAC collection using the Python library pystac_client and how to download the data, either with built-in Python libraries or with the command line tool stac-asset.

You can install pystac_client via pip:

pip install pystac_client

Each STAC item links to its assets (the actual files). These links are used to download the files to a specified folder on your machine. To get started, connect to the EODC STAC API:

from pystac_client import Client

# EODC STAC API URL
api_url = "https://stac.eodc.eu/api/v1"

eodc_catalog = Client.open(api_url)

Searching

We can use the STAC API to find items that match specific criteria. This may include the date and time the item covers, its spatial extent, or any other property saved in the item’s metadata.

In this example, we search for GFM data that covers our area of interest in South Pakistan on 15 and 16 September 2022.

The area of interest can be specified as a bounding box using the Python library shapely or, alternatively, as a GeoJSON object.

The time range can be specified as a tuple of datetime objects or simply as a string.

from datetime import datetime
from shapely.geometry import box

# STAC collection ID
collection_id = "GFM"

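# Time range, given either as a tuple of datetime objects ...
time_range = (datetime(2022, 9, 15, 0, 0, 0), datetime(2022, 9, 16, 23, 59, 59))
# ... or as an interval string (this assignment replaces the tuple above)
time_range = '2022-09-15/2022-09-16'

# Area of interest (South Pakistan), given either as a shapely bounding box ...
aoi = box(63.0, 24.0, 73.0, 27.0)

# ... or as a GeoJSON geometry (this assignment replaces the box above)
aoi = {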
    "type" : "Polygon",
    "coordinates": [
        [
            [73.0, 24.0],
            [73.0, 27.0],
            [63.0, 27.0],
            [63.0, 24.0],
            [73.0, 24.0],
        ]
    ],
}

search = eodc_catalog.search(
    max_items=1000,
    collections=collection_id,
    intersects=aoi,
    datetime=time_range
)

items_eodc = search.item_collection()
print(f"On EODC we found {len(items_eodc)} items for the given search query")
On EODC we found 21 items for the given search query
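
If shapely is not available, the same search inputs can be built with the standard library alone. The following is a minimal sketch, not part of the original notebook: the helper names bbox_to_polygon and to_interval are hypothetical, and the timestamps are assumed to be in UTC.

```python
from datetime import datetime


def bbox_to_polygon(min_x, min_y, max_x, max_y):
    """Return a GeoJSON Polygon dict covering the bounding box (hypothetical helper)."""
    ring = [
        [min_x, min_y],
        [max_x, min_y],
        [max_x, max_y],
        [min_x, max_y],
        [min_x, min_y],  # GeoJSON rings must end where they start
    ]
    return {"type": "Polygon", "coordinates": [ring]}


def to_interval(start, end):
    """Format two datetime objects as a 'start/end' string (UTC is assumed)."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return f"{start.strftime(fmt)}/{end.strftime(fmt)}"


aoi_polygon = bbox_to_polygon(63.0, 24.0, 73.0, 27.0)
interval = to_interval(datetime(2022, 9, 15), datetime(2022, 9, 16, 23, 59, 59))

print(aoi_polygon["coordinates"][0])
print(interval)
```

The resulting dictionary and string could then be passed as the intersects and datetime arguments of the search call shown above.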

Some information about the found STAC items

We can print more information about an item, such as the available assets and their descriptions.

import rich.table
from rich.console import Console

console = Console()

first_item = items_eodc[0]

table = rich.table.Table(title="Assets in STAC Item")
table.add_column("Asset Key", style="cyan", no_wrap=True)
table.add_column("Description")
for asset_key, asset in first_item.assets.items():
    table.add_row(
        asset.title or asset_key,  # fall back to the key if no title is set
        asset.description or "",
    )

console.print(table)
                                                Assets in STAC Item                                                
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Asset Key                        Description                                                                   ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ TileJSON with default rendering │                                                                               │
│ Rendered preview                │ Rendered preview image of GFM Observed Flood Extent (Final output - Ensemble  │
│                                 │ algorithm)                                                                    │
│ advisory_flags                  │ GFM Advisory Flags                                                            │
│ dlr_likelihood                  │ GFM Likelihood (Intermediate output - DLR algorithm)                          │
│ exclusion_mask                  │ GFM Exclusion Mask (Final output - Ensemble algorithm)                        │
│ tuw_likelihood                  │ GFM Uncertainties (Intermediate output - TUW algorithm)                       │
│ list_likelihood                 │ GFM Likelihood (Intermediate output - LIST algorithm)                         │
│ dlr_flood_extent                │ GFM Observed Flood Extent (Intermediate output - DLR algorithm)               │
│ tuw_flood_extent                │ GFM Observed Flood Extent (Intermediate output - TUW algorithm)               │
│ list_flood_extent               │ GFM Observed Flood Extent (Intermediate output - LIST algorithm)              │
│ ensemble_likelihood             │ GFM Likelihood (Final output - Ensemble algorithm)                            │
│ reference_water_mask            │ GFM Reference Water Mask                                                      │
│ ensemble_flood_extent           │ GFM Observed Flood Extent (Final output - Ensemble algorithm)                 │
│ ensemble_water_extent           │ GFM Observed Water Extent (Final output - Ensemble algorithm)                 │
└─────────────────────────────────┴───────────────────────────────────────────────────────────────────────────────┘
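
To decide which assets are worth downloading, it can help to filter them by role. The sketch below works on a plain dictionary shaped like the "assets" mapping of a STAC item (for a pystac item, item.to_dict()["assets"] has this shape). The sample mapping, its URLs, and its roles are made up for illustration and are not taken from the GFM collection.

```python
def data_asset_keys(assets):
    """Return the keys of all assets whose roles include 'data'."""
    return [
        key
        for key, asset in assets.items()
        if "data" in (asset.get("roles") or [])
    ]


# Hypothetical, abbreviated assets mapping for illustration only
sample_assets = {
    "thumbnail": {"href": "https://example.com/preview.png", "roles": ["thumbnail"]},
    "ensemble_flood_extent": {"href": "https://example.com/flood.tif", "roles": ["data"]},
    "tuw_flood_extent": {"href": "https://example.com/tuw.tif", "roles": ["data"]},
    "no_roles": {"href": "https://example.com/other.json"},
}

print(data_asset_keys(sample_assets))
```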

Download data with Python

You can download the desired assets by specifying their respective asset keys in a list object. Then, iterate over all found items and specified asset keys to download the data to a local directory. The HTTP link saved in the asset references the actual file, which is downloaded using the Python library urllib.

import os
import urllib.request

# specify output directory
download_root_path = "./downloaded_data/"

# specify asset names to download
asset_names = ["ensemble_flood_extent", "tuw_flood_extent"]

for item in items_eodc[:2]:
    download_path = os.path.join(download_root_path, item.collection_id, item.id)
    
    os.makedirs(download_path, exist_ok=True)
    
    for asset_name in asset_names:
        asset = item.assets[asset_name]
        if "data" in asset.roles:
            fpath = os.path.join(download_path, os.path.basename(asset.href))
            print(f"Downloading {fpath}")
            urllib.request.urlretrieve(asset.href, fpath)

print("Download done!")
Downloading ./downloaded_data/GFM/ENSEMBLE_FLOOD_20220916T013450_VV_AS020M_E015N027T3/ENSEMBLE_FLOOD_20220916T013450_VV_AS020M_E015N027T3.tif
Downloading ./downloaded_data/GFM/ENSEMBLE_FLOOD_20220916T013450_VV_AS020M_E015N027T3/TUW_FLOOD_20220916T013450_VV_AS020M_E015N027T3.tif
Downloading ./downloaded_data/GFM/ENSEMBLE_FLOOD_20220916T013450_VV_AS020M_E012N027T3/ENSEMBLE_FLOOD_20220916T013450_VV_AS020M_E012N027T3.tif
Downloading ./downloaded_data/GFM/ENSEMBLE_FLOOD_20220916T013450_VV_AS020M_E012N027T3/TUW_FLOOD_20220916T013450_VV_AS020M_E012N027T3.tif
Download done!
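
For larger downloads, a slightly more defensive variant of the loop above may be useful. The following sketch is an assumption about what such a helper could look like, not part of the original notebook: it skips files that already exist, writes to a temporary ".part" file so that interrupted downloads do not leave incomplete files behind, and derives the file name from the URL path only, so a query string in the link would not end up in the file name. The function names are hypothetical.

```python
import os
import shutil
import urllib.parse
import urllib.request


def filename_from_href(href):
    """Derive a file name from a URL, ignoring any query string or fragment."""
    return os.path.basename(urllib.parse.urlparse(href).path)


def download(href, target_dir, overwrite=False):
    """Stream href into target_dir and return the local file path."""
    os.makedirs(target_dir, exist_ok=True)
    fpath = os.path.join(target_dir, filename_from_href(href))
    if os.path.exists(fpath) and not overwrite:
        return fpath  # already downloaded, nothing to do
    tmp_path = fpath + ".part"
    with urllib.request.urlopen(href) as response, open(tmp_path, "wb") as out:
        shutil.copyfileobj(response, out)  # copy in chunks, not all at once
    os.replace(tmp_path, fpath)  # expose the file only once it is complete
    return fpath


# Example URL is made up; it only demonstrates the file name handling
print(filename_from_href("https://example.com/data/flood.tif?token=abc"))
```

Inside the loop above, urllib.request.urlretrieve(asset.href, fpath) could then be replaced by download(asset.href, download_path).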

Download data with stac-asset CLI

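The command line tool stac-asset provides another way to download assets. In the commands below, it is combined with stac-client, the command line interface that ships with pystac_client, which is used to query the STAC API.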

You can install stac-asset via pip:

pip install 'stac-asset[cli]'

The commands below expect the same input parameters as described above:

  • STAC API URL: https://stac.eodc.eu/api/v1
  • Collection ID: GFM
  • Bounding box: 63.0, 24.0, 73.0, 27.0 (minX, minY, maxX, maxY)
  • Time range: 2022-09-15/2022-09-16
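
To make the mapping from these parameters to the command line explicit, the sketch below assembles the search command as a string. The helper build_search_command is hypothetical, and it only uses flags that appear in the commands of this notebook (-c, --bbox, --datetime, --matched).

```python
import shlex


def build_search_command(api_url, collection, bbox, datetime_range, extra=()):
    """Assemble a stac-client search command line from the search parameters."""
    args = ["stac-client", "search", api_url, "-c", collection, "--bbox"]
    args += [f"{value:g}" for value in bbox]  # 63.0 is written as 63
    args += ["--datetime", datetime_range, *extra]
    return shlex.join(args)  # quotes arguments where the shell requires it


command = build_search_command(
    "https://stac.eodc.eu/api/v1",
    "GFM",
    (63.0, 24.0, 73.0, 27.0),
    "2022-09-15/2022-09-16",
    extra=["--matched"],
)
print(command)
```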

List the number of matched STAC items

!stac-client search https://stac.eodc.eu/api/v1 -c GFM --bbox 63 24 73 27 --datetime 2022-09-15/2022-09-16 --matched
21 items matched

Save matched STAC items into a JSON file (items.json)

!stac-client search https://stac.eodc.eu/api/v1 -c GFM --bbox 63 24 73 27 --datetime 2022-09-15/2022-09-16 --save items.json
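
Before downloading, the saved file can be inspected with the standard library. The sketch below assumes that items.json holds a GeoJSON FeatureCollection whose "features" are STAC items; that layout is an assumption about the saved file and should be checked against the actual output. The helper list_asset_hrefs is hypothetical.

```python
import json


def list_asset_hrefs(items_path, asset_key):
    """Map item IDs to the href of the given asset, skipping items without it."""
    with open(items_path, encoding="utf-8") as f:
        collection = json.load(f)
    hrefs = {}
    for feature in collection.get("features", []):
        asset = feature.get("assets", {}).get(asset_key)
        if asset is not None:
            hrefs[feature["id"]] = asset["href"]
    return hrefs


# Usage, once items.json exists:
# for item_id, href in list_asset_hrefs("items.json", "ensemble_flood_extent").items():
#     print(item_id, href)
```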

Download a specified asset of found STAC items into a given directory

!mkdir -p ./stac_asset_download
!stac-asset download -i ensemble_flood_extent items.json ./stac_asset_download -q

Pipe query results directly into the stac-asset download command

!mkdir -p ./stac_asset_download
!cd stac_asset_download; stac-client search https://stac.eodc.eu/api/v1 -c GFM --bbox 63 24 73 27 --datetime 2022-09-15/2022-09-16 | stac-asset download -i ensemble_flood_extent -q