GFM data discovery and download

EODC catalogs several datasets using the STAC (SpatioTemporal Asset Catalog) specification. Through a STAC API endpoint, users can search our datasets by space, time and further filter criteria that depend on the individual dataset.

In this notebook, we demonstrate how to query the Global Flood Monitoring (GFM) STAC collection using the Python library pystac_client and how to download the data, either with built-in Python libraries or with the command line tool stac-asset.

You can install pystac_client via pip:

pip install pystac_client

Each STAC item links to its assets (= files). These links are used to download the files to a specified folder on your machine.

from datetime import datetime
from pystac_client import Client

# EODC STAC API URL
api_url = "https://stac.eodc.eu/api/v1"

eodc_catalog = Client.open(api_url)

Gridding of the GFM data sets

The GFM service processes all observations from the Sentinel-1A/B (soon including Sentinel-1C) satellites that are acquired over land in Interferometric Wide-swath mode and Ground Range Detected at High resolution (Sentinel-1 IW GRDH).

The GFM service uses the Equi7Grid, which employs the equidistant azimuthal projection and divides the Earth's surface into seven continental zones. The Equi7Grid with a 20 m pixel spacing and a 300 km gridding (T3 level) serves as an efficient working grid for all steps in the data processing workflow. Consequently, all input datasets, including auxiliary datasets from external sources, must be re-projected to the Equi7Grid beforehand.

The spatial extent of a Sentinel-1 scene is too large to be represented on a single Equi7Grid tile (= file). Therefore, a search query usually returns multiple items, even for the same timestamp of a single Sentinel-1 observation.
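To make the grid numbers above concrete, here is a minimal sketch of the arithmetic only. It assumes square tiles aligned to a grid origin at 0 and an axis-aligned footprint, which simplifies the real Equi7Grid tiling. The function name `tiles_spanned` and the example coordinates are hypothetical.

```python
import math

TILE_SIZE_M = 300_000   # T3 tile edge length (300 km), taken from the text above
PIXEL_SPACING_M = 20    # pixel spacing (20 m), taken from the text above

# Pixels along one tile edge: 300 000 m / 20 m
pixels_per_edge = TILE_SIZE_M // PIXEL_SPACING_M


def tiles_spanned(min_coord_m: float, max_coord_m: float,
                  tile_size_m: int = TILE_SIZE_M) -> int:
    """Number of tiles an interval touches along one axis.

    Simplifying assumption: tile borders lie at multiples of tile_size_m.
    """
    first = math.floor(min_coord_m / tile_size_m)
    last = math.floor(max_coord_m / tile_size_m)
    return last - first + 1


# Hypothetical footprint: 250 km wide, starting 200 km into a tile
n_tiles = tiles_spanned(200_000, 450_000)
print(pixels_per_edge, n_tiles)
```

Under these assumptions, even a footprint narrower than one tile can straddle a tile border, which illustrates why one observation can map to several STAC items.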

Searching

We can use the STAC API to find items that match specific criteria. This may include the date and time the item covers, its spatial extent, or any other property saved in the item’s metadata.

If a specific Sentinel-1 scene is of interest, it is also possible to directly use the sensing date in the search query.

Search with AOI and time range

In this example we are searching for GFM data which cover our area of interest over South Pakistan in September 2022.

The area of interest can be specified as a bounding box using the Python library shapely or, alternatively, as a GeoJSON object.

The time range can be specified as a tuple of datetime objects or simply as a string.

from shapely.geometry import box

# STAC collection ID
collection_id = "GFM"

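# Time range, option 1: tuple of datetime objects
time_range = (datetime(2022, 9, 15, 0, 0, 0), datetime(2022, 9, 16, 23, 59, 59))
# Time range, option 2: date interval string (overrides the tuple above)
time_range = '2022-09-15/2022-09-16'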

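# Area of interest (South Pakistan), option 1: shapely bounding box
aoi = box(63.0, 24.0, 73.0, 27.0)

# Area of interest, option 2: GeoJSON polygon (overrides the shapely box above)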
aoi = {
    "type" : "Polygon",
    "coordinates": [
        [
            [73.0, 24.0],
            [73.0, 27.0],
            [63.0, 27.0],
            [63.0, 24.0],
            [73.0, 24.0],
        ]
    ],
}

search = eodc_catalog.search(
    max_items=1000,
    collections=collection_id,
    intersects=aoi,
    datetime=time_range
)

items_eodc = search.item_collection()
print(f"On EODC we found {len(items_eodc)} items for the given search query")
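If shapely is not available, the two search inputs can also be built with the standard library alone. The following is a minimal sketch: the helper names `bbox_to_geojson` and `to_interval` are hypothetical, and the interval helper assumes that naive datetime objects represent UTC.

```python
from datetime import datetime


def bbox_to_geojson(min_x: float, min_y: float, max_x: float, max_y: float) -> dict:
    """Build a GeoJSON polygon with one closed ring from bounding-box corners."""
    return {
        "type": "Polygon",
        "coordinates": [[
            [max_x, min_y],
            [max_x, max_y],
            [min_x, max_y],
            [min_x, min_y],
            [max_x, min_y],  # first vertex repeated to close the ring
        ]],
    }


def to_interval(start: datetime, end: datetime) -> str:
    """Format two naive datetimes (assumed to be UTC) as 'start/end'."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return f"{start.strftime(fmt)}/{end.strftime(fmt)}"


aoi_geojson = bbox_to_geojson(63.0, 24.0, 73.0, 27.0)
interval = to_interval(datetime(2022, 9, 15), datetime(2022, 9, 16, 23, 59, 59))
print(interval)
```

The resulting dictionary has the same vertices as the GeoJSON polygon used in the search above, and the interval string could be passed as the `datetime` argument.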

Search with Sentinel-1 scene identifier

In this example we are using a single Sentinel-1 scene identifier to retrieve the respective STAC items. Either use the following simple method to derive the sensing date from the Sentinel-1 scene identifier or directly use the exact datetime in the query.

# Method to derive the sensing date from a Sentinel-1 scene identifier
def get_sensing_date(scene: str) -> datetime:
    # The fifth underscore-separated field holds the sensing start time (YYYYMMDDTHHMMSS)
    parts = scene.split("_")
    return datetime.strptime(parts[4], "%Y%m%dT%H%M%S")

# Define the Sentinel-1 scene identifier
scene_id = "S1A_IW_GRDH_1SDV_20220930T224602_20220930T224627_045240_056863"

api_url = "https://stac.eodc.eu/api/v1"
eodc_catalog = Client.open(api_url)

search = eodc_catalog.search(
    collections=["GFM"],
    datetime=get_sensing_date(scene_id),
)

items_eodc = search.item_collection()
print(f"On EODC we found {len(items_eodc)} items for the given search query")
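The scene identifier carries more than the sensing start time. As a minimal sketch, the hypothetical helper `parse_scene_id` below splits the identifier into a few named fields, following the Sentinel-1 product naming convention, and attaches UTC to both timestamps. The dictionary keys are chosen for illustration only.

```python
from datetime import datetime, timezone

TIME_FORMAT = "%Y%m%dT%H%M%S"


def parse_scene_id(scene: str) -> dict:
    """Split a Sentinel-1 scene identifier into a few named fields."""
    parts = scene.split("_")
    return {
        "platform": parts[0],   # e.g. S1A
        "mode": parts[1],       # e.g. IW
        "product": parts[2],    # e.g. GRDH
        # Sensing start and stop times, made timezone-aware (UTC)
        "start": datetime.strptime(parts[4], TIME_FORMAT).replace(tzinfo=timezone.utc),
        "stop": datetime.strptime(parts[5], TIME_FORMAT).replace(tzinfo=timezone.utc),
    }


info = parse_scene_id("S1A_IW_GRDH_1SDV_20220930T224602_20220930T224627_045240_056863")
sensing_window = (info["start"], info["stop"])
print(info["platform"], info["start"].isoformat(), info["stop"].isoformat())
```

A tuple such as `sensing_window` could be used as the `datetime` argument of the search instead of a single timestamp, should an exact match on the start time prove too strict.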

Some information about the found STAC items

We can print more information about an item, such as the available assets and their descriptions.

import rich.table
from rich.console import Console

console = Console()

first_item = items_eodc[0]

table = rich.table.Table(title="Assets in STAC Item")
table.add_column("Asset Key", style="cyan", no_wrap=True)
table.add_column("Description")
for asset_key, asset in first_item.assets.items():
    # Show the asset key (needed for the download step) next to its description
    table.add_row(
        asset_key,
        asset.description or "")

console.print(table)
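To pick asset keys programmatically rather than reading them off the table, the assets can be filtered by role. The sketch below works on the plain dictionary form of an item (as returned, for example, by `item.to_dict()` in pystac). The helper name `asset_keys_with_role` and the example item with its placeholder URLs are hypothetical and not taken from the GFM collection.

```python
def asset_keys_with_role(item: dict, role: str = "data") -> list:
    """Return the keys of all assets whose 'roles' list contains `role`."""
    return [
        key
        for key, asset in item.get("assets", {}).items()
        if role in (asset.get("roles") or [])  # tolerate assets without roles
    ]


# Hypothetical item dictionary with placeholder hrefs
example_item = {
    "id": "example-item",
    "assets": {
        "ensemble_flood_extent": {"href": "https://example.com/flood.tif", "roles": ["data"]},
        "thumbnail": {"href": "https://example.com/thumb.png", "roles": ["thumbnail"]},
        "no_roles": {"href": "https://example.com/other.bin"},
    },
}

data_keys = asset_keys_with_role(example_item)
print(data_keys)
```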

Download data with Python

You can download the desired assets by specifying their respective asset keys in a list object. Then, iterate over all found items and specified asset keys to download the data to a local directory. The HTTP link saved in the asset references the actual file, which is downloaded using the Python library urllib.

import os
import urllib.request

# specify output directory
download_root_path = "./downloaded_data/"

# specify asset names to download
asset_names = ["ensemble_flood_extent", "tuw_flood_extent"]

for item in items_eodc[:2]:
    download_path = os.path.join(download_root_path, item.collection_id, item.id)
    
    os.makedirs(download_path, exist_ok=True)
    
    for asset_name in asset_names:
        asset = item.assets[asset_name]
        if "data" in asset.roles:
            fpath = os.path.join(download_path, os.path.basename(asset.href))
            print(f"Downloading {fpath}")
            urllib.request.urlretrieve(asset.href, fpath)

print("Download done!")
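For repeated runs it can help to skip files that are already present and to avoid leaving half-written files behind. The following is one possible sketch using only the standard library. The function name `download_file`, the `.part` suffix and the default timeout are assumptions made for this example.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def download_file(href: str, target_dir: str, overwrite: bool = False,
                  timeout: float = 60.0) -> Path:
    """Stream `href` into `target_dir`, skipping files that already exist."""
    target = Path(target_dir) / Path(urlparse(href).path).name
    if target.exists() and not overwrite:
        return target  # nothing to do, keep the existing file

    target.parent.mkdir(parents=True, exist_ok=True)

    # Write to a temporary name first so an interrupted download
    # is not mistaken for a complete file on the next run
    tmp = target.with_name(target.name + ".part")
    with urllib.request.urlopen(href, timeout=timeout) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out)
    tmp.replace(target)
    return target
```

In the loop above, the call to `urllib.request.urlretrieve(asset.href, fpath)` could then be replaced by `download_file(asset.href, download_path)`.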

Download data with stac-asset CLI

The command line tool stac-asset provides another way to download assets found through a STAC API. The search queries below use stac-client, the command line interface that ships with pystac_client.

You can install stac-asset via pip:

pip install 'stac-asset[cli]'

The stac-client search commands below expect the same input parameters as described above:

  • STAC API URL

    • https://stac.eodc.eu/api/v1

  • Collection ID

    • GFM

  • Bounding box

    • 63.0, 24.0, 73.0, 27.0 (minX, minY, maxX, maxY)

  • Time range

    • 2022-09-15/2022-09-16

List the number of matched STAC items

!stac-client search https://stac.eodc.eu/api/v1 -c GFM --bbox 63 24 73 27 --datetime 2022-09-15/2022-09-16 --matched

Save matched STAC items into a JSON file (items.json)

!stac-client search https://stac.eodc.eu/api/v1 -c GFM --bbox 63 24 73 27 --datetime 2022-09-15/2022-09-16 --save items.json

Download a specified asset of found STAC items into a given directory

!mkdir -p ./stac_asset_download
!stac-asset download -i ensemble_flood_extent items.json ./stac_asset_download -q

Pipe query results directly into the stac-asset download command

!mkdir -p ./stac_asset_download
!cd stac_asset_download; stac-client search https://stac.eodc.eu/api/v1 -c GFM --bbox 63 24 73 27 --datetime 2022-09-15/2022-09-16 | stac-asset download -i ensemble_flood_extent -q
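The items.json file saved earlier can also be inspected from Python before downloading anything. The sketch below assumes that the file is a GeoJSON FeatureCollection whose features are STAC items. The helper name `list_asset_hrefs` is hypothetical.

```python
import json
from pathlib import Path


def list_asset_hrefs(items_path: str, asset_key: str) -> dict:
    """Map item ID to the href of `asset_key` for every item in a saved collection."""
    collection = json.loads(Path(items_path).read_text(encoding="utf-8"))
    hrefs = {}
    for feature in collection.get("features", []):
        asset = feature.get("assets", {}).get(asset_key)
        if asset is not None:  # items lacking this asset are skipped
            hrefs[feature["id"]] = asset["href"]
    return hrefs
```

For example, `list_asset_hrefs("items.json", "ensemble_flood_extent")` would return one download link per matched item that provides this asset.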