Overview

The HLS STAC Geoparquet Archive is an unofficial copy of the HLS 2.0 granule STAC item metadata that is generated in the HLS pipeline. The data are stored in two hive-partitioned parquet datasets (one per collection, partitioned by year and month). The parquet files are updated every 5 days from CMR API Granule queries, covering both the previous month (straggler catch-up) and the current month (incremental build).
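The refresh schedule described above touches two partitions per run. As a rough illustration (not the pipeline's actual code), the hypothetical helper `partitions_to_refresh` below works out which year/month partitions a run on a given date would rebuild, including the January case where the previous month falls in the prior year.

```python
from __future__ import annotations

from datetime import date


def partitions_to_refresh(run_date: date) -> list[tuple[int, int]]:
    """Return the (year, month) partitions rebuilt by a run on `run_date`.

    Hypothetical helper: the previous month comes first (straggler
    catch-up), followed by the current month (incremental build).
    """
    year, month = run_date.year, run_date.month
    # Step back one month, wrapping January to December of the prior year.
    previous = (year - 1, 12) if month == 1 else (year, month - 1)
    return [previous, (year, month)]


print(partitions_to_refresh(date(2025, 6, 3)))   # [(2025, 5), (2025, 6)]
print(partitions_to_refresh(date(2026, 1, 10)))  # [(2025, 12), (2026, 1)]
```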

Warning: This archive is not guaranteed to contain all of the records available in CMR, particularly for the most recent months. If you need the most recent granules, do not use this archive!

The parquet files can be accessed from the nasa-maap-data-store bucket in AWS S3 (us-west-2):

s3://nasa-maap-data-store/file-staging/nasa-map/hls-stac-geoparquet-archive/v2/{collection}/year={year}/month={month}/{collection}-{year}-{month}.parquet

where collection is either HLSL30_2.0 (Landsat) or HLSS30_2.0 (Sentinel-2).
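To address a single monthly file directly, the path template can be filled in with a small helper. This is only a sketch: `ARCHIVE_ROOT`, `COLLECTIONS`, and `partition_href` are names introduced here for illustration, and the month value is passed through exactly as given because the text above does not say whether it is zero-padded in the real paths.

```python
from __future__ import annotations

# Root of the archive, taken from the path template above.
ARCHIVE_ROOT = (
    "s3://nasa-maap-data-store/file-staging/nasa-map/"
    "hls-stac-geoparquet-archive/v2"
)
COLLECTIONS = ("HLSL30_2.0", "HLSS30_2.0")


def partition_href(collection: str, year: int, month: int | str) -> str:
    """Fill in the path template for one collection/year/month file.

    Assumption: `month` is inserted as-is (no zero-padding is applied),
    so pass a string such as "05" if the real paths turn out to be padded.
    """
    if collection not in COLLECTIONS:
        raise ValueError(f"collection must be one of {COLLECTIONS}")
    return (
        f"{ARCHIVE_ROOT}/{collection}/year={year}/month={month}/"
        f"{collection}-{year}-{month}.parquet"
    )


print(partition_href("HLSS30_2.0", 2025, 5))
```

Rejecting unknown collection names early avoids a confusing "no files found" error later when the path is handed to a reader.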

Usage

rustac

The rustac package can be used to query the archive via the DuckdbClient interface. To use this approach, your environment must be configured with AWS credentials that provide ListBucket access to the nasa-maap-data-store bucket in S3 (this will work in the MAAP hub).

Note: The HLSL30_2.0 and HLSS30_2.0 collections must be queried separately because the STAC items have slightly different parquet schemas.

In [1]:

from rustac import DuckdbClient


client = DuckdbClient(use_hive_partitioning=True)

# configure duckdb to find S3 credentials for listing/reading the files in S3

# on the MAAP HUB (uncomment this block to pass explicit boto3 session credentials)
# import boto3
# aws_session = boto3.Session()
# creds = aws_session.get_credentials().get_frozen_credentials()
# client.execute(
#     f"""
#     CREATE OR REPLACE SECRET secret (
#         TYPE S3,
#         REGION '{aws_session.region_name}',
#         KEY_ID '{creds.access_key}',
#         SECRET '{creds.secret_key}',
#         SESSION_TOKEN '{creds.token}'
#     );
#     """
# )

# on the MAAP ADE
client.execute(
    """
    CREATE OR REPLACE SECRET secret (
        TYPE S3,
        PROVIDER credential_chain
    );
    """
)

parquet_href = "s3://nasa-maap-data-store/file-staging/nasa-map/hls-stac-geoparquet-archive/v2/{collection}/**/*.parquet"

datetime = "2025-05-01T00:00:00Z/2025-05-31T23:59:59Z"
bbox = (-90, 45, -85, 50)

hls_l30_items = client.search(
    href=parquet_href.format(collection="HLSL30_2.0"),
    datetime=datetime,
    bbox=bbox,
)
print(f"found {len(hls_l30_items)} HLSL30_2.0 items")

hls_s30_items = client.search(
    href=parquet_href.format(collection="HLSS30_2.0"),
    datetime=datetime,
    bbox=bbox,
)
print(f"found {len(hls_s30_items)} HLSS30_2.0 items")

found 292 HLSL30_2.0 items
found 394 HLSS30_2.0 items
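Because the two collections are searched separately, the results usually need to be combined afterwards. The sketch below assumes each search result is a list of STAC item dictionaries with a `properties.datetime` string like the one shown in the example item; `item_datetime` and `merge_items` are hypothetical helper names, and the two sample items are made up for illustration.

```python
from __future__ import annotations

from datetime import datetime as dt


def item_datetime(item: dict) -> dt:
    """Parse an item's `properties.datetime` (ISO 8601 with a trailing Z)."""
    # Replace the Z suffix so that older Python versions can parse the string.
    return dt.fromisoformat(item["properties"]["datetime"].replace("Z", "+00:00"))


def merge_items(*item_lists: list[dict]) -> list[dict]:
    """Concatenate several search results and sort them by acquisition time."""
    merged = [item for items in item_lists for item in items]
    return sorted(merged, key=item_datetime)


# Made-up stand-ins for hls_l30_items and hls_s30_items.
l30 = [{"id": "L30-example", "properties": {"datetime": "2025-05-04T16:40:11.000000Z"}}]
s30 = [{"id": "S30-example", "properties": {"datetime": "2025-05-02T17:00:02.065000Z"}}]

for item in merge_items(l30, s30):
    print(item["properties"]["datetime"], item["id"])
```

With the real results this would be called as `merge_items(hls_l30_items, hls_s30_items)`. Keeping the items as plain dictionaries sidesteps the schema differences between the two collections.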

Example item

The items in the HLS STAC Geoparquet Archive were copied directly from the STAC item JSON files that are produced for every HLS granule (e.g. https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-public/HLSS30.020/HLS.S30.T21JXN.2025341T134221.v2.0/HLS.S30.T21JXN.2025341T134221.v2.0_stac.json).

In [2]:

from pystac import Item

Item.from_dict(hls_s30_items[0])

Out[2]:

type "Feature"
stac_version "1.1.0"
stac_extensions[] 4 items
- 0 "https://stac-extensions.github.io/eo/v1.1.0/schema.json"
- 1 "https://stac-extensions.github.io/projection/v2.0.0/schema.json"
- 2 "https://stac-extensions.github.io/view/v1.0.0/schema.json"
- 3 "https://stac-extensions.github.io/scientific/v1.0.0/schema.json"
id "HLS.S30.T15UYR.2025124T165711.v2.0"
geometry
- type "MultiPolygon"
- coordinates[] 1 items
  - 0[] 1 items
    - 0[] 5 items
      - 0[] 2 items
        - 0 -89.452584
        - 1 49.510345
      - 1[] 2 items
        - 0 -89.06861
        - 1 50.485868
      - 2[] 2 items
        - 0 -88.634257
        - 1 50.470361
      - 3[] 2 items
        - 0 -88.72263
        - 1 49.48562
      - 4[] 2 items
        - 0 -89.452584
        - 1 49.510345
bbox[] 4 items
- 0 -89.452584
- 1 49.48562
- 2 -88.634257
- 3 50.485868
properties
- sci:doi "10.5067/HLS/HLSS30.002"
- view:azimuth 105.51786709
- datetime "2025-05-04T17:00:02.065000Z"
- start_datetime "2025-05-04T17:00:02.065+00:00"
- end_datetime "2025-05-04T17:00:02.065+00:00"
- platform "sentinel-2a"
- instruments[] 1 items
  - 0 "msi"
- eo:cloud_cover 0.0
- proj:transform[] 9 items
  - 0 30.0
  - 1 0.0
  - 2 699960.0
  - 3 0.0
  - 4 -30.0
  - 5 5600040.0
  - 6 0.0
  - 7 0.0
  - 8 1.0
- proj:shape[] 2 items
  - 0 3660
  - 1 3660
- view:sun_azimuth 157.81697088
- processing:software None
- month 5
- year 2025
- proj:code "EPSG:32615"
links[] 2 items
- 0
  - rel "self"
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-public/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0_stac.json"
  - type "application/json"
- 1
  - rel "cite-as"
  - href "https://doi.org/10.5067/HLS/HLSS30.002"
assets
- B01
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.B01.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "B01"
      - common_name "coastal"
      - center_wavelength 0.4439
      - full_width_half_max 0.027
  - roles[] 1 items
    - 0 "data"
- B02
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.B02.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "B02"
      - common_name "blue"
      - center_wavelength 0.4966
      - full_width_half_max 0.098
  - roles[] 1 items
    - 0 "data"
- B03
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.B03.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "B03"
      - common_name "green"
      - center_wavelength 0.56
      - full_width_half_max 0.045
  - roles[] 1 items
    - 0 "data"
- B04
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.B04.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "B04"
      - common_name "red"
      - center_wavelength 0.6645
      - full_width_half_max 0.038
  - roles[] 1 items
    - 0 "data"
- B05
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.B05.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "B05"
      - center_wavelength 0.7039
      - full_width_half_max 0.019
  - roles[] 1 items
    - 0 "data"
- B06
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.B06.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "B06"
      - center_wavelength 0.7402
      - full_width_half_max 0.018
  - roles[] 1 items
    - 0 "data"
- B07
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.B07.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "B07"
      - center_wavelength 0.7825
      - full_width_half_max 0.028
  - roles[] 1 items
    - 0 "data"
- B08
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.B08.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "B08"
      - common_name "nir"
      - center_wavelength 0.8351
      - full_width_half_max 0.145
  - roles[] 1 items
    - 0 "data"
- B8A
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.B8A.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "B8A"
      - center_wavelength 0.8648
      - full_width_half_max 0.033
  - roles[] 1 items
    - 0 "data"
- B09
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.B09.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "B09"
      - center_wavelength 0.945
      - full_width_half_max 0.026
  - roles[] 1 items
    - 0 "data"
- B10
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.B10.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "B10"
      - common_name "cirrus"
      - center_wavelength 1.3735
      - full_width_half_max 0.075
  - roles[] 1 items
    - 0 "data"
- B11
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.B11.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "B11"
      - common_name "swir16"
      - center_wavelength 1.6137
      - full_width_half_max 0.143
  - roles[] 1 items
    - 0 "data"
- B12
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.B12.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "B12"
      - common_name "swir22"
      - center_wavelength 2.22024
      - full_width_half_max 0.242
  - roles[] 1 items
    - 0 "data"
- Fmask
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.Fmask.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "Fmask"
  - roles[] 1 items
    - 0 "data"
- SZA
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.SZA.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "SZA"
  - roles[] 1 items
    - 0 "data"
- SAA
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.SAA.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "SAA"
  - roles[] 1 items
    - 0 "data"
- VZA
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.VZA.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "VZA"
  - roles[] 1 items
    - 0 "data"
- VAA
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.VAA.tif"
  - type "image/tiff; application=geotiff; profile=cloud-optimized"
  - eo:bands[] 1 items
    - 0
      - name "VAA"
  - roles[] 1 items
    - 0 "data"
- thumbnail
  - href "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-public/HLSS30.020/HLS.S30.T15UYR.2025124T165711.v2.0/HLS.S30.T15UYR.2025124T165711.v2.0.jpg"
  - type "image/jpeg"
  - roles[] 1 items
    - 0 "thumbnail"
collection "HLSS30_2.0"
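Each asset in the item above carries an `eo:bands` entry, and some bands also have a `common_name`. As a minimal sketch of how that structure might be used, the hypothetical helper `hrefs_by_common_name` below maps common names such as "red" or "nir" to asset hrefs. The sample item is a trimmed, made-up stand-in with placeholder URLs, not a real record.

```python
from __future__ import annotations


def hrefs_by_common_name(item: dict) -> dict[str, str]:
    """Map each band's common_name (e.g. "red") to its asset href."""
    hrefs: dict[str, str] = {}
    for asset in item.get("assets", {}).values():
        # Assets such as the thumbnail have no eo:bands entry (or a null one).
        for band in asset.get("eo:bands") or []:
            common_name = band.get("common_name")
            if common_name:
                hrefs[common_name] = asset["href"]
    return hrefs


# Trimmed, hypothetical item that mirrors the structure shown above.
sample_item = {
    "assets": {
        "B04": {
            "href": "https://example.com/B04.tif",
            "eo:bands": [{"name": "B04", "common_name": "red"}],
        },
        "B05": {
            "href": "https://example.com/B05.tif",
            "eo:bands": [{"name": "B05"}],  # no common_name, so it is skipped
        },
        "thumbnail": {"href": "https://example.com/thumb.jpg"},
    }
}

print(hrefs_by_common_name(sample_item))
```

Bands without a `common_name` (B05, B06, B07, B8A, and B09 in the example item) are left out of the mapping and still need to be looked up by asset key.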

Comparison to CMR API granules

This archive is generated by running granule queries for the HLS collections against the CMR API, so it represents a snapshot of a dynamic catalog. It is updated every 5 days, covering both the previous month (to catch stragglers) and the current month (incremental updates as new granules are published). The archive will therefore always be a slightly incomplete copy of the canonical source, but it should contain 99% of the full set of granules.
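One way to quantify the gap for a given month is to compare granule IDs from the archive against the IDs returned by CMR. The sketch below shows only the comparison step: `completeness` is a hypothetical helper, the IDs are placeholders, and fetching the real ID lists from the archive and from CMR is left out.

```python
from __future__ import annotations

from typing import Iterable


def completeness(
    archive_ids: Iterable[str], cmr_ids: Iterable[str]
) -> tuple[float, list[str]]:
    """Return the fraction of CMR granules found in the archive and the missing IDs."""
    archive, cmr = set(archive_ids), set(cmr_ids)
    missing = sorted(cmr - archive)
    # An empty CMR result is treated as fully covered to avoid dividing by zero.
    fraction = 1.0 if not cmr else len(cmr & archive) / len(cmr)
    return fraction, missing


# Placeholder IDs for illustration only.
fraction, missing = completeness(
    archive_ids=["granule-a", "granule-b", "granule-c"],
    cmr_ids=["granule-a", "granule-b", "granule-c", "granule-d"],
)
print(f"{fraction:.0%} of CMR granules present; missing: {missing}")
```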
