An introduction to the Descartes Labs Catalog API and its key concepts, classes, and methods.
Overview
The Descartes Labs Catalog API is a search and retrieval service designed for extremely high-throughput access to raster image data. This article provides an overview of the base classes and methods in the Catalog API to get you up and running with the platform. It assumes you have already installed and authenticated the Descartes Labs client in your Python environment.
Please visit the Catalog Guide for a more in-depth primer on the Catalog methods.
This document is a high-level introduction to the Catalog API. For more detailed, hands-on tutorial notebooks, see the Example Notebooks on GitHub, or clone them by running the following command:
git clone https://github.com/descarteslabs/example-notebooks.git
Common Objects
Products
The core class within the Catalog raster data model is the Product. A Product is a collection of image data with the same band information stored as pixels. Typically, a Product corresponds to a single satellite platform, such as Sentinel-2, or more specifically to imagery from that platform with the same processing applied, such as Sentinel-2 L2A.
Products are referenced by their unique IDs. In the case of Sentinel-2 L2A, the Descartes Labs core Product ID is esa:sentinel-2:l2a:c1:v1.
There are two types of Products: Core and Personal. Core Products are those maintained by Descartes Labs and Personal Products are those owned and shared by you, the user.
You can access a Product through the API with the Product.get() call:
from descarteslabs.catalog import Product
s2_product = Product.get("esa:sentinel-2:l2a:c1:v1")
s2_product
Product: Sentinel-2 L2A Collection 1
id: esa:sentinel-2:l2a:c1:v1
created: Wed Jan 17 18:58:54 2024
Note: Search and visualize which data Products you have available to you, and retrieve their IDs, from Explorer.
Images
Each Product stored in the Catalog contains a series of individual Images, which themselves represent raster data of any number of dimensions, or bands. Beyond simply storing the pixel data, Images also have several useful attributes for filtering, intersecting, and searching, such as:
- geometry outline of the scene the Image represents
- cloud fraction, which is commonly available for optical data
- several datetime fields including acquired, created, and modified dates
- extra properties, a generic Python data dictionary
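As a rough illustration of how these attributes might be read in practice, the sketch below gathers a few of them into a plain dictionary. It assumes the attributes are exposed under the names `id`, `acquired`, `cloud_fraction`, and `extra_properties`; the `summarize_image` helper and the stand-in object are hypothetical and exist only so the example runs without Catalog access.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def summarize_image(image):
    """Collect a few commonly used metadata fields from an Image-like object."""
    return {
        "id": image.id,
        "acquired": image.acquired.isoformat(),
        # cloud_fraction may be unset for non-optical data, so default to None
        "cloud_fraction": getattr(image, "cloud_fraction", None),
        # extra_properties is a plain dict; copy it, or use an empty dict if unset
        "extra": dict(getattr(image, "extra_properties", None) or {}),
    }


# Hypothetical stand-in for a Catalog Image (all values are made up)
fake_image = SimpleNamespace(
    id="example:product:scene-001",
    acquired=datetime(2024, 1, 2, 15, 30, tzinfo=timezone.utc),
    cloud_fraction=0.12,
    extra_properties={"orbit": "descending"},
)

summary = summarize_image(fake_image)
print(summary)
```

Because the helper only relies on attribute access, the same function should work on a real Image object, provided the attribute names match your client version.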
ImageCollections
When dealing with Images, we typically define a series of filters on a Product to identify a spatiotemporal subset of imagery, or an ImageCollection. ImageCollections hold Images, as well as the common methods for loading and interrogating their underlying pixel data.
A typical spatiotemporal filter to retrieve an ImageCollection takes a geometry, a start date, and an end date, and filters a Product's imagery through its properties:
from descarteslabs.catalog import properties as p
from shapely import wkt
wkt_str = 'POLYGON ((-74.03 40.70, -73.91 40.70, -73.91 40.79, -74.03 40.79, -74.03 40.70))'
geom = wkt.loads(wkt_str)
start = '2024-01-01'
end = '2024-05-01'
image_col = (
    s2_product.images()
    .intersects(geom)
    .filter(start < p.acquired < end)
    .filter(p.cloud_fraction < 0.3)
).collect()
image_col
ImageCollection of 16 images
* Dates: Jan 02, 2024 to Apr 29, 2024
* Products: esa:sentinel-2:l2a:c1:v1: 16
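Once a collection has been retrieved, a common next step is to check how its Images are distributed over time. The sketch below counts images per calendar month. It assumes the collection can be iterated like a list and that each Image exposes `acquired` as a `datetime`; the `images_per_month` helper and the stand-in objects are hypothetical.

```python
from collections import Counter
from datetime import datetime
from types import SimpleNamespace


def images_per_month(images):
    """Count images by (year, month) of their acquisition date."""
    counts = Counter((img.acquired.year, img.acquired.month) for img in images)
    # Sort by (year, month) so the result reads chronologically
    return dict(sorted(counts.items()))


# Hypothetical stand-ins for the Images in a collection
fake_collection = [
    SimpleNamespace(acquired=datetime(2024, 1, 2)),
    SimpleNamespace(acquired=datetime(2024, 1, 17)),
    SimpleNamespace(acquired=datetime(2024, 4, 29)),
]

monthly = images_per_month(fake_collection)
print(monthly)
```

With a real collection, the call would be `images_per_month(image_col)`, under the same assumptions.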
Bands
All Images within a Product must contain the same Bands, also referred to as channels. All Bands, regardless of their type, have the following attributes:
- Name, which must be unique for each Product (Required)
- ID, which is automatically generated as Product ID + Band Name
- Data Type for the pixel values in each band (Required)
- Data Range, which is the range of pixel values for each band (Required)
- Spatial Resolution (Required)
- A valid NoData value representing missing or masked out data in each band (Required)
- Band Index, corresponding to the index of each band to the source data (Required)
- File Index, corresponding to the index of each band's file in the source data (Required)
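To make the relationship between these fields concrete, here is a minimal sketch of a container holding the common Band attributes. `BandSketch` is a hypothetical class for illustration, not the Catalog's own Band class, and the field values are illustrative rather than taken from the real product. The colon separator used to build the ID is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BandSketch:
    """Hypothetical container for the common Band fields listed above."""

    product_id: str
    name: str
    data_type: str
    data_range: tuple
    resolution_m: float
    nodata: float
    band_index: int
    file_index: int = 0

    @property
    def id(self):
        # Assumption: the ID joins the Product ID and Band name with a colon
        return f"{self.product_id}:{self.name}"


# Illustrative values only
red = BandSketch(
    product_id="esa:sentinel-2:l2a:c1:v1",
    name="red",
    data_type="UInt16",
    data_range=(0, 10000),
    resolution_m=10.0,
    nodata=0,
    band_index=0,
)

print(red.id)
```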
There are several subtypes of Bands within the Catalog API, as well as unique attributes for each, that one should be aware of.
Spectral Bands
A SpectralBand represents a range of wavelengths on the electromagnetic spectrum. Spectral Bands typically also contain:
- Physical Range and Physical Range Unit that pixel values map to
- Wavelength Min, Max, Center, and Full Width at Half Maximum (FWHM) values, in nanometers
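The Physical Range describes what stored pixel values mean in physical units. A minimal sketch of that conversion follows, assuming the mapping from Data Range to Physical Range is linear; the `to_physical` helper and the example ranges are hypothetical.

```python
def to_physical(value, data_range, physical_range):
    """Linearly rescale a stored pixel value into physical units."""
    d_min, d_max = data_range
    p_min, p_max = physical_range
    if d_max == d_min:
        raise ValueError("data_range must span a non-zero interval")
    # Position of the value within the data range, as a fraction
    fraction = (value - d_min) / (d_max - d_min)
    return p_min + fraction * (p_max - p_min)


# Example: stored values 0..10000 mapped to a 0.0..1.0 physical range
reflectance = to_physical(2500, data_range=(0, 10000), physical_range=(0.0, 1.0))
print(reflectance)
```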
Microwave Bands
A MicrowaveBand contains data on the microwave spectrum, typically for SAR or passive radar sensors. These bands typically also have the following unique attributes:
Classified Bands
A classified band, or ClassBand, is generally used for a finite set of values that may or may not be continuous, such as the results of a Land Use/Land Cover classification model. Typically these bands also contain a Colormap, Colormap Name, and Class Labels.
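As an illustration of how Class Labels relate to pixel values, the sketch below tallies the pixels in a small classified raster by label. It assumes each pixel value is an index into the list of labels; the labels, the raster, and the `label_counts` helper are all made up for this example.

```python
from collections import Counter

# Hypothetical class labels; a pixel value v is labeled class_labels[v]
class_labels = ["water", "forest", "cropland", "urban"]

# Small made-up raster of class values
pixels = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [3, 3, 2, 1],
]


def label_counts(raster, labels):
    """Count pixels per class and report them by label."""
    counts = Counter(value for row in raster for value in row)
    return {labels[value]: n for value, n in sorted(counts.items())}


counts = label_counts(pixels, class_labels)
print(counts)
```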
Mask Bands
MaskBands are binary bands (values 0 or 1) frequently used to mask out portions of an image.
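A binary mask is typically applied element by element to the data it accompanies. The sketch below uses plain nested lists and assumes that a mask value of 1 marks pixels to hide; the opposite convention is also common, so check the band's description before relying on this. The `apply_mask` helper and the sample values are hypothetical.

```python
def apply_mask(values, mask, fill=None):
    """Replace values where the mask is 1 with a fill value."""
    same_shape = len(values) == len(mask) and all(
        len(v_row) == len(m_row) for v_row, m_row in zip(values, mask)
    )
    if not same_shape:
        raise ValueError("values and mask must have the same shape")
    return [
        [fill if m == 1 else v for v, m in zip(v_row, m_row)]
        for v_row, m_row in zip(values, mask)
    ]


values = [[120, 340], [560, 780]]
cloud_mask = [[0, 1], [0, 0]]  # assumption: 1 means "mask this pixel out"

masked = apply_mask(values, cloud_mask)
print(masked)
```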
Generic Bands
Any other data that does not fall within one of the predefined band models can be stored as a GenericBand.
Blobs
The Descartes Labs Catalog API also supports arbitrary data storage and access through the Blob class. A Blob can hold any type of data, whether it is held in memory or stored as a file on disk. Typical use cases for Blobs include storing pre-trained model weights for inference through Batch Compute and retrieving the results of a completed Job.
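Since a Blob stores raw data, in-memory objects generally need to be serialized to bytes before storage and deserialized after retrieval. The sketch below shows only that round trip, using JSON from the standard library; the Blob upload and download calls themselves are deliberately left out, and the helper names and weights are hypothetical.

```python
import json


def weights_to_bytes(weights):
    """Serialize a dictionary of model weights to UTF-8 encoded JSON bytes."""
    return json.dumps(weights, sort_keys=True).encode("utf-8")


def weights_from_bytes(payload):
    """Restore the dictionary from bytes produced by weights_to_bytes."""
    return json.loads(payload.decode("utf-8"))


# Made-up weights for illustration
weights = {"layer1": [0.1, -0.2], "bias": [0.05]}

payload = weights_to_bytes(weights)    # bytes that could be stored in a Blob
restored = weights_from_bytes(payload)  # what you would rebuild after retrieval
print(restored == weights)
```

JSON is used here for readability; binary formats are usually a better fit for large weight files.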
Common Concepts
Quotas and Limits
The Catalog API is designed for extremely high rates of search and retrieval of the objects defined above. Because the API is commonly accessed from asynchronous Batch Compute nodes, its limits are typically on the order of thousands of queries per minute. Please reference the Quotas & Limits Documentation page for more detailed information.