STAC Generator Architecture
Workflow
- I/O
  Users provide input in one of several configuration formats. A configuration can be a path to a JSON/YAML config file, a Python dictionary, or an instance of a SourceConfig subclass.
- Classification/Validation
  The configs are passed to StacGeneratorFactory, which matches each raw configuration to the appropriate configuration subclass: RasterConfig, VectorConfig or PointConfig. The match is made on the extension of the config's location. This promotion step also validates the data by checking that the required fields are provided.
- Conversion to ItemGenerator
  The SourceConfig instances are then promoted to the appropriate ItemGenerator instance (RasterGenerator, VectorGenerator or PointGenerator).
- Instantiate CollectionGenerator
  The list of ItemGenerator instances, together with a set of collection fields and keywords, is used to instantiate the CollectionGenerator object.
- Serialisation
  Given the CollectionGenerator object, the STAC generator uses the StacSerialiser class to write the metadata locally or to a remote API.
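The classification step can be pictured with a small stand-alone sketch. It does not import the stac_generator package: the function name classify_location and the extension table are assumptions made for illustration, and the real factory may recognise a different set of extensions.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical extension table; the real StacGeneratorFactory may differ.
EXTENSION_TO_KIND = {
    ".tif": "raster",
    ".tiff": "raster",
    ".geojson": "vector",
    ".shp": "vector",
    ".csv": "point",
}


def classify_location(location: str) -> str:
    """Return 'raster', 'vector' or 'point' based on the location's extension."""
    # urlparse copes with both plain paths and URLs; only the path part matters,
    # so query strings such as "?token=..." do not affect the match.
    suffix = PurePosixPath(urlparse(location).path).suffix.lower()
    try:
        return EXTENSION_TO_KIND[suffix]
    except KeyError:
        raise ValueError(f"Unsupported extension {suffix!r} in {location!r}") from None
```

In this sketch an unknown extension raises a ValueError, so the failure surfaces before any generator is built.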
FAQ
Handling of Collection's spatial extent attributes
The spatial extent is the bounding box that encloses all items' bounding boxes. It is expressed in WGS 84.
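A minimal sketch of that union, assuming each item bounding box is already in WGS 84, is ordered (west, south, east, north), and does not cross the antimeridian. The function name enclosing_bbox is illustrative and not part of the generator's API.

```python
from typing import Iterable, Sequence


def enclosing_bbox(bboxes: Iterable[Sequence[float]]) -> tuple[float, float, float, float]:
    """Return the smallest (west, south, east, north) box covering every input box."""
    boxes = list(bboxes)
    if not boxes:
        raise ValueError("at least one item bounding box is required")
    west = min(box[0] for box in boxes)
    south = min(box[1] for box in boxes)
    east = max(box[2] for box in boxes)
    north = max(box[3] for box in boxes)
    return (west, south, east, north)
```

For example, the boxes [138.5, -35.0, 138.7, -34.8] and [138.6, -35.1, 138.9, -34.9] combine into (138.5, -35.1, 138.9, -34.8).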
Handling of Collection's temporal extent attributes
The temporal extent runs from the minimum start_datetime to the maximum end_datetime across all items, both in UTC.
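The same rule can be sketched in a few lines, assuming every item supplies a timezone-aware (start_datetime, end_datetime) pair. The function name temporal_extent is an assumption for illustration.

```python
from datetime import datetime, timezone
from typing import Iterable, Tuple


def temporal_extent(
    intervals: Iterable[Tuple[datetime, datetime]],
) -> Tuple[datetime, datetime]:
    """Return (earliest start, latest end) in UTC for timezone-aware intervals."""
    pairs = list(intervals)
    if not pairs:
        raise ValueError("at least one item interval is required")
    # Aware datetimes compare by absolute time, so mixed offsets are safe here.
    start = min(start for start, _ in pairs).astimezone(timezone.utc)
    end = max(end for _, end in pairs).astimezone(timezone.utc)
    return start, end
```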
Handling of Item's geometry attributes
The item's geometry is read from the asset. If the asset's geometry is not in WGS 84 (EPSG 4326), the values are converted to WGS 84 before serialisation.
Handling of Item's bbox attributes
An item's bounding box (left, bottom, right, top) is the smallest bounding box that encloses the item's geometry. The values are converted to WGS 84 before serialisation.
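As an illustration, the enclosing box of a GeoJSON-style geometry can be computed by walking its nested coordinate arrays. This sketch assumes the geometry is already in WGS 84, is not a GeometryCollection, and has at least one position; it returns the box in the GeoJSON order (west, south, east, north). The function name geometry_bbox is hypothetical.

```python
from typing import Iterator, Sequence


def geometry_bbox(geometry: dict) -> tuple[float, float, float, float]:
    """Return (west, south, east, north) for a GeoJSON-style geometry dict."""

    def positions(coords: Sequence) -> Iterator[Sequence[float]]:
        # A position is a sequence whose first element is a number;
        # anything else is a nested array of positions.
        if isinstance(coords[0], (int, float)):
            yield coords
        else:
            for part in coords:
                yield from positions(part)

    points = list(positions(geometry["coordinates"]))
    xs = [point[0] for point in points]
    ys = [point[1] for point in points]
    return (min(xs), min(ys), max(xs), max(ys))
```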
Handling of Item's datetime attributes
The item's datetime value is determined from the config fields collection_date, collection_time and timezone. If timezone is not provided or is local, the data's timezone is inferred from the asset's geometry. The collection_date and collection_time values are then combined and converted from that timezone to UTC.
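The combine-and-convert step might look like the sketch below. It assumes collection_date and collection_time are ISO 8601 strings and that the timezone has already been resolved to a tzinfo object; inferring a timezone from geometry is left out. The function name item_datetime_utc is illustrative.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo


def item_datetime_utc(collection_date: str, collection_time: str, tz: tzinfo) -> datetime:
    """Combine a local date and time in the given timezone and convert to UTC."""
    local = datetime.combine(
        date.fromisoformat(collection_date),
        time.fromisoformat(collection_time),
        tzinfo=tz,
    )
    return local.astimezone(timezone.utc)


# A fixed UTC+09:30 offset stands in for a resolved timezone in this example.
example = item_datetime_utc("2021-02-21", "10:00:17", timezone(timedelta(hours=9, minutes=30)))
print(example.isoformat())  # 2021-02-21T00:30:17+00:00
```

A named zone from the standard library's zoneinfo module can be passed as tz in the same way.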
Handling of Item's start_datetime and end_datetime attributes
For assets that are time-series based (either a point asset with a T attribute or a joined vector asset with a date_column in join_config), the date values are extracted from the asset. Any timestamps that are not timezone-aware are assigned a timezone based on the timezone config field, as described previously. All timestamps are then converted to UTC, and start_datetime and end_datetime are the minimum and maximum of the UTC timestamps.
For assets that do not contain a time series, start_datetime and end_datetime are both assigned the value of datetime.
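A rough sketch of this logic, assuming the timestamps have already been parsed into datetime objects and the timezone config field has been resolved to a tzinfo object. The function name start_end_utc and its parameters are assumptions for illustration.

```python
from datetime import datetime, timezone, tzinfo
from typing import Iterable, Tuple


def start_end_utc(
    timestamps: Iterable[datetime],
    default_tz: tzinfo,
    item_datetime: datetime,
) -> Tuple[datetime, datetime]:
    """Return (start_datetime, end_datetime) in UTC for an item."""
    utc_stamps = []
    for stamp in timestamps:
        # Naive timestamps are interpreted in the configured timezone.
        if stamp.utcoffset() is None:
            stamp = stamp.replace(tzinfo=default_tz)
        utc_stamps.append(stamp.astimezone(timezone.utc))
    if not utc_stamps:
        # No time series: both ends fall back to the item's datetime.
        return item_datetime, item_datetime
    return min(utc_stamps), max(utc_stamps)
```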
Handling of Item's assets attributes
Unlike a generic STAC item that can contain multiple assets, a STAC Generator generated Item contains only a single asset which has the key data. The asset's role is also data.
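The resulting assets mapping can be pictured as follows. This is a sketch of the shape only: the helper name build_assets, the example href and the optional media type are assumptions, and roles is written as a list because the STAC specification defines it that way.

```python
from typing import Optional


def build_assets(href: str, media_type: Optional[str] = None) -> dict:
    """Return an assets mapping with a single asset under the key 'data'."""
    asset = {"href": href, "roles": ["data"]}
    if media_type is not None:
        asset["type"] = media_type
    return {"data": asset}


# Hypothetical location, used only to show the structure.
assets = build_assets("https://example.com/data/dem.tif", "image/tiff")
```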
Handling of Item's property attributes
Each STAC Generator generated STAC Item contains an object under the key stac_generator in properties. The object is required for subsequent asset parsing in the mccn-engine.
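A consumer could guard against items that lack this object with a check along these lines. The helper name generator_metadata is hypothetical, and the sketch makes no assumption about what the object contains.

```python
def generator_metadata(item: dict) -> dict:
    """Return the object stored under properties['stac_generator'] of an item dict."""
    try:
        return item["properties"]["stac_generator"]
    except KeyError:
        raise ValueError(
            "item has no 'stac_generator' object in its properties"
        ) from None
```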