Utility
parse_href(base_url, collection_id, item_id=None)
Generate the href for a collection or item from its id. This is used to build STAC API URLs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| base_url | str | base url | required |
| collection_id | str | collection id | required |
| item_id | str \| None | item's id. Defaults to None. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | built url |
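The following is a minimal sketch of how such an href could be assembled, not the library's implementation. It assumes the standard STAC API path layout (`/collections/{collection_id}` and `/collections/{collection_id}/items/{item_id}`); the name `parse_href_sketch` and the example URL are hypothetical.

```python
from typing import Optional


def parse_href_sketch(base_url: str, collection_id: str, item_id: Optional[str] = None) -> str:
    """Hypothetical stand-in for parse_href; the path layout is an assumption."""
    # Strip any trailing slash so the joined URL never contains "//".
    href = f"{base_url.rstrip('/')}/collections/{collection_id}"
    if item_id is not None:
        # Items are nested under their parent collection.
        href = f"{href}/items/{item_id}"
    return href


print(parse_href_sketch("http://localhost:8082", "paddocks"))
print(parse_href_sketch("http://localhost:8082/", "paddocks", "plot_1"))
```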
href_is_stac_api_endpoint(href)
Check whether href points to a resource behind a STAC API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| href | str | url | required |
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | boolean result |
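One plausible way to make this check is to look at the URL scheme. The sketch below assumes that any `http` or `https` href counts as a STAC API resource and that everything else (for example a local path) does not; this criterion and the name `looks_like_stac_api_sketch` are assumptions, not the documented behaviour.

```python
from urllib.parse import urlparse


def looks_like_stac_api_sketch(href: str) -> bool:
    """Hypothetical check: treat http(s) hrefs as API resources."""
    # urlparse returns an empty scheme for plain relative paths.
    return urlparse(href).scheme in ("http", "https")


print(looks_like_stac_api_sketch("http://localhost:8082/collections/paddocks"))
print(looks_like_stac_api_sketch("generated/collection.json"))
```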
force_write_to_stac_api(url, id, json)
Force write a JSON object to a STAC API endpoint.
Initially tries to POST the JSON. If a 409 error is encountered, tries a PUT instead.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| url | str | endpoint url |required |
| id | str | collection's id | required |
| json | dict[str, Any] | json body | required |
Raises:
| Type | Description |
|---|---|
| err | error encountered other than integrity error |
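The POST-then-PUT fallback described above can be sketched without committing to a particular HTTP client. In this sketch the `post` and `put` callables are hypothetical and simply return an HTTP status code; the PUT target `{url}/{id}` and the use of `RuntimeError` are also assumptions.

```python
from typing import Any, Callable, Dict

# Hypothetical transport signature: (url, json body) -> HTTP status code.
Sender = Callable[[str, Dict[str, Any]], int]


def force_write_sketch(url: str, id: str, json: Dict[str, Any], post: Sender, put: Sender) -> None:
    """Try POST first; on a 409 conflict, fall back to PUT on the existing resource."""
    status = post(url, json)
    if status == 409:
        # The resource already exists, so overwrite it instead (assumed URL layout).
        status = put(f"{url.rstrip('/')}/{id}", json)
    if status >= 400:
        # Any remaining error is not a conflict and is surfaced to the caller.
        raise RuntimeError(f"write to {url} failed with status {status}")


calls = []


def fake_post(url: str, body: Dict[str, Any]) -> int:
    calls.append(("POST", url))
    return 409  # pretend the collection already exists


def fake_put(url: str, body: Dict[str, Any]) -> int:
    calls.append(("PUT", url))
    return 200


force_write_sketch("http://localhost:8082/collections", "paddocks", {"id": "paddocks"}, fake_post, fake_put)
print(calls)
```

Passing the transport in as callables keeps the sketch testable offline; a real client call would take their place.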
read_source_config(href)
Read in the config from the given location.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| href | str | config location | required |
Raises:
| Type | Description |
|---|---|
| InvalidExtensionException | if an unrecognised extension is provided. Only accepts json, yaml, yml, csv |
| StacConfigException | if the config file cannot be read |
| ConfigFormatException | if the config is not a dictionary or a list |
Returns:
| Type | Description |
|---|---|
| list[dict[str, Any]] | list of raw configs as dictionaries. |
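The extension dispatch and the dictionary-or-list normalisation can be illustrated with a small sketch. It handles local files only, uses the built-in `ValueError` and `TypeError` in place of the library's own exceptions, and imports PyYAML lazily for the YAML branch; the name `read_source_config_sketch` is hypothetical.

```python
import csv
import json
from pathlib import Path
from typing import Any, Dict, List


def read_source_config_sketch(href: str) -> List[Dict[str, Any]]:
    """Hypothetical local-file reader: dispatch on extension, return a list of dicts."""
    suffix = Path(href).suffix.lower()
    if suffix == ".json":
        with open(href, encoding="utf-8") as f:
            data = json.load(f)
    elif suffix == ".csv":
        with open(href, newline="", encoding="utf-8") as f:
            # Each csv row becomes one raw config dictionary.
            data = list(csv.DictReader(f))
    elif suffix in (".yaml", ".yml"):
        import yaml  # PyYAML, a third-party dependency, imported only when needed

        with open(href, encoding="utf-8") as f:
            data = yaml.safe_load(f)
    else:
        raise ValueError(f"unrecognised extension: {suffix!r}")
    if isinstance(data, dict):
        # A single config is wrapped so callers always receive a list.
        return [data]
    if isinstance(data, list):
        return data
    raise TypeError("config must be a dictionary or a list")
```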
calculate_timezone(geometry)
Calculate the timezone string from a geometry or a sequence of geometries.
If a sequence of geometries is provided, the timezone is determined at the centroid of the sequence of geometries.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| geometry | Geometry \| Sequence[Geometry] | geometry object | required |
Raises:
| Type | Description |
|---|---|
| TimezoneException | if timezone cannot be determined from geometry |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | timezone string |
get_timezone(timezone, geometry)
Get timezone string based on provided timezone option and geometry.
This invokes the calculate_timezone method under the hood if appropriate.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| timezone | str \| Literal["local", "utc"] | timezone parameter from SourceConfig | required |
| geometry | Geometry \| Sequence[Geometry] | asset's geometry. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | timezone string |
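A rough sketch of how the timezone option might be resolved is given below. It assumes that `"local"` triggers a geometry-based lookup, that `"utc"` maps to the string `"UTC"`, and that any other value is passed through as a timezone name; these semantics, the `calculate` callable, and the stub lookup are assumptions for illustration.

```python
from typing import Any, Callable


def get_timezone_sketch(timezone: str, geometry: Any, calculate: Callable[[Any], str]) -> str:
    """Hypothetical resolution of the timezone option to a timezone string."""
    if timezone == "local":
        # Only this branch needs the geometry-based lookup.
        return calculate(geometry)
    if timezone == "utc":
        return "UTC"
    # Anything else is assumed to already be a timezone name.
    return timezone


def stub_lookup(geometry: Any) -> str:
    # Stand-in for calculate_timezone; always answers with one fixed zone.
    return "Australia/Adelaide"


print(get_timezone_sketch("local", None, stub_lookup))
print(get_timezone_sketch("utc", None, stub_lookup))
print(get_timezone_sketch("Europe/Paris", None, stub_lookup))
```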
localise_timezone(data, tzinfo)
Add timezone information to the data, then convert it to UTC.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | Timestamp \| TimeSeries | series of timestamps or a single timestamp | required |
| tzinfo | str | parsed timezone | required |
Raises:
| Type | Description |
|---|---|
| TimezoneException | if an invalid timezone is provided |
Returns:
| Type | Description |
|---|---|
| Timestamp \| TimeSeries | utc localised timestamp |
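The attach-then-convert step can be shown with the standard library alone. This sketch works on a single `datetime` and takes a `tzinfo` object rather than a timezone string, so it is an analogy to the function above, not its implementation; `localise_to_utc_sketch` and the fixed offset are hypothetical. A `zoneinfo.ZoneInfo` instance could be passed in place of the fixed offset.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def localise_to_utc_sketch(stamp: datetime, tz: tzinfo) -> datetime:
    """Attach tz to a naive timestamp, then express the result in UTC."""
    if stamp.tzinfo is None:
        # Naive timestamp: interpret the wall-clock time as local time in tz.
        stamp = stamp.replace(tzinfo=tz)
    # Aware timestamps keep their own zone and are only converted.
    return stamp.astimezone(timezone.utc)


fixed_offset = timezone(timedelta(hours=9, minutes=30))  # illustrative UTC+09:30
naive = datetime(2024, 6, 1, 9, 30)
aware = datetime(2024, 6, 1, 9, 30, tzinfo=timezone.utc)

print(localise_to_utc_sketch(naive, fixed_offset))
print(localise_to_utc_sketch(aware, fixed_offset))
```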
is_string_convertible(value)
Check whether value is a string or a Path.
If value is a Path, it is converted to a string via as_posix.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | Any | input value | required |
Raises:
| Type | Description |
|---|---|
| ValueError | if value is not a string or Path |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | string path |
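The behaviour described above is small enough to sketch directly. The name `is_string_convertible_sketch` is hypothetical, and the error message is illustrative.

```python
from pathlib import Path
from typing import Any


def is_string_convertible_sketch(value: Any) -> str:
    """Return value as a string path, or raise ValueError for other types."""
    if isinstance(value, str):
        return value
    if isinstance(value, Path):
        # as_posix always uses forward slashes, regardless of platform.
        return value.as_posix()
    raise ValueError(f"expected str or Path, got {type(value).__name__}")


print(is_string_convertible_sketch("data/points.csv"))
print(is_string_convertible_sketch(Path("data") / "points.csv"))
```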
read_point_asset(src_path, X_coord, Y_coord, epsg, Z_coord=None, T_coord=None, date_format='ISO8601', columns=None, timezone='local')
Read in point data from disk or remote
Users must provide, at a bare minimum, the location of the csv and the names of the columns to be treated as the X and Y coordinates. By default, all columns in the csv are read. If columns and groupby columns are provided, only the specified columns are read, together with the coordinate columns (X, Y, T).
Timezone information is used to convert all timestamps to timezone-aware timestamps. Timestamps that are already timezone-aware are not affected. Timestamps that are not timezone-aware are embedded with timezone information. Timestamps are subsequently converted to UTC.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| src_path | str | source location | required |
| X_coord | str | column to be treated as the x coordinate | required |
| Y_coord | str | column to be treated as the y coordinate | required |
| epsg | int | epsg code | required |
| Z_coord | str \| None | column to be treated as the z coordinate. Defaults to None. | None |
| T_coord | str \| None | column to be treated as timestamps. Defaults to None. | None |
| date_format | str | date interpretation method. Defaults to "ISO8601". | 'ISO8601' |
| columns | set[str] \| set[ColumnInfo] \| Sequence[str] \| Sequence[ColumnInfo] \| None | columns to be read from the point asset. Defaults to None. | None |
| timezone | str \| Literal["utc", "local"] | timezone parameter for embedding non-timezone-aware timestamps. Defaults to "local". | 'local' |
Returns:
| Type | Description |
|---|---|
| gpd.GeoDataFrame | read dataframe |
read_vector_asset(src_path, bbox=None, columns=None, layer=None)
Read in vector asset from disk or remote.
Users can provide an optional bbox for constraining the region of the vector data to be read, a set of columns describing the attributes of interest, and a layer parameter if the asset is a multilayered vector asset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| src_path | str \| Path | path to asset. | required |
| bbox | tuple[float, float, float, float] \| None | bbox to define the region of interest. Defaults to None. | None |
| columns | set[str] \| Sequence[str] \| None | sequence of columns to be read from the vector file. Defaults to None. | None |
| layer | str \| int \| None | layer identifier for a multilayered asset. Defaults to None. | None |
Raises:
| Type | Description |
|---|---|
| StacConfigException | if the provided layer does not exist |
| SourceAssetException | if the asset cannot be accessed or is malformed |
Returns:
| Type | Description |
|---|---|
| gpd.GeoDataFrame | read dataframe |
read_join_asset(src_path, right_on, date_format, date_column, columns, tzinfo)
Read the join asset from disk or remote.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| src_path | str | path to join asset | required |
| right_on | str | right on attribute from join config | required |
| date_format | str | date format from join config | required |
| date_column | str \| None | date column from join config | required |
| columns | set[str] \| Sequence[str] \| set[ColumnInfo] \| Sequence[ColumnInfo] | list of columns to be read in from the asset | required |
| tzinfo | str | timezone information, already parsed using get_timezone | required |
Returns:
| Type | Description |
|---|---|
| pd.DataFrame | read dataframe |
extract_epsg(crs)
Extract EPSG information from a crs object. If the EPSG code can be extracted directly from the crs, that value is returned. Otherwise, the crs is converted to WKT2 and the EPSG code is extracted using a regex.
Note that this method may yield unreliable results.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| crs | CRS | crs object | required |
Returns:
| Type | Description |
|---|---|
| tuple[int, bool] | epsg code and reliability flag |
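The regex fallback can be sketched on a plain WKT2 string. The sketch takes the directly extracted code (if any) and the WKT2 text as separate arguments, so it does not depend on a CRS class. The pattern, the choice of the last `ID["EPSG", ...]` entry, the meaning of the flag (True only for a direct extraction), and raising `ValueError` when nothing matches are all assumptions.

```python
import re
from typing import Optional, Tuple

# Matches WKT2 identifier nodes such as ID["EPSG",4326].
EPSG_ID = re.compile(r'ID\["EPSG",\s*(\d+)\]')


def extract_epsg_sketch(direct_epsg: Optional[int], wkt2: str) -> Tuple[int, bool]:
    """Return (epsg code, reliability flag) from a direct code or a WKT2 string."""
    if direct_epsg is not None:
        # A code reported by the crs itself is treated as reliable.
        return direct_epsg, True
    matches = EPSG_ID.findall(wkt2)
    if not matches:
        raise ValueError("no EPSG identifier found in WKT2 string")
    # Nested nodes (e.g. the ellipsoid) carry their own IDs; the top-level
    # CRS identifier is written last, so the final match is used.
    return int(matches[-1]), False


wkt = (
    'GEOGCRS["WGS 84",DATUM["World Geodetic System 1984",'
    'ELLIPSOID["WGS 84",6378137,298.257223563,ID["EPSG",7030]]],'
    'ID["EPSG",4326]]'
)
print(extract_epsg_sketch(None, wkt))
print(extract_epsg_sketch(7844, wkt))
```

Picking the last match is a heuristic, which is one reason a regex-based result should be treated as unreliable.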