Utility

parse_href(base_url, collection_id, item_id=None)

Generate the href for a collection or an item based on its id. This is used to build STAC API URLs.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| base_url | str | base url | required |
| collection_id | str | collection id | required |
| item_id | str \| None | item's id. Defaults to None. | None |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| str | str | built url |
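As a rough illustration of the kind of URL building described above, the sketch below joins a base URL with a collection id and an optional item id. The function name `parse_href_sketch` is hypothetical, and the `collections/{collection_id}/items/{item_id}` layout is an assumption based on the usual STAC API route structure; the library's actual output may differ.

```python
from typing import Optional
from urllib.parse import urljoin


def parse_href_sketch(
    base_url: str, collection_id: str, item_id: Optional[str] = None
) -> str:
    """Hypothetical stand-in for parse_href; the path layout is an assumption."""
    # A trailing slash makes urljoin append to the base path
    # instead of replacing its last segment.
    base = base_url if base_url.endswith("/") else base_url + "/"
    if item_id is None:
        return urljoin(base, f"collections/{collection_id}")
    return urljoin(base, f"collections/{collection_id}/items/{item_id}")


print(parse_href_sketch("http://localhost:8082", "farm_data"))
print(parse_href_sketch("http://localhost:8082", "farm_data", "paddock_1"))
```

The base URL, collection id, and item id used here are made-up values for demonstration only.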

href_is_stac_api_endpoint(href)

Check whether href points to a resource behind a STAC API.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| href | str | url | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| bool | bool | boolean result |

force_write_to_stac_api(url, id, json)

Force-write a JSON object to a STAC API endpoint.

First tries to POST the JSON. If a 409 (Conflict) error is encountered, falls back to a PUT.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| url | str | endpoint url | required |
| id | str | collection's id | required |
| json | dict[str, Any] | json body | required |

Raises:

| Type | Description |
| --- | --- |
| err | error encountered other than integrity error |
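The POST-then-PUT fallback described above can be sketched without any network access by passing in the two HTTP operations as callables. Everything here is illustrative: `force_write_sketch`, the `Sender` type, and the choice of `{url}/{id}` as the PUT target are assumptions, not the library's implementation.

```python
from typing import Any, Callable

# A sender takes (url, json_body) and returns the HTTP status code of the response.
Sender = Callable[[str, dict[str, Any]], int]


def force_write_sketch(
    url: str, id: str, json: dict[str, Any], post: Sender, put: Sender
) -> str:
    """Hypothetical sketch: POST first, fall back to PUT on a 409 conflict."""
    method = "POST"
    status = post(url, json)
    if status == 409:
        # Assumption: an existing resource is updated at <url>/<id>.
        method = "PUT"
        status = put(f"{url.rstrip('/')}/{id}", json)
    if status >= 400:
        # Any error other than the initial conflict is surfaced to the caller.
        raise RuntimeError(f"{method} failed with status {status}")
    return method


# Fake senders standing in for real HTTP calls.
existing = {"farm_data"}


def fake_post(url: str, body: dict[str, Any]) -> int:
    return 409 if body["id"] in existing else 201


def fake_put(url: str, body: dict[str, Any]) -> int:
    return 200


print(force_write_sketch(
    "http://localhost:8082/collections", "farm_data", {"id": "farm_data"},
    fake_post, fake_put,
))  # prints "PUT"
```

Injecting the senders keeps the sketch testable; a real client would wrap an HTTP library call in each one.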

read_source_config(href)

Read in config from location

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| href | str | config location | required |

Raises:

| Type | Description |
| --- | --- |
| InvalidExtensionException | if an unrecognised extension is provided. Only accepts json, yaml, yml, csv |
| StacConfigException | if the config file cannot be read |
| ConfigFormatException | if the config is not a dictionary or a list |

Returns:

| Type | Description |
| --- | --- |
| list[dict[str, Any]] | list of raw configs as dictionaries. |
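The extension check and the dictionary-or-list rule above might look roughly like the following sketch, which only handles local files. The function name and the two exception classes are hypothetical stand-ins for the library's own. Wrapping a single dictionary in a list is an assumption made to match the documented return type. The YAML branch relies on the third-party PyYAML package and is imported lazily.

```python
import csv
import json
from pathlib import Path
from typing import Any


class InvalidExtensionError(ValueError):
    """Stand-in for InvalidExtensionException."""


class ConfigFormatError(ValueError):
    """Stand-in for ConfigFormatException."""


def read_config_sketch(path: str) -> list[dict[str, Any]]:
    """Hypothetical sketch: read a local config file into a list of dictionaries."""
    ext = Path(path).suffix.lower().lstrip(".")
    if ext not in {"json", "yaml", "yml", "csv"}:
        raise InvalidExtensionError(f"Unrecognised extension: {ext!r}")
    with open(path, newline="", encoding="utf-8") as f:
        if ext == "json":
            data = json.load(f)
        elif ext == "csv":
            # Each csv row becomes one config dictionary keyed by the header.
            data = list(csv.DictReader(f))
        else:
            import yaml  # PyYAML, third-party; only needed for yaml/yml files

            data = yaml.safe_load(f)
    if isinstance(data, dict):
        return [data]  # assumption: a single config is wrapped in a list
    if isinstance(data, list):
        return data
    raise ConfigFormatError("Config must be a dictionary or a list")
```

Calling `read_config_sketch("config.json")` on a file containing one JSON object would return a one-element list under these assumptions.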

calculate_timezone(geometry)

Calculate the timezone string from a geometry or a sequence of geometries.

If a sequence of geometries is provided, the timezone is calculated for the centroid of the sequence.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| geometry | Geometry \| Sequence[Geometry] | geometry object | required |

Raises:

| Type | Description |
| --- | --- |
| TimezoneException | if timezone cannot be determined from geometry |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| str | str | timezone string |

get_timezone(timezone, geometry)

Get timezone string based on provided timezone option and geometry.

This invokes the calculate_timezone method under the hood if appropriate.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| timezone | str \| Literal["local", "utc"] | timezone parameter from SourceConfig | required |
| geometry | Geometry \| Sequence[Geometry] | asset's geometry. | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| str | str | timezone string |
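One plausible reading of "if appropriate" is that the geometry-based lookup only runs when the timezone option is "local". The sketch below encodes that reading. The function name, the injected `calculate` callable (standing in for calculate_timezone), and the mapping of "utc" to the string "UTC" are all assumptions.

```python
from typing import Any, Callable


def get_timezone_sketch(
    timezone: str, geometry: Any, calculate: Callable[[Any], str]
) -> str:
    """Hypothetical sketch of resolving the timezone option to a timezone string."""
    if timezone == "local":
        # Defer to the geometry-based lookup only when asked for local time.
        return calculate(geometry)
    if timezone == "utc":
        return "UTC"
    # Anything else is assumed to already be a timezone name.
    return timezone


# A fake lookup standing in for calculate_timezone.
def fake_lookup(geometry: Any) -> str:
    return "Australia/Adelaide"


print(get_timezone_sketch("local", None, fake_lookup))
print(get_timezone_sketch("utc", None, fake_lookup))
print(get_timezone_sketch("Europe/Paris", None, fake_lookup))
```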

localise_timezone(data, tzinfo)

localise_timezone(
    data: Timestamp, tzinfo: str
) -> Timestamp
localise_timezone(
    data: TimeSeries, tzinfo: str
) -> TimeSeries

Add timezone information to data, then convert to UTC.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| data | Timestamp \| TimeSeries | series of timestamps or a single timestamp | required |
| tzinfo | str | parsed timezone | required |

Raises:

| Type | Description |
| --- | --- |
| TimezoneException | if an invalid timezone is provided |

Returns:

| Type | Description |
| --- | --- |
| Timestamp \| TimeSeries | UTC-localised timestamp |
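The localise-then-convert behaviour can be illustrated for a single timestamp with the standard library alone. This is only a sketch: it uses `datetime` and `zoneinfo` rather than the Timestamp and TimeSeries types above, the function name is hypothetical, and `ValueError` stands in for TimezoneException. `zoneinfo` needs a timezone database on the system (or the `tzdata` package).

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def localise_sketch(data: datetime, tzinfo: str) -> datetime:
    """Hypothetical sketch: attach tzinfo to a naive timestamp, then convert to UTC."""
    try:
        zone = ZoneInfo(tzinfo)
    except (ZoneInfoNotFoundError, ValueError) as err:
        raise ValueError(f"Invalid timezone: {tzinfo!r}") from err
    if data.tzinfo is None:
        # Naive timestamp: interpret it as wall-clock time in the given zone.
        data = data.replace(tzinfo=zone)
    # Timestamps that were already timezone-aware keep their original instant.
    return data.astimezone(timezone.utc)


naive = datetime(2024, 7, 1, 12, 0)
print(localise_sketch(naive, "Australia/Adelaide"))  # 2024-07-01 02:30:00+00:00
```

Adelaide is at UTC+09:30 in July, so noon local time maps to 02:30 UTC.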

is_string_convertible(value)

Check whether value is a string or a Path.

If value is a Path, it is converted to a string via as_posix.

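Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| value | Any | input value | required |

Raises:

| Type | Description |
| --- | --- |
| ValueError | if value is not a string or Path |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| str | str | string path |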
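The check described above is small enough to sketch in full. The function name is hypothetical and this is not the library's code, but it follows the documented behaviour: strings pass through, Path objects go through `as_posix`, and anything else raises `ValueError`.

```python
from pathlib import Path
from typing import Any


def is_string_convertible_sketch(value: Any) -> str:
    """Hypothetical sketch: accept a str or Path and return a string path."""
    if isinstance(value, str):
        return value
    if isinstance(value, Path):
        # as_posix always uses forward slashes, regardless of platform.
        return value.as_posix()
    raise ValueError(f"Expected a string or Path, got {type(value).__name__}")


print(is_string_convertible_sketch(Path("data") / "soil.csv"))  # data/soil.csv
```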

read_point_asset(src_path, X_coord, Y_coord, epsg, Z_coord=None, T_coord=None, date_format='ISO8601', columns=None, timezone='local')

Read in point data from disk or remote

At a minimum, users must provide the location of the csv and the names of the columns to be treated as the X and Y coordinates. By default, all columns in the csv are read. If columns and groupby columns are provided, only the specified columns are read, together with the coordinate columns (X, Y, T).

Timezone information is used to convert all timestamps to timezone-aware timestamps. Timestamps that are already timezone-aware are not affected. Timestamps that are not timezone-aware have the timezone information embedded. All timestamps are subsequently converted to UTC.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| src_path | str | source location | required |
| X_coord | str | column to be treated as the x coordinate | required |
| Y_coord | str | column to be treated as the y coordinate | required |
| epsg | int | epsg code | required |
| Z_coord | str \| None | column to be treated as the z coordinate. Defaults to None. | None |
| T_coord | str \| None | column to be treated as timestamps. Defaults to None. | None |
| date_format | str | date interpretation method. Defaults to "ISO8601". | 'ISO8601' |
| columns | set[str] \| set[ColumnInfo] \| Sequence[str] \| Sequence[ColumnInfo] \| None | columns to be read from the point asset. Defaults to None. | None |
| timezone | str \| Literal["utc", "local"] | timezone parameter for embedding non-timezone-aware timestamps. Defaults to "local". | 'local' |

Returns:

| Type | Description |
| --- | --- |
| gpd.GeoDataFrame | read dataframe |
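The column-selection rule for point assets (read everything by default, otherwise the requested columns plus the coordinate columns) can be sketched with the standard `csv` module. The helper name `columns_to_read` is hypothetical, only plain string column names are handled (not ColumnInfo), and the sample data is made up.

```python
import csv
import io
from typing import Optional, Sequence


def columns_to_read(
    header: Sequence[str],
    X_coord: str,
    Y_coord: str,
    T_coord: Optional[str] = None,
    columns: Optional[Sequence[str]] = None,
) -> list[str]:
    """Hypothetical sketch: decide which csv columns to keep, in header order."""
    if columns is None:
        return list(header)  # default: read every column
    wanted = set(columns) | {X_coord, Y_coord}
    if T_coord is not None:
        wanted.add(T_coord)
    missing = wanted - set(header)
    if missing:
        raise KeyError(f"Columns not found in csv: {sorted(missing)}")
    return [name for name in header if name in wanted]


raw = "lon,lat,time,ph,ec,notes\n138.6,-34.9,2024-07-01T12:00:00,6.5,0.3,ok\n"
reader = csv.DictReader(io.StringIO(raw))
keep = columns_to_read(reader.fieldnames, "lon", "lat", T_coord="time", columns=["ph"])
rows = [{name: row[name] for name in keep} for row in reader]
print(keep)  # ['lon', 'lat', 'time', 'ph']
```

Only the X, Y and T columns are added here because those are the coordinate columns the description names.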

read_vector_asset(src_path, bbox=None, columns=None, layer=None)

Read in vector asset from disk or remote.

Users can provide an optional bbox for constraining the region of the vector data to be read, a set of columns describing the attributes of interest, and a layer parameter if the asset is a multilayered vector asset.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| src_path | str \| Path | path to asset. | required |
| bbox | tuple[float, float, float, float] \| None | bbox to define the region of interest. Defaults to None. | None |
| columns | set[str] \| Sequence[str] \| None | sequence of columns to be read from the vector file. Defaults to None. | None |
| layer | str \| int \| None | layer identifier for a multilayered asset. Defaults to None. | None |

Raises:

| Type | Description |
| --- | --- |
| StacConfigException | if the provided layer is non-existent |
| SourceAssetException | if the asset cannot be accessed or is malformed |

Returns:

| Type | Description |
| --- | --- |
| gpd.GeoDataFrame | read dataframe |

read_join_asset(src_path, right_on, date_format, date_column, columns, tzinfo)

Read the join asset from disk or remote

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| src_path | str | path to join asset | required |
| right_on | str | right on attribute from join config | required |
| date_format | str | date format from join config | required |
| date_column | str \| None | date column from join config | required |
| columns | set[str] \| Sequence[str] \| set[ColumnInfo] \| Sequence[ColumnInfo] | list of columns to be read in from the asset | required |
| tzinfo | str | timezone information, already parsed using get_timezone | required |

Returns:

| Type | Description |
| --- | --- |
| pd.DataFrame | read dataframe |

extract_epsg(crs)

Extract EPSG information from a crs object. If the EPSG code can be extracted directly from the crs, that value is returned. Otherwise, the crs is converted to WKT2 and the EPSG code is extracted using a regex.

Note that this method may yield unreliable results.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| crs | CRS | crs object | required |

Returns:

| Type | Description |
| --- | --- |
| tuple[int, bool] | epsg code and reliability flag |
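The two-step extraction (direct lookup first, regex over WKT2 as a fallback) can be sketched on plain values instead of a CRS object. The function name, the regex, the choice of the last `ID["EPSG", ...]` match, and the `ValueError` raised when nothing matches are all assumptions. With pyproj, the inputs could plausibly come from `CRS.to_epsg()` and `CRS.to_wkt()`.

```python
import re
from typing import Optional

# WKT2 identifiers look like ID["EPSG",4326].
EPSG_ID = re.compile(r'ID\["EPSG",\s*(\d+)\]')


def extract_epsg_sketch(direct_epsg: Optional[int], wkt2: str) -> tuple[int, bool]:
    """Hypothetical sketch: return (epsg code, reliability flag)."""
    if direct_epsg is not None:
        return direct_epsg, True  # direct lookup is treated as reliable
    matches = EPSG_ID.findall(wkt2)
    if not matches:
        raise ValueError("No EPSG identifier found in WKT")
    # Assumption: nested elements (units, datum) carry their own IDs,
    # and the identifier of the CRS itself comes last.
    return int(matches[-1]), False


wkt = (
    'GEOGCRS["WGS 84",DATUM["World Geodetic System 1984",'
    'ELLIPSOID["WGS 84",6378137,298.257223563,'
    'LENGTHUNIT["metre",1,ID["EPSG",9001]]]],ID["EPSG",4326]]'
)
print(extract_epsg_sketch(None, wkt))  # (4326, False)
```

The abbreviated WKT string above is hand-written for illustration. Note how the unit's own identifier (9001) appears before the CRS identifier, which is why a regex fallback is flagged as less reliable.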