Vector - Joined Attributes
In many spatial applications, geometry information is stored separately from attributes, typically in different tables. A join operation is performed at runtime to combine the two datasets. To simplify this workflow, we assume:
- Geometry information is extracted and stored in a vector file.
- Attributes are extracted and stored in a CSV file.
The STAC generator can describe this join operation by including a few additional keywords in the configuration.
Generic Join Asset
For this example, we assume the vector file Werribee.geojson has an accompanying attribute table stored in the file distance.csv.
| Area | Distance | Public_Transport | Drive | Growth | Yield |
|---|---|---|---|---|---|
| Point Cook | 27.3 | 68 | 50 | 0.046 | 0.04 |
| Hoppers Crossing | 30.8 | 67 | 55 | 0.0402 | 0.041 |
| Werribee | 31.3 | 56 | 50 | 0.0458 | 0.042 |
| Werribee South | 37.3 | 80 | 55 | -0.01 | 0.04 |
| Wyndham Vale | 37.5 | 83 | 60 | 0.0458 | 0.043 |
| Altona Meadows | 23.7 | 61 | 45 | 0.0422 | 0.037 |
| Tarneit | 29.7 | 70 | 55 | 0.0455 | 0.044 |
The asset contains some attributes associated with different suburbs near Werribee. The values of Area in the join asset have a 1-to-1 correspondence with the values of the attribute Suburb_Name in the vector asset.
Config
Explanation of fields
The first part of the config is similar to that of the previous tutorial, in which we describe the minimum required fields and the vector's attributes. We also specify the following additional fields:
- join_config: contains metadata for the join asset.
  - file: path to the join asset.
  - left_on: attribute from the vector that will be used for the join operation.
  - right_on: attribute from the join asset that will be used for the join operation.
  - column_info: attributes of the join asset.
The join terminology is consistent with that of the pandas merge operation: the vector geometry is treated as the left dataframe and the join asset as the right dataframe. The join operation is an inner left join, in which rows with matching values of left_on and right_on are merged. Note that left_on must be described in the vector's column_info and right_on must be described in the join asset's column_info. If either field is not described appropriately, an error will be raised.
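To make the roles of left_on and right_on concrete, here is a minimal sketch of the matching step using only the Python standard library. It is not the generator's implementation: the helper join_attributes, the in-memory vector_rows list (standing in for the attributes of Werribee.geojson, with geometry omitted), and the choice to keep only matched rows are all assumptions made for illustration. The CSV rows are copied from distance.csv above.

```python
import csv
import io

# Hypothetical stand-in for the attributes of Werribee.geojson (geometry omitted).
vector_rows = [
    {"Suburb_Name": "Point Cook"},
    {"Suburb_Name": "Werribee"},
    {"Suburb_Name": "Tarneit"},
]

# A subset of distance.csv, inlined so the sketch is self-contained.
DISTANCE_CSV = """Area,Distance,Public_Transport,Drive,Growth,Yield
Point Cook,27.3,68,50,0.046,0.04
Werribee,31.3,56,50,0.0458,0.042
Tarneit,29.7,70,55,0.0455,0.044
"""


def join_attributes(left_rows, right_rows, left_on, right_on):
    """Merge each left row with the right rows whose key value matches.

    Assumption: only matched rows are kept, and a missing key column is an error.
    """
    if any(left_on not in row for row in left_rows):
        raise KeyError(f"left_on column {left_on!r} not found in the vector attributes")
    if any(right_on not in row for row in right_rows):
        raise KeyError(f"right_on column {right_on!r} not found in the join asset")

    # Index the right-hand rows by their join key for quick lookup.
    index = {}
    for row in right_rows:
        index.setdefault(row[right_on], []).append(row)

    merged = []
    for left in left_rows:
        for right in index.get(left[left_on], []):
            merged.append({**left, **right})
    return merged


right_rows = list(csv.DictReader(io.StringIO(DISTANCE_CSV)))
joined = join_attributes(vector_rows, right_rows, left_on="Suburb_Name", right_on="Area")
for row in joined:
    print(row["Suburb_Name"], row["Distance"])
```

Because csv.DictReader returns every value as a string, a real pipeline would also need to convert numeric columns; that step is left out here to keep the focus on the key matching.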
Command and Output
Save the config as vector_join_config.json and run the following command:
You should see the corresponding fields appearing under properties in Werribee.json.
Timeseries Join Asset
In this example, we will use the join asset price.csv as attributes for the vector file Werribee.geojson. The asset file is presented as follows:
| Date | Area | Sell_Price | Rent_Price | Sell/Rent |
|---|---|---|---|---|
| 2020-01-01T00:00:00Z | Point Cook | 630 | 410 | 1.53659 |
| 2024-01-01T00:00:00Z | Point Cook | 750 | 530 | 1.41509 |
| 2025-01-01T00:00:00Z | Point Cook | 750 | 560 | 1.33929 |
| 2020-01-01T00:00:00Z | Altona Meadow | 622 | 375 | 1.65867 |
| 2024-01-01T00:00:00Z | Altona Meadow | 727 | 450 | 1.61556 |
| 2025-01-01T00:00:00Z | Altona Meadow | 725 | 500 | 1.45 |
| 2020-01-01T00:00:00Z | Tarneit | 615 | 390 | 1.57692 |
| 2024-01-01T00:00:00Z | Tarneit | 690 | 460 | 1.5 |
| 2025-01-01T00:00:00Z | Tarneit | 700 | 500 | 1.4 |
| 2020-01-01T00:00:00Z | Hoppers Crossing | 510 | 350 | 1.45714 |
| 2024-01-01T00:00:00Z | Hoppers Crossing | 592 | 420 | 1.40952 |
| 2025-01-01T00:00:00Z | Hoppers Crossing | 600 | 450 | 1.33333 |
| 2020-01-01T00:00:00Z | Werribee | 475 | 345 | 1.37681 |
| 2024-01-01T00:00:00Z | Werribee | 562 | 400 | 1.405 |
| 2025-01-01T00:00:00Z | Werribee | 580 | 450 | 1.28889 |
| 2020-01-01T00:00:00Z | Werribee South | 628 | 385 | 1.63117 |
| 2024-01-01T00:00:00Z | Werribee South | 870 | 430 | 2.02326 |
| 2025-01-01T00:00:00Z | Werribee South | 595 | 440 | 1.35227 |
| 2020-01-01T00:00:00Z | Wyndham Vale | 448 | 340 | 1.31765 |
| 2024-01-01T00:00:00Z | Wyndham Vale | 530 | 410 | 1.29268 |
| 2025-01-01T00:00:00Z | Wyndham Vale | 532 | 440 | 1.20909 |
The asset contains the sale and rental prices of various suburbs near Werribee at three points in time: 2020, 2024 and 2025. As in the previous example, the attribute Area of the join asset is joined with the attribute Suburb_Name of the vector asset.
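Unlike distance.csv, this asset has several rows per suburb, so each value of Suburb_Name matches a short time series rather than a single record. The sketch below groups a few rows copied from the table above to show that one-to-many shape. The variable names are illustrative only and say nothing about how the generator stores the result.

```python
from collections import defaultdict

# A few rows copied from price.csv: (Date, Area, Sell_Price, Rent_Price).
price_rows = [
    ("2020-01-01T00:00:00Z", "Point Cook", 630, 410),
    ("2024-01-01T00:00:00Z", "Point Cook", 750, 530),
    ("2025-01-01T00:00:00Z", "Point Cook", 750, 560),
    ("2020-01-01T00:00:00Z", "Tarneit", 615, 390),
    ("2024-01-01T00:00:00Z", "Tarneit", 690, 460),
    ("2025-01-01T00:00:00Z", "Tarneit", 700, 500),
]

# Group the records by the join key so each suburb maps to its own series.
series_by_area = defaultdict(list)
for date, area, sell_price, rent_price in price_rows:
    series_by_area[area].append((date, sell_price, rent_price))

for area, series in series_by_area.items():
    # ISO 8601 timestamps in the same timezone sort chronologically as strings.
    series.sort()
    print(area, [date[:4] for date, _, _ in series])
```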
Config
Field Explanation
The config uses the same set of fields as the config for the generic join asset, plus one additional keyword:
- date_column: the attribute in the CSV to be used as timestamps. In price.csv, this column is Date.
By default, values in date_column are parsed as ISO 8601 dates. If the dates use a custom format, provide it in the field date_format, which follows Python's strptime format codes. If the date column cannot be found in the asset, or if a date value cannot be parsed (as ISO 8601, or with date_format when provided), the program raises an error.
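The parsing rule can be sketched in a few lines of standard-library Python. The helper parse_date is hypothetical and only mirrors the behaviour described above (ISO 8601 by default, strptime when a format is supplied, an error otherwise); the generator's own parser may differ in detail.

```python
from datetime import datetime


def parse_date(value, date_format=None):
    """Parse a date string with strptime if a format is given, else as ISO 8601."""
    if date_format is not None:
        # Raises ValueError if the value does not match the format.
        return datetime.strptime(value, date_format)
    # datetime.fromisoformat accepts a trailing "Z" only from Python 3.11,
    # so normalise it to an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    # Raises ValueError if the value is not a valid ISO 8601 string.
    return datetime.fromisoformat(value)


# A value as it appears in the Date column of price.csv.
print(parse_date("2020-01-01T00:00:00Z"))

# The same date in a custom day/month/year layout (an illustrative format).
print(parse_date("01/01/2020", date_format="%d/%m/%Y"))
```

With a custom format such as the one above, the corresponding date_format value would be the strptime pattern itself, here "%d/%m/%Y".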
Command and Output
Save the config as vector_join_date_config.json and run the following command:
The output should contain the specified fields.