Please follow all the instructions below to create a hexagon data file that can be used by the spider app.
Key steps are noted with a 👉.
First clone this repository:
git clone https://github.com/carderne/ccg-spider.git
cd ccg-spider/prep
The instructions below are for the two most important layers. Please read DATA.md for some instructions for downloading additional layers (such as population, settlements, OSM data).
To keep things organised, use the following data directories:
- raw/: for raw data downloaded from the internet.
- data/: for the processed data that is ready to be ingest by the script.
- processed/: for the output hexagon files.
The only compulsory input layer is an AOI (area of interest) delineating the area to be modelled. Choose whether you want an administrative boundary or just to draw a custom AOI.
👉 Whichever one you use (administrative or custom) save the resulting file in the data directory as eg data/aoi.gpkg
(or data/aoi.geojson
).
Download a GADM extract and export only level-0 as GeoPackage
.
You can also export a different level if you want, for example, a specific administrative region within a country.
You can also just go to geojson.io and draw a polygon covering the area and download it as a GeoJSON.
If you plan to calculate any distance layers (eg distance from hexagon to grid lines) you'll need a blank raster to use as a template.
You can create this as follows (replace data/aoi.gpkg
with whatever you named the AOI fro above).
gdal_rasterize data/aoi.gpkg -burn 1 -tr 0.1 0.1 data/blank.tif
gdalwarp -t_srs EPSG:4088 blank.tif data/blank_proj.tif
👉 Make sure you're in the ccg-spider/prep
directory and install it
pip install -e .
👉 You must make a file called config.yml
(inside the prep
folder)
with at least the following contents:
(You can look at the complete example at config_example.yml to get some ideas.)
aoi: data/aoi.gpkg # path to the AOI you downloaded
hex_res: 3 # see note below!
raster_like: data/blank_proj.tif # path to the template raster you created
Have a look here
to see details on the resolutions available.
A good choice for a medium sized country is 5
, or 4
for a very big one.
If you're doing a small area around a town, 8
might be more useful.
I recommend to start with 3
, check that everything works, and then go to higher resolutions.
(It will be much faster this way!)
👉 Once that is all done, you can run the script. Note that the output must be a GeoJSON or it will complain.
spi processed/hex.geojson
And then you can move this GeoJSON to the appropriate location at frontend/dist/models/model-name/hex.geojson
.
You can also point the script to a different config file:
spi --config=other_config.yml processed/other_hex.geojson
If you run spi
and provide a path to a file that already exists, it will overwrite that file. If you want, you can run the script as below, and then only new columns from config.yml
will be added (this might be risky).
spi processed/hex.geojson --append
I'd recommend to try it just like that to make sure everything works.
But once you've successfully run the scripts (see above), you need to add some more layers.
So follow these instructions, and then run the spi
script again.
Have a look at config_example.yml to see a full example of how you can add layers. Basically, it looks like this:
# ignoring the aoi, hex_res stuff at the top
features: # this must only appear once!
# just add more blocks like the below
- name: # name the column will get
type: # column or vector
file: # source file containing the data to be extracted
operation: # either sum/mean (raster) or distance/sjoin (vector)
decimals: # number of decimal places to keep
joined_col: # this is ONLY used for sjoin!
fix: # this and below are all optional
minimum: # minimum value to apply
factor: # constant to multipl by (eg m -> km)
These are the simplest, and use zonal statistics to get data from a raster into your hexagons.
Example with sum
:
- name: pop
type: raster
file: data/hrsl.tif
operation: sum
decimals: 0
Example with mean
:
- name: precip
type: raster
file: data/precip.tif
operation: mean
decimals: 0
fix:
minimum: 0
There are currently two ways to use vector layers.
You can calculate every hexagons distance
to some features:
- name: grid_dist
type: vector
operation: distance
file: data/grid.gpkg
decimals: 2
fix:
factor: 0.001
Or you can do a spatial join to extract information from overlapping features. This example gets the name of the province that each hexagon is within.
- name: adm1
type: vector
operation: sjoin
file: data/gadm1.gpkg
joined_col: NAME_1 # this is the name of the column
# that you want from the vector file