-
Notifications
You must be signed in to change notification settings - Fork 15
/
Copy path02-spatial-data.Rmd
99 lines (69 loc) · 4.7 KB
/
02-spatial-data.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
---
title: "Spatial Data"
author: "Francisco Rowe ([`@fcorowe`](http://twitter.com/fcorowe))"
date: "`r Sys.Date()`"
output: tint::tintHtml
bibliography: skeleton.bib
link-citations: yes
---
```{r setup, include=FALSE}
library(sf)
library(tint)
# invalidate cache when the package version changes
knitr::opts_chunk$set(tidy = FALSE, cache.extra = packageVersion('tint'), class.source = "col-source")
options(htmltools.dir.version = FALSE)
```
```{css, echo=FALSE}
.col-source {
background-color: #E5E7E9;
border: 3px #000000;
}
```
```{marginfigure}
[**Back**](01-gds.html) \
[**Next**](03-spatial_weights.html)
```
# Fundamental Geographic Data Structures
Three main structures are generally used to organise geographic data:
1. [Vector data structure]{.underline}: The vector data structures record geographic information using points, lines and polygons in a geographic table. These tables contain information about geographic objects. Columns store information about geographic objects, attributes or features, and rows represent individual geographic objects.
2. [Raster data structures]{.underline}: The raster data structures record geographic data in an uniform way over a space in the form of grids. It divides geographic surfaces up into cells of constant size. Rows and columns provide information about the geographic location of a grid.
3. [Spatial graphs]{.underline}: Spatial graphs store connections between objects through space. These connections may derive from geographical topology (e.g. contiguity), distance, or more sophisticated dimensions, such as interaction flows (e.g. human mobility, trade and information).
Vector data structures tend to dominate the social sciences are the interest is often in capturing discrete geographic units containing populations. Here therefore we focus on vector data structures.
## Vector data
To understand the structure of vector data, let's read a dataset (`Liverpool_OA.shp`) describing output areas within Liverpool in the United Kingdom. To read in the data, we use the `st_read()` from the package `sf`. `sf` supports geometry collections, which can contain multiple geometry types in a single object. `sf` provides the same functionality previously provided in three separate packages `sp`, `rgdal` and `rgeos` (Robin et al. 2021). `sf` can also be used in combination with `tidyverse`!
Reading the data set via `sf` returns its geographic metadata (i.e. `Geometry type`, `Dimension`, `Bounding box` and coordinate reference system information on the line beginning `Projected CRS`).
```{marginfigure}
For raster data, I would recommend using the package `terra`.
```
```{marginfigure}
If you are interested in learning more about mapping geographic data, I cannot recommend enough: Lovelace, R., Nowosad, J. and Muenchow, J., 2019. "*Geocomputation with R*". Chapman and Hall/CRC.
```
```{r}
oa_shp <- st_read("./data/Liverpool_OA.shp")
```
We read a `sf` data frame containing spatial and attribute columns. We can examine the content of the data frame by using the function `head()`. We called the first four columns. The last column in this example contains the geographic information i.e. `geometry`.
```{r}
class(oa_shp)
head(oa_shp[,1:4])
```
Each row represents an output area. Each output area has multiple attributes (i.e. columns): administrative areas codes and geometry, as well as information on the local population in these areas; however, this information is not displayed above (can you access it?).
The content of the geometry column gives `sf` objects their spatial powers. `oa_shp$geometry` is a 'list column' that contains all the coordinates of the output areas polygons. `sf` objects can be plotted quickly with the base R function `plot()`.
```{marginfigure}
For more advanced map making, use dedicated visualisation packages such as `tmap` or `ggplot2`.
```
```{r}
plot(oa_shp$geometry)
```
We can thematically colour any attributes in the spatial data frame based on a column by passing the name of that column to the plot function. We map the share of unemployed population. We can adjust the key or legend position (`key.pos`), plot axes (`axes`), length of the scale bar (`key.length`), thickness/width of the scale bar (`key.width`), method or number to break the data attribute (`breaks`), line width (`lwd`) and colour of polygon borders (`border`).
```{r}
plot(oa_shp["unemp"], key.pos = 4, axes = TRUE, key.width = lcm(1.3), key.length = 1., breaks = "jenks", lwd = 0.1, border = 'grey')
```
Various types of geometries (i.e. lines, points and polygons) exist. We can transform vector data into points by running:
```{r, warning=FALSE}
oa_cents = st_centroid(oa_shp)
head(oa_cents[,1:4])
```
And visualise the data by running:
```{r}
plot(st_geometry(oa_cents))
```