---
title: "README"
format: gfm
---
# Introduction
This repository holds scripts for producing predictions of the two-party-preferred (TPP) outcome at the electoral division level. It combines a poll-aggregation method similar to that of Jackman (2005) with multilevel regression and post-stratification (MRP).
The basic idea behind the former is that there is a latent voting intention, evolving dynamically from week to week, which opinion polls measure subject to a systematic bias specific to each polling firm. Mathematically, we express this as
```math
\begin{align*}
\text{poll}_i &\sim N(\pi_{t[i]} + b_{p[i]}, \sigma_{\text{poll}}^2) \\
\pi_t &\sim N(\pi_{t-1}, \sigma_\pi^2) \\
b_p &\sim N(0, \sigma_b^2)
\end{align*}
```
where $\pi_{t[i]}$ denotes the intention at the time corresponding to poll $i$, and $b_{p[i]}$ the bias of the pollster corresponding to poll $i$.
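To make the generative story concrete, here is a minimal R sketch that simulates data from this model. All sizes and scales below are invented for illustration; they are not the values used in this repository.
```r
set.seed(42)

# Hypothetical scales, in TPP percentage points
n_weeks    <- 52
sigma_pi   <- 0.3
sigma_b    <- 1
sigma_poll <- 1

# Latent weekly voting intention as a Gaussian random walk
pi_t <- numeric(n_weeks)
pi_t[1] <- 51
for (t in 2:n_weeks) pi_t[t] <- rnorm(1, pi_t[t - 1], sigma_pi)

# Pollster-specific biases ("house effects")
n_pollsters <- 5
b <- rnorm(n_pollsters, 0, sigma_b)

# Observed polls: latent intention plus house effect plus sampling noise
n_polls  <- 100
week     <- sample(n_weeks, n_polls, replace = TRUE)
pollster <- sample(n_pollsters, n_polls, replace = TRUE)
poll     <- rnorm(n_polls, pi_t[week] + b[pollster], sigma_poll)
```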
This method provides us with a way to estimate the national swing. We simply project the voting intention forward to election day and then take the difference between our estimate and the previous election result. Thus, for instance, if the previous national TPP result was 49 and our projection is 51, the swing is 2.
We can then take this national swing and use it to produce a naive estimate of the result for each electoral division: we take the previous TPP result in that division and then add on the swing. Thus, if the previous result in a given division is 45 and the swing is 2, our naive estimate is 47.
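In code, both of these steps are simple arithmetic (the numbers match the worked examples above):
```r
# National swing: projected intention on election day minus the previous result
previous_national <- 49
projection <- 51
swing <- projection - previous_national  # 2

# Naive division-level estimate: previous division result plus the national swing
previous_division <- 45
naive_estimate <- previous_division + swing  # 47
```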
We then pass this naive estimate through to a regression model, which we train on record-level TPP voting intention data. For simplicity we use a linear probability model,
```math
\begin{align*}
y_i &\sim \text{Bernoulli}(\phi_i) \\
\phi_i &\sim N(\eta_i, \sigma_p^2) \\
\eta_i &= \alpha_{\text{division}[i]} +
\beta_{\text{age group}[i]} +
\gamma_{\text{sex}[i]} +
\delta_{\text{education level}[i]}
\end{align*}
```
where $y_i$ is a binary variable representing TPP support for the ALP, $\alpha_{\text{division}[i]}$ represents our naive estimate for the division corresponding to record $i$, and the other terms represent secondary effects of demographic factors. For example, $\beta_{\text{age group}[i]}$ represents the effect on voting intention of the age group corresponding to record $i$.
We use normal priors on each of the effect terms,
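To make the indexing concrete, here is a toy R sketch of how $\eta_i$ is assembled from the effect terms. All sizes and values below are invented for illustration.
```r
set.seed(1)

# Hypothetical effect vectors: 3 divisions, 4 age groups, 2 sexes, 3 education levels
alpha <- rnorm(3, mean = 0.5, sd = 0.05)  # division intercepts, centred on naive estimates
beta  <- rnorm(4, 0, 0.02)                # age group effects
gamma <- rnorm(2, 0, 0.01)                # sex effects
delta <- rnorm(3, 0, 0.02)                # education effects

# Toy record-level data: each record carries an index into each effect vector
records <- data.frame(
  division  = sample(3, 10, replace = TRUE),
  age_group = sample(4, 10, replace = TRUE),
  sex       = sample(2, 10, replace = TRUE),
  education = sample(3, 10, replace = TRUE)
)

# The linear predictor is a sum of lookups into the effect vectors
eta <- alpha[records$division] + beta[records$age_group] +
  gamma[records$sex] + delta[records$education]
```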
```math
\begin{align*}
\alpha_d &\sim N(\tilde \phi_d, 5) \\
\beta_a &\sim N(0, 5) \\
\gamma_s &\sim N(0, 2) \\
\delta_e &\sim N(0, 5)
\end{align*}
```
where $\tilde \phi_d$ is the naive estimate for division $d$. These priors encode, roughly, that we aren't very confident in our naive division estimate, that we think the effect of sex is relatively small (a few points at most), but that the effects of age and education could be large. See Biddle and McAllister (2022) for some information that loosely justifies these choices.
We then use our trained model to estimate the probability of TPP support for the ALP in each cell of a table defined by cross-classifying the predictors (division, age group, sex, education level). Finally, we estimate the overall TPP support for the ALP in a division by taking a weighted mean across the cells of this table within each division, with weights given by the cell sizes.
Thus, for example, a single cell in this table might be `Adelaide-aged18to29-Female-bachelorDegree`. If there are 1200 people in this cell, as determined by Census data, then the weight for this cell is 1200. The effect of taking a weighted mean like this is that we compensate for certain imbalances in the record-level voting intention data. For example, we compensate for the fact that the sample includes fewer males than females.
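The poststratification step itself reduces to a weighted mean per division. Here is a minimal sketch, assuming a data frame of cells with fitted probabilities (`phi_hat`) and Census counts (`n_cell`); both column names, and all values, are hypothetical.
```r
library(dplyr)

# Toy poststratification table (values invented for illustration)
cells <- data.frame(
  division = c("Adelaide", "Adelaide", "Barton", "Barton"),
  phi_hat  = c(0.55, 0.48, 0.52, 0.60),  # modeled P(TPP support for ALP) per cell
  n_cell   = c(1200, 800, 950, 400)      # Census cell counts, used as weights
)

cells |>
  group_by(division) |>
  summarise(tpp_alp = weighted.mean(phi_hat, w = n_cell))
```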
# Results
The raw national-level polls look like
```{r echo=FALSE, fig.align='center'}
knitr::include_graphics("plots/polls.png")
```
Overlaying our modeled latent intention, we get
```{r echo=FALSE, fig.align='center'}
knitr::include_graphics("plots/tpp_walk.png")
```
As an auxiliary result, we also obtain estimates of the bias of each pollster,
```{r echo=FALSE, fig.align='center'}
knitr::include_graphics("plots/tpp_bias.png")
```
Our estimates for electoral division TPP support are,
```{r echo=FALSE, fig.align='center'}
knitr::include_graphics("plots/estimates.png")
```
Here is a closer look at the (alphabetically) first 25,
```{r echo=FALSE, fig.align='center'}
knitr::include_graphics("plots/estimates_page1.png")
```
See `plots/estimates_page*.png` for similar plots of the other electorates.
We can use the raw estimates shown above to frame predictions that are easier to interpret and to compare to the actual results. We use two methods. The first classes seats into one of 7 categories using the central estimate of TPP (ALP) support in each electorate,
```{r echo=FALSE}
my_flextable <- function(...) {
flextable::flextable(...) |>
flextable::bg(bg = "white", part = "all")
}
ps_pred1 <- readRDS("outputs/seat_predictionsv1.Rds")
table(ps_pred1$prediction, ps_pred1$result) |>
as.data.frame() |>
dplyr::select(prediction=Var1, result=Var2, n=Freq) |>
tidyr::pivot_wider(names_from = result, values_from = n, values_fill = 0) |>
my_flextable() |>
# green marks cells where the predicted direction matches the result; orange where it does not
flextable::bg(i=1:3, j = 2:5, "lightgreen") |>
flextable::bg(i=5:7, j = 6:9, "lightgreen") |>
flextable::bg(i=1:3, j = 6:9, "orange") |>
flextable::bg(i=5:7, j = 2:5, "orange")
```
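For illustration only, a classification like this could be produced with `cut()`; the breakpoints and labels below are hypothetical stand-ins, not the ones used in `predict.R`.
```r
# Hypothetical 7-way classification of the central TPP (ALP) estimate
classify_seat <- function(tpp) {
  cut(tpp,
      breaks = c(0, 44, 47, 49.5, 50.5, 53, 56, 100),
      labels = c("Safe LNP", "Likely LNP", "Lean LNP", "Too close to call",
                 "Lean ALP", "Likely ALP", "Safe ALP"))
}
classify_seat(c(43, 48, 50, 52, 57))
```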
The second uses the estimated probability that TPP (ALP) support in a given electorate is over 50%. It is also a little riskier, insofar as it is more willing than the first approach to call seats one way or the other,
```{r echo=FALSE}
ps_pred2 <- readRDS("outputs/seat_predictionsv2.Rds")
table(ps_pred2$prediction, ps_pred2$result) |>
as.data.frame() |>
dplyr::select(prediction=Var1, result=Var2, n=Freq) |>
tidyr::pivot_wider(names_from = result, values_from = n, values_fill = 0) |>
my_flextable() |>
# green marks cells where the predicted direction matches the result; orange where it does not
flextable::bg(i=1:5, j = 2:5, "lightgreen") |>
flextable::bg(i=7:11, j = 6:9, "lightgreen") |>
flextable::bg(i=1:5, j = 6:9, "orange") |>
flextable::bg(i=7:11, j = 2:5, "orange")
```
For details, see the `predict.R` script.
As can be seen, both approaches yield similar results. The more conservative approach fails catastrophically, predicting a win (loss) in a seat that was actually lost (won), in only two electorates; however, it declines to return a prediction for 14 seats. The riskier approach, by contrast, fails catastrophically in 3 electorates but declines to return a prediction for only 8 seats.
Finally we can obtain a prediction for which way the election as a whole will fall by simulating the election many times over and counting the proportion of times the ALP wins a majority. Doing this we get,
```{r echo=FALSE, ft.align='center'}
ps_sim <- readRDS("outputs/election_simulations.Rds")
ps_sim |>
dplyr::summarise(n = sum(nseats > 75), N = dplyr::n(), p = 100*n/N) |>
my_flextable()
```
Thus we see that of our 10,000 simulations, the ALP manages to achieve a majority (i.e. win >75 seats) in under 40% of them. We would, therefore, predict an ALP loss in 2019.
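For reference, the counting step can be sketched as follows. This is a simplified version that fixes each seat's win probability rather than drawing it from the posterior, and all names and values are hypothetical.
```r
set.seed(7)
n_sims  <- 10000
n_seats <- 151

# Hypothetical per-seat probabilities that the ALP wins
p_seat <- runif(n_seats, 0.2, 0.8)

# Each simulation: draw every seat as a Bernoulli trial and count ALP wins
nseats <- replicate(n_sims, sum(rbinom(n_seats, 1, p_seat)))
mean(nseats > 75)  # proportion of simulations with an ALP majority
```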
Given that this is exactly the opposite of what one might guess from the raw polls alone, and that 2019 was a notorious miss for election forecasters, with many analysts predicting an ALP win, we can feel very satisfied with this result.
# Diagnostics
The regression component of the model fit fairly well, with the only issue being a tiny number (29/10000) of divergences. Having reviewed the pair plots and the trajectories that led to the divergences, I believe these are ignorable. They are probably a result of using a linear probability model, which, strictly speaking, is not correct.
The dynamic poll aggregation was somewhat less clean, with a warning raised about low effective sample sizes. I've added an issue on the GitHub repo about this and will return to it at some stage.
# Data
For the poll-of-polls step, our data is borrowed from the excellent [Australian Electoral Forecasts project](https://github.com/d-j-hirst/aus-polling-analyser).
For the MRP step, our poststratification table is derived from the ABS Census and the polling data is taken from a Life in Australia (LinA) panel conducted by the Social Research Centre. We take pre-processed versions of each from [Alexander et al.](https://github.com/RohanAlexander/ForecastingMultiDistrictElections). We note that the LinA panel was conducted between 08/04/2019 and 26/04/2019. It is not included in the polling data that we used to produce the poll-of-polls.