---
title: "Assignment 4"
author: "Wenjie Tu"
date: "Spring Semester 2022"
output:
  html_document:
    toc: true
    toc_depth: 5
    toc_float: true
subtitle: Bayesian Data Analysis and Models of Behavior
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
Sys.setenv(LANG = "en")  # use English messages
rm(list = ls())          # clear the workspace
```
The goal of this assignment is to verify the implicit assumption in Pedersen et al. (2017) that the DDM yields more precise estimates of the learning rate parameters than the softmax choice rule (i.e., logistic regression) typically used in studies of reward learning.
$~$
### Part 1
$Q_i(t)$ is obtained by adding the learning-rate-weighted reward prediction error (PE) to the reward expectation from the previous trial:
$$
Q_i(t)=Q_i(t-1)+\eta
\underbrace{[\text{Reward}_i(t-1)-Q_i(t-1)]}_\text{prediction error (PE)}
$$
* $Q_i(t)$: the reward value expectation for the chosen option $i$ on trial $t$.
* $\eta$: the learning rate parameter.
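As a quick illustration, here is a minimal R sketch of this update rule on a made-up reward sequence (the learning rate and rewards below are illustrative values, not estimates from the data):

```{r}
# Minimal sketch of the Q-learning update for one option (hypothetical values)
eta_demo <- 0.3                     # assumed learning rate
rewards  <- c(1, 0, 1, 1, 0)        # hypothetical binary rewards
Q <- numeric(length(rewards) + 1)   # Q[1] = 0 is the initial expectation
for (t in seq_along(rewards)) {
  pe <- rewards[t] - Q[t]           # prediction error
  Q[t + 1] <- Q[t] + eta_demo * pe  # learning-rate-weighted update
}
round(Q, 3)
```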
<!-- The process of choosing between options can be described by the softmax choice rule: -->
<!-- $$ -->
<!-- p_i(t)=\frac{\exp(\beta(t)\times V_i(t))} -->
<!-- {\sum_{j=1}^{n}\exp(\beta(t)\times V_j(t))} -->
<!-- $$ -->
<!-- * $p_i(t)$: the probability that a decision maker will choose one option $i$ among all options $j$. -->
<!-- * $\beta$: the parameter that governs the sensitivity to rewards and the exploration-exploitation trade-off. -->
<!-- The DDM calculates the likelihood of the reaction time (RT) as a choice $x$ with the Wiener first-passage time (WFPT) distribution: -->
<!-- $$ -->
<!-- RT(x)\sim \text{WFPT}[a, T_\text{er},z,\nu(t)] -->
<!-- $$ -->
<!-- * The WFPT returns the probability that $x$ is chosen with the observed RT. -->
<!-- * $T_\text{er}$: the nondecision time. -->
<!-- * $z$: starting point. -->
<!-- * $a$: boundary separation (trial-independent free parameters). -->
<!-- * $\nu(t)$: drift rate. -->
The drift rate $\nu(t)$ varies from trial to trial as a function of the difference in the expected rewards, multiplied by a scaling parameter $m$, which can capture differences in the ability to use knowledge of the reward probabilities:
$$
\nu(t)=[Q_\text{upper}(t)-Q_\text{lower}(t)]\times m
$$
* $Q_\text{upper}(t)$ and $Q_\text{lower}(t)$ represent the reward expectations for the two response options.
The process of choosing between options can be described by the sigmoid (inverse-logit, `ilogit`) function:
$$
p_i(t)=\frac{1}{1+\exp\{-(\beta(t)+\nu(t))\}}
$$
* $p_i(t)$: the probability that the decision maker chooses the upper-boundary option $i$ on trial $t$.
* $\beta$: the parameter that governs the sensitivity to rewards and the exploration-exploitation trade-off (held constant across trials in the implementation below).
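A minimal numeric sketch of this choice rule, using made-up Q-values and parameters rather than estimates from the data:

```{r}
# Sketch: choice probability from the Q-value difference (hypothetical values)
q_upper <- 0.7; q_lower <- 0.4          # assumed reward expectations
m_demo <- 3                             # assumed drift rate scaling parameter
beta_demo <- 0.1                        # assumed sensitivity parameter
v_demo <- (q_upper - q_lower) * m_demo  # drift rate
1 / (1 + exp(-(beta_demo + v_demo)))    # equals plogis(beta_demo + v_demo)
```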
#### Summary table
```{r, warning=FALSE, message=FALSE}
library(runjags)
library(coda)
library(rjags)
library(ggplot2)
library(HDInterval)
library(knitr)
```
```{r}
# Read in data
dat <- read.csv("./data/data.csv")
# Look at the structure of the data
str(dat)
```
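For orientation, the columns this script relies on (as inferred from the JAGS data list assembled below) are: `ord_sbj` (subject index), `med` (medication condition, mapping onto the two groups), `iter` (trial number within subject), `rt` (response time, whose sign encodes the chosen option), `choice`/`nonchoice` (chosen and unchosen option), `value` (obtained reward), `pair` (stimulus pair, 1 to 3), `valence` (indexes the negative vs. positive learning rate), and `not_cond1`/`not_cond2` (the stimulus pairs not presented on the current trial).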
```{r}
modelString <- "model {
  # Priors for the group-level learning rate parameter eta
  eta_g_mu ~ dunif(0, 1)
  eta_g_sig ~ dunif(0.001, 5)
  # Priors for the drift rate scaling parameter m
  m_g_mu ~ dunif(1, 19)
  m_g_sig ~ dunif(0.001, 5)
  # Priors for the sensitivity parameter beta
  beta_g_mu ~ dunif(0, 1)
  beta_g_sig ~ dunif(0.001, 5)

  # Transform sigmas into precisions
  eta_g_tau <- 1 / eta_g_sig^2
  m_g_tau <- 1 / m_g_sig^2
  beta_g_tau <- 1 / beta_g_sig^2

  # Group parameters
  for (g in 1:G) {
    # Dual learning rates (v = 1: negative, v = 2: positive)
    for (v in 1:2) {
      eta_g[g, v] ~ dnorm(eta_g_mu, eta_g_tau)T(0, 1)  # truncated to [0, 1]
      eta_sig[g, v] ~ dunif(0.001, 5)
      eta_tau[g, v] <- 1 / eta_sig[g, v]^2
    }
    m_g[g] ~ dnorm(m_g_mu, m_g_tau)
    m_sig[g] ~ dunif(0.001, 5)
    m_tau[g] <- 1 / m_sig[g]^2
    beta_g[g] ~ dnorm(beta_g_mu, beta_g_tau)
    beta_sig[g] ~ dunif(0.001, 5)
    beta_tau[g] <- 1 / beta_sig[g]^2
  }

  # Subject parameters
  for (g in 1:G) {
    for (s in 1:S) {
      for (v in 1:2) {
        eta[g, s, v] ~ dnorm(eta_g[g, v], eta_tau[g, v])T(0, 1)  # truncated to [0, 1]
      }
      beta[g, s] ~ dnorm(beta_g[g], beta_tau[g])
      m[g, s] ~ dnorm(m_g[g], m_tau[g])
    }
  }

  # Loop over trials for each group and subject
  for (g in 1:G) {
    for (s in 1:S) {
      # Assign starting values to q[group, trial, stimulus pair, option].
      # 'first' is a two-dimensional array identifying the first trial for each subject in each group.
      for (stim_pair in 1:3) {
        q[g, first[s, g], stim_pair, 1] <- 0
        q[g, first[s, g], stim_pair, 2] <- 0
      }
      # Run through trials.
      # 'last' is a two-dimensional array identifying the last trial for each subject in each group.
      for (trial in (first[s, g]):(last[s, g]-1)) {
        # Calculate the drift rate as the q-value difference multiplied by m
        v[trial] <- ( q[g, trial, pair[trial], 1] - q[g, trial, pair[trial], 2] ) * m[g, s]
        pr[trial] <- ilogit(beta[g, s] + v[trial])
        rt_binary[trial] ~ dbern(pr[trial])
        rt_predict[trial] ~ dbern(pr[trial])
        # Update q-values for the next trial.
        # 'pair' identifies the stimulus pair in the current trial; 'not1' and 'not2' identify the other stimulus pairs.
        q[g, trial+1, pair[trial], choice[trial]] <- q[g, trial, pair[trial], choice[trial]] +
          eta[g, s, valence[trial]] * (value[trial] - q[g, trial, pair[trial], choice[trial]])
        q[g, trial+1, pair[trial], nonchoice[trial]] <- q[g, trial, pair[trial], nonchoice[trial]]
        q[g, trial+1, not1[trial], 1] <- q[g, trial, not1[trial], 1]
        q[g, trial+1, not1[trial], 2] <- q[g, trial, not1[trial], 2]
        q[g, trial+1, not2[trial], 1] <- q[g, trial, not2[trial], 1]
        q[g, trial+1, not2[trial], 2] <- q[g, trial, not2[trial], 2]
      }
      # Q-values are not updated in the last trial
      for (trial in last[s, g]) {
        v[trial] <- (q[g, trial, pair[trial], 1] - q[g, trial, pair[trial], 2]) * m[g, s]
        # Note: unlike the paper (where beta varies with t via a power function),
        # beta is constant across trials in this implementation
        pr[trial] <- ilogit(beta[g, s] + v[trial])
        rt_binary[trial] ~ dbern(pr[trial])
        rt_predict[trial] ~ dbern(pr[trial])
      }
    }
  }
}"
writeLines(modelString, con="./models/logistic.txt")
```
```{r}
# Code the binary choice by the sign of the RT (rt >= 0 -> 1, else 0)
dat$rt_binary <- ifelse(dat$rt >= 0, 1, 0)
niters <- 120  # max trial number for each subject (true max = 120)
dat <- dat[dat$iter <= niters, ]
nsubs <- 17    # max number of subjects (true max = 17)
dat <- dat[dat$ord_sbj <= nsubs, ]
dat$rownum <- 1:nrow(dat)  # global row index used to locate trials
# Row index of each subject's first/last trial in each medication condition
first <- aggregate(rownum ~ ord_sbj + med, data = dat, min)
first <- matrix(first$rownum, ncol = 2, nrow = nsubs)
last <- aggregate(rownum ~ ord_sbj + med, data = dat, max)
last <- matrix(last$rownum, ncol = 2, nrow = nsubs)
dat.jags <- dump.format(list(choice = dat$choice,
                             nonchoice = dat$nonchoice,
                             value = dat$value,
                             rt_binary = dat$rt_binary,
                             pair = dat$pair,
                             valence = dat$valence,
                             S = length(unique(dat$ord_sbj)),
                             first = as.matrix(first),
                             last = as.matrix(last),
                             not1 = dat$not_cond1,
                             not2 = dat$not_cond2,
                             G = 2))
inits1 <- dump.format(list(.RNG.name="base::Super-Duper", .RNG.seed=99999 ))
inits2 <- dump.format(list(.RNG.name="base::Wichmann-Hill", .RNG.seed=1234 ))
inits3 <- dump.format(list(.RNG.name="base::Mersenne-Twister", .RNG.seed=6666 ))
n.burnin <- 40000
n.sample <- 2000
n.thin <- 5
# Tell JAGS which latent variables to monitor
params <- c("eta_g", "eta", "m_g", "m", "beta_g", "beta")
```
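A quick optional sanity check on the `first`/`last` index matrices; the last line assumes every subject contributes exactly `niters` contiguous trials in each medication condition:

```{r}
# Each matrix should be nsubs x 2 (subjects x medication conditions)
dim(first); dim(last)
# TRUE if every subject/condition block spans exactly niters rows (assumption)
all(last - first + 1 == niters)
```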
```{r}
# Run the function that fits the models using JAGS
start.time <- Sys.time()
results.log <- run.jags(model = "./models/logistic.txt",
                        monitor = params,
                        data = dat.jags,
                        inits = c(inits1, inits2, inits3),
                        plots = FALSE,
                        n.chains = 3,
                        burnin = n.burnin,
                        sample = n.sample,
                        thin = n.thin,
                        method = "parallel")
end.time <- Sys.time()
```
```{r}
end.time - start.time
```
```{r}
kable(summary(results.log), digits=6, caption="Summary results for the logistic regression")
```
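The `run.jags` summary above already reports the potential scale reduction factor (psrf). As an optional cross-check, the `coda` package loaded earlier can compute the Gelman-Rubin diagnostic for the group-level nodes; the `grep` pattern below assumes the group-level monitors are the only ones whose names contain `"_g"`, as is the case here:

```{r}
# Gelman-Rubin diagnostic for the group-level parameters only
mcmc.log <- as.mcmc.list(results.log)
grp <- grep("_g", coda::varnames(mcmc.log), value = TRUE)
gelman.diag(mcmc.log[, grp], multivariate = FALSE)
```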
#### Density plot for learning rates
```{r}
chains.log <- data.frame(rbind(results.log$mcmc[[1]], results.log$mcmc[[2]], results.log$mcmc[[3]]))
# Select the columns for the group-level learning rate parameters:
# eta_g[1,1], eta_g[2,1], eta_g[1,2], eta_g[2,2]
# (group 1 = OFF, group 2 = ON; valence 1 = negative, valence 2 = positive)
chains.log <- chains.log[, 1:4]
column.names <- c("OFF.negative", "ON.negative", "OFF.positive", "ON.positive")
colnames(chains.log) <- column.names
```
```{r}
d.plot <- data.frame(
  eta_g = c(chains.log$OFF.negative,
            chains.log$ON.negative,
            chains.log$OFF.positive,
            chains.log$ON.positive),
  Group = rep(column.names, each=nrow(chains.log))
)
```
```{r, out.width='100%'}
ggplot(d.plot, aes(x=eta_g, y=..density.., color=Group, fill=Group)) +
geom_density(alpha=0.2) +
labs(title="Density plot of four group-level learning rate parameters", x=expression(eta[g]), y="Density") +
theme_minimal()
```
#### Density plot for differences
```{r, fig.show='hold', out.width='50%'}
# Histogram
ggplot(chains.log, aes(x=ON.positive - OFF.positive, y=..density..)) +
geom_histogram(color=3, fill=3, alpha=0.2, bins=30) +
geom_vline(xintercept=hdi(chains.log$ON.positive - chains.log$OFF.positive), color=2, lty="longdash") +
labs(title="Positive learning rate (ON-OFF)", x=expression(eta^"+" ~ "(ON-OFF)"), y="Density") +
theme_minimal()
# Density plot
ggplot(chains.log, aes(x=ON.positive - OFF.positive, y=..density..)) +
geom_density(color=3, fill=3, alpha=0.2) +
geom_vline(xintercept=hdi(chains.log$ON.positive - chains.log$OFF.positive), color=2, lty="longdash") +
labs(title="Positive learning rate (ON-OFF)", x=expression(eta^"+" ~ "(ON-OFF)"), y="Density") +
theme_minimal()
```
```{r, fig.show='hold', out.width='50%'}
# Histogram
ggplot(chains.log, aes(x=ON.negative - OFF.negative, y=..density..)) +
geom_histogram(color=4, fill=4, alpha=0.2, bins=30) +
geom_vline(xintercept=hdi(chains.log$ON.negative - chains.log$OFF.negative), color=2, lty="longdash") +
labs(title="Negative learning rate (ON-OFF)", x=expression(eta^"-" ~ "(ON-OFF)"), y="Density") +
theme_minimal()
# Density plot
ggplot(chains.log, aes(x=ON.negative - OFF.negative, y=..density..)) +
geom_density(color=4, fill=4, alpha=0.2) +
geom_vline(xintercept=hdi(chains.log$ON.negative - chains.log$OFF.negative), color=2, lty="longdash") +
labs(title="Negative learning rate (ON-OFF)", x=expression(eta^"-" ~ "(ON-OFF)"), y="Density") +
theme_minimal()
```
#### Interpretation
The posterior distribution of the ON vs. OFF medication difference in the positive learning rate for subjects with attention-deficit/hyperactivity disorder (ADHD) is consistent with the one shown in the paper: both are left-skewed. The posterior distribution of the ON vs. OFF difference in the negative learning rate, however, does not match the paper well in terms of skewness. This is probably due to randomness in the sampler: because we fix only the random number generators and seeds but do not supply initial values for the model parameters, JAGS initializes the chains automatically, so the results are not exactly reproducible. Furthermore, the 95% HDI for the ON vs. OFF difference in the negative learning rate is wider than the one reported in the paper.
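For reference, the 95% HDIs marked by the dashed lines above can also be printed directly:

```{r}
# 95% HDIs of the ON-OFF differences (HDInterval::hdi, credMass = 0.95)
hdi(chains.log$ON.positive - chains.log$OFF.positive)  # positive learning rate
hdi(chains.log$ON.negative - chains.log$OFF.negative)  # negative learning rate
```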
$~$
### Part 2
As a parameter-recovery check, we first simulate one posterior predictive dataset of binary choices from the fitted model (via the monitored `rt_predict` node), then refit the same logistic model to the simulated choices and compare the recovered group-level learning rates with the original estimates.
#### Summary table
```{r}
# Draw a single posterior predictive sample of the binary choices
model.sim <- extend.jags(results.log,
                         drop.monitor = results.log$monitor,
                         add.monitor = c("rt_predict"),
                         sample = 1,
                         adapt = 0,
                         burnin = 0,
                         summarise = FALSE,
                         method = "parallel")
```
```{r}
# Re-use the original predictors but replace the observed choices with the
# simulated choices drawn above
newdat.jags <- dump.format(list(choice = dat$choice,
                                nonchoice = dat$nonchoice,
                                value = dat$value,
                                rt_binary = as.integer(model.sim$mcmc[[1]]),
                                pair = dat$pair,
                                valence = dat$valence,
                                S = length(unique(dat$ord_sbj)),
                                first = as.matrix(first),
                                last = as.matrix(last),
                                not1 = dat$not_cond1,
                                not2 = dat$not_cond2,
                                G = 2))
n.burnin <- 40000
n.sample <- 2000
n.thin <- 5
# Tell JAGS which latent variables to monitor
params <- c("eta_g", "eta", "m_g", "m", "beta_g", "beta")
```
```{r}
start.time <- Sys.time()
results.rec <- run.jags(model = "./models/logistic.txt",
                        monitor = params,
                        data = newdat.jags,
                        inits = c(inits1, inits2, inits3),
                        plots = FALSE,
                        n.chains = 3,
                        burnin = n.burnin,
                        sample = n.sample,
                        thin = n.thin,
                        method = "parallel")
end.time <- Sys.time()
```
```{r}
end.time - start.time
```
```{r}
kable(summary(results.rec), digits=6, caption="Summary results for the logistic regression (simulated data)")
```
#### Boxplots
```{r}
chains.rec <- data.frame(rbind(results.rec$mcmc[[1]], results.rec$mcmc[[2]], results.rec$mcmc[[3]]))
# Select the columns for the group-level learning rate parameters
# (same ordering as in Part 1)
chains.rec <- chains.rec[, 1:4]
column.names <- c("OFF.negative", "ON.negative", "OFF.positive", "ON.positive")
colnames(chains.rec) <- column.names
```
```{r}
d.recovery <- data.frame(rbind(chains.log, chains.rec),
                         Parameters = rep(c("Logistic", "Recovered"), each=nrow(chains.log)))
```
```{r, out.width='100%'}
ggplot(d.recovery, aes(x=Parameters, y=OFF.negative, color=Parameters)) + geom_boxplot() +
labs(title="Negative learning rate for the OFF medication group", x=expression(eta^"-"~"(OFF)")) + theme_minimal()
```
```{r, out.width='100%'}
ggplot(d.recovery, aes(x=Parameters, y=ON.negative, color=Parameters)) + geom_boxplot() +
labs(title="Negative learning rate for the ON medication group", x=expression(eta^"-"~"(ON)")) + theme_minimal()
```
```{r, out.width='100%'}
ggplot(d.recovery, aes(x=Parameters, y=OFF.positive, color=Parameters)) + geom_boxplot() +
labs(title="Positive learning rate for the OFF medication group", x=expression(eta^"+"~"(OFF)")) + theme_minimal()
```
```{r, out.width='100%'}
ggplot(d.recovery, aes(x=Parameters, y=ON.positive, color=Parameters)) + geom_boxplot() +
labs(title="Positive learning rate for the ON medication group", x=expression(eta^"+"~"(ON)")) + theme_minimal()
```
#### Interpretation
We present four boxplots of the posterior chains for the four group-level learning rates (ON-positive, ON-negative, OFF-positive, OFF-negative), comparing the logistic regression results from Part 1 with the parameter recovery results. There are clear differences between the recovered and the original estimates for all four group-level learning rates; the discrepancy is largest for the positive learning rate in the ON medication group. The poor recovery under the softmax/logistic choice rule suggests that the decision-making process is better characterized by the DDM, consistent with the assumption that the DDM outperforms the reward learning model with a softmax choice rule in terms of learning rate estimation.
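One simple way to quantify the visual impression from the boxplots is to tabulate the posterior means side by side:

```{r}
# Posterior means of the group-level learning rates: original fit vs. recovery
kable(data.frame(parameter = column.names,
                 logistic  = colMeans(chains.log),
                 recovered = colMeans(chains.rec)),
      digits = 4, caption = "Group-level learning rates: original vs. recovered")
```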