20211019---glo_2102.Rmd

---
title: "glo_2102"
output:
  html_document:
    df_print: paged
---

Two groups of children, Grade 1 and Grade 6, completed simple sums with operands from 0 to 9. The aim is to examine how the problem-size effect in tie vs. non-tie problems changes between the two grades in this cross-sectional design.

Predictions should be similar to Bagnoud et al. (2021).

***Variables:***

**IV**: group (Grade 1 vs. Grade 6)
  - Grade 1: 31 kids (32 evaluated, number 6 eliminated because he did not finish the task); 
  
  Number 28: edat2 file missing; eliminated as well?
  
  
  - Grade 6: 28 kids (29 evaluated, number 14 eliminated due to technical issues)
  
**DV**: problem size in tie and non-tie problems

***We can divide the sums as follows:*** 

***1. Bagnoud et al. (2021):***

**- rule-based problems**:

- 0-problems (n+0/0+n) (an extra category in our data; Bagnoud et al. (2021) did not consider this type of problem)
- 1-problems (n+1/1+n)

**- tie problems**:  

- small tie problems (2+2, 3+3, 4+4)
- large tie problems (5+5, 6+6, 7+7, 8+8, 9+9)

**- non-tie problems**:  

- small problems: both operands smaller than or equal to 4 (i.e., 2+3, 2+4, 3+2, 3+4, 4+2, 4+3) 
- medium problems: at least one operand greater than 4 and sum smaller than or equal to 10 (i.e., 2+5, 2+6, 2+7, 2+8, 3+5, 3+6, 3+7, 4+5, 4+6, 5+2, 5+3, 5+4, 6+2, 6+3, 6+4, 7+2, 7+3, 8+2)
- large problems: sums greater than 10
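The category rules above can be checked by enumerating all 100 operand pairs. The sketch below is a minimal base-R illustration, not the pre-processing used later in this notebook. It assumes that 1+1 counts as a tie rather than a 1-problem (as in the TO-DO list further down), and the `problems` data frame and the `category` labels are names invented for this example.

```r
# Enumerate every operand pair from 0+0 to 9+9 (hypothetical helper table)
problems <- expand.grid(op.1 = 0:9, op.2 = 0:9)
problems$sum <- problems$op.1 + problems$op.2

# Apply the rules in order; earlier rules take precedence over later ones
problems$category <- with(problems,
  ifelse(op.1 == 0 | op.2 == 0, "0-problem",
  ifelse(op.1 == op.2, "tie",                      # assumption: 1+1 is a tie
  ifelse(op.1 == 1 | op.2 == 1, "1-problem",
  ifelse(op.1 <= 4 & op.2 <= 4, "small non-tie",
  ifelse(sum <= 10, "medium non-tie", "large non-tie"))))))

# Number of distinct problems per category
counts <- table(problems$category)
print(counts)
```

Under these rules there are 19 0-problems, 16 1-problems, 9 ties, and 6, 18, and 32 small, medium, and large non-tie problems, which add up to the 100 items.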
  
***2. Uittenhove et al. (2016):***  

**- small problems**

**- large problems**

**Control variables**: 

- WM (direct and indirect)
- Raven matrices

**Possible comparisons**:

- tie vs. non-tie problems (grouped)
- tie vs. non-tie problems based on problem size (2+2 vs. 3+1; 4+4 = 8 vs. 5+3 = 8; 5+5 vs. 7+3; 6+6 vs. 7+5; 7+7 vs. 8+6; 8+8 vs. 9+7; 9+9 vs. ?)
- see the problem size within each tie and non-tie group
- Bagnoud et al. do not compare the size effect between these; they rely only on the plots. Could this be examined with Generalized Additive Models?
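The size-matched comparison can be made explicit by listing, for each tie, the non-tie problems with the same sum. This is a minimal sketch under one assumption that is not stated in the notes: 0-problems and 1-problems are excluded as rule-based, so only operands from 2 to 9 are considered. The names `non_tie` and `matches` are invented for the example.

```r
# All non-tie problems with both operands between 2 and 9
pairs <- expand.grid(op.1 = 2:9, op.2 = 2:9)
non_tie <- pairs[pairs$op.1 < pairs$op.2, ]   # keep one operand order per pair
non_tie$sum <- non_tie$op.1 + non_tie$op.2

# For each tie n+n (n = 2..9), list the non-tie problems with the same sum
matches <- lapply(2:9, function(n) {
  m <- non_tie[non_tie$sum == 2 * n, ]
  sprintf("%d+%d", m$op.1, m$op.2)            # character(0) when no match exists
})
names(matches) <- sprintf("%d+%d", 2:9, 2:9)
print(matches)
```

With single-digit operands, 9+9 has no non-tie counterpart, and 2+2 has one only if 1-problems such as 3+1 are admitted.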

Could ties be solved more quickly because they are also practiced as multiplications (2x2, 2x3, 2x4, 2x5, 2x6, 2x7, 2x8, 2x9)? Has anyone considered this?

Title: **Developmental changes in size effect for simple addition problems in first graders and sixth graders**

**Abstract**

For the first time/We replicated recent findings

The size effect

**Introduction**

- retrieval
- procedures (Thevenot, Uittenhove)


**Predictions**

sum = 10 special case
sum > 10 qualitatively different?
rule 9+n may be used more by Grade 6 children in comparison to Grade 1 children

0-problems and 1-problems are special category sums (rule-based) and will be evaluated separately: no size effect should be observed
- I think it is important to correct for the voice key


For ties, a substantial size effect should be observed at the beginning, but not at the end of acquisition. ~ At first, kids use counting, but they end up using more efficient strategies, such as direct retrieval
- 1. but could still present interference effect, however, this will be smaller in magnitude because ties belong to smaller memory network (in other words, weaker interference in comparison to non-tie problems)
  - Campbell & Oliphant (1992)
  - Graham & Campbell (1992)
- 2. or, more efficient memory access
  - Ashcraft & Battaglia (1978)
  - Campbell & Gunter (2002)
  - LeFevre et al. (2004)


For non-tie problems, 

The size effect may not be linear but quartic: 
- 1st quadratic trend from sum 2 to 10; steep increase from 5 - 6 - 7 followed by a plateau from 7 - 8 - 9 and then 10 resolved faster because it is a special category
- 2nd quadratic trend from 10 to 17, because non-tie sums 16 and 17 are carried out by adding to 9 - adding to 9 may be more rule based? (9 + n ~ 10 + n-1 )
- we can also test this 9+n = x vs. y + z = x
- if sum to 10 eliminated  and 9+n eliminated, the trend should be linear
- there will be group differences (Grade 1 slower than Grade 6)
- not sure about the quartic trend in Grade 1, may not apply the 9+n rule - only cubic trend?
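One possible way to test the predicted shape is to compare nested polynomial models of RT on the sum. The sketch below uses simulated data only (the `toy` data frame and all its parameters are invented for illustration) and ordinary least squares rather than the mixed models a real analysis of these data would need.

```r
set.seed(1)

# Simulated (not real) RTs: 20 observations for each sum from 5 to 17
toy <- data.frame(sum = rep(5:17, each = 20))
toy$RT <- 1000 + 80 * toy$sum + rnorm(nrow(toy), sd = 150)

# Nested orthogonal-polynomial models: linear, quadratic, cubic, quartic
fits <- lapply(1:4, function(k) lm(RT ~ poly(sum, k), data = toy))

# Sequential F-tests: does each additional degree improve the fit?
trend_tests <- anova(fits[[1]], fits[[2]], fits[[3]], fits[[4]])
print(trend_tests)
```

Fitting the same sequence separately per grade, or adding a group interaction, would address the question of whether Grade 1 shows only a cubic trend.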

Tie and non-tie problems will follow the same trend in Grade 1 children (size effect)

Tie and non-tie problems will follow different trends in Grade 6 children: a size effect in non-ties and no size effect in ties


**Method**

In total, 61 children from Grade 1 and Grade 6 were involved in this study. The data of one first grader were discarded because he did not complete the task, and the data of one sixth grader were also discarded because of technical issues. Therefore, the cross-sectional analysis involved 31 first graders (12 female; mean age = 6.19 years, SD = 0.40 years) and 28 sixth graders (15 female; mean age = 11.52 years, SD = 0.63 years).

We tested the children in a Spanish public (?) school.

The study was approved by the ethics committee of the University of Granada (CEIH: ) and parental consent was obtained before starting the experiment.

- corrected vision? native language?

**Material and procedure**

Children were tested individually in a single session during which they completed an arithmetic task solving simple additions, and two control tasks. The control tasks were the direct and inverse digit span (citation) to control for working memory capacity, and Raven matrices (citation) to control for fluid intelligence. The arithmetic task involved operands from 0 to 9. This task was designed in and presented by the E-Prime software (citation).
  
**Data Analysis**

**Results**

**Discussion**

- small summary

- finding 1

- finding 2

- finding 3

- finding 4

Install packages, uncomment to run
```{r}
# install.packages("JuliaCall")
# install.packages("data.table")
# install.packages("formattable")
```


Load packages
```{r}
library(tidyverse)
library(rethinking)
library(knitr)
library(kableExtra)
library(lme4)
library(gridExtra)
library(ggeffects)
library(JuliaCall)

library(data.table)
library(formattable)


# julia <- julia_setup(JULIA_HOME = "/Userd/Filip/Applications") # does not work

sessionInfo()

getwd()
setwd("/Users/Filip/Desktop/PhD 2020/__20210227 - paper Gloria niños/glo_2102/glo_2102")
```


TO-DO:
1. participant numbers
2. 1 + 1 as tie, not sum_n1
3. WM measures
4. Raven measures

run the analysis

run the analysis again and correct for naming (as in Uittenhove; problem: the paper is not at all clear about how this is done)


Load data
```{r}
nam.p <- read_csv2("primaria naming.csv", skip = 1) # naming data, Grade 1 (primaria)
nam.s <- read_csv2("sexto naming.csv", skip = 1) # naming data, Grade 6 (sexto)
d.p <- read_csv2("primaria sumas.csv", skip = 1) # addition data, Grade 1 (primaria)
d.s <- read_csv2("sexto sumas.csv", skip = 1) # addition data, Grade 6 (sexto)

err_primero <- read_csv2("primaria_err_row_id.csv")
err_sexto <- read_csv2("sexto_err_row_id.csv")

```


Pre-processing

Primaria
```{r}
d.p <- read_csv2("primaria sumas.csv", skip = 1) #data primaria
#colnames(d.p)

d.p <- d.p %>% 
  filter(!is.na(Bloques)) %>% # keep only rows that belong to a block
  dplyr::select(subject = Subject,
                #age = Edad,
                group = Group,
                #handedness = PreferenciaManual,
                #sex = Sexo,
                #block = ListBloques,
                #subtrial = SubTrial, # trial number within each block
                item = "code[SubTrial]",
                #trial = Listensayos.Sample,
                op.1 = "ope1[SubTrial]",
                op.2 = "ope2[SubTrial]",
                sum = resultados,
                ACC = "suma.ACC[SubTrial]",
                RT = "suma.RT[SubTrial]"
                ) %>% 
  arrange(subject) %>% 
  mutate(row_id = 1:n(),
         exp = "primaria",
         tie = if_else(op.1 == op.2 & op.1 != 0, 1, 0),
         tie_small = if_else(tie == 1 & sum <= 8, 1, 0),
         tie_large = if_else(tie == 1 & sum > 8, 1, 0),
         tie_degree = case_when(tie_small == 1 ~ 1,
                                tie_large == 1 ~ 2,
                                TRUE ~ 0),
         sum_n0 = if_else(op.1 == 0 | op.2 == 0, 1,0),
         sum_n1 = if_else(tie != 1 & sum_n0 != 1 & (op.1 == 1 | op.2 == 1), 1, 0), # 1+1 counts as a tie; 0+1 and 1+0 count as sum_n0
         sum_small = if_else(tie != 1 & sum_n0 != 1 & sum_n1 != 1 & op.1 <= 4 & op.2 <= 4, 1, 0), # the small vs. medium split may not make sense
         sum_medium = if_else(tie != 1 & sum_n0 != 1 & sum_n1 != 1 & sum_small != 1 & sum <= 10, 1, 0),
         sum_large = if_else(tie != 1 & sum_n0 != 1 & sum_n1 != 1 & sum_small != 1 & sum_medium != 1, 1 , 0),
         sum_degree = case_when(sum_small == 1 ~ 1,
                                sum_medium == 1 ~ 2,
                                sum_large == 1 ~ 3,
                                TRUE ~ 0)
         ) %>% 
  relocate(exp, .before = subject) %>% 
  relocate(row_id, .before = exp)

err_primero <- read_csv2("primaria_err_row_id.csv")

d.p$ACC <- err_primero$ACC
#sum(d.p$ACC) #651 errors

d.p$subject <- as.factor(d.p$subject) #6 and 28 eliminated
levels(d.p$subject)
levels(d.p$subject) <- sprintf("S%03d", 1:30) # S001 ... S030
levels(d.p$subject)

# check:
# num_items *3 (blocks) *30 (participants)

# sum(d.p$tie) #810
# 9*3*30
# sum(d.p$tie_small) #360
# 4*3*30
# sum(d.p$tie_large) #450
# 5*3*30
# sum(d.p$sum_n0) #1710
# 19*3*30
# sum(d.p$sum_n1) #1440
# 16*3*30
# sum(d.p$sum_small) #540
# 6*3*30
# sum(d.p$sum_medium) #1620
# 18*3*30
# sum(d.p$sum_large) #2880
# 32*3*30

#9000 obs = 30 subjects, need to add 28

#TO-DO: recode participants
```
Sexto
```{r}
d.s <- read_csv2("sexto sumas.csv", skip = 1) #data sexto

d.s <- d.s %>% 
  filter(!is.na(Bloques)) %>% # keep only rows that belong to a block
  dplyr::select(subject = Subject,
                #age = Edad,
                group = Group,
                #handedness = PreferenciaManual,
                #sex = Sexo,
                #block = ListBloques,
                #subtrial = SubTrial, # trial number within each block
                item = "code[SubTrial]",
                #trial = Listensayos.Sample,
                op.1 = "ope1[SubTrial]",
                op.2 = "ope2[SubTrial]",
                sum = resultados,
                ACC = "suma.ACC[SubTrial]",
                RT = "suma.RT[SubTrial]"
                ) %>% 
  arrange(subject) %>% 
  mutate(row_id = 1:n(),
         exp = "sexto",
         tie = if_else(op.1 == op.2 & op.1 != 0, 1, 0),
         tie_small = if_else(tie == 1 & sum <= 8, 1, 0),
         tie_large = if_else(tie == 1 & sum > 8, 1, 0),
         tie_degree = case_when(tie_small == 1 ~ 1,
                                tie_large == 1 ~ 2,
                                TRUE ~ 0),
         sum_n0 = if_else(op.1 == 0 | op.2 == 0, 1,0),
         sum_n1 = if_else(tie != 1 & sum_n0 != 1 & (op.1 == 1 | op.2 == 1), 1, 0), # 1+1 counts as a tie; 0+1 and 1+0 count as sum_n0, not sum_n1
         sum_small = if_else(tie != 1 & sum_n0 != 1 & sum_n1 != 1 & op.1 <= 4 & op.2 <= 4, 1, 0),
         sum_medium = if_else(tie != 1 & sum_n0 != 1 & sum_n1 != 1 & sum_small != 1 & sum <= 10, 1, 0),
         sum_large = if_else(tie != 1 & sum_n0 != 1 & sum_n1 != 1 & sum_small != 1 & sum_medium != 1, 1 , 0),
         sum_degree = case_when(sum_small == 1 ~ 1,
                       sum_medium == 1 ~ 2,
                       sum_large == 1 ~ 3,
                       TRUE ~ 0)
         ) %>% 
  relocate(exp, .before = subject) %>% 
  relocate(row_id, .before = exp)

err_sexto <- read_csv2("sexto_err_row_id.csv")

d.s$ACC <- err_sexto$ACC
#sum(d.s$ACC) #251 errors

d.s$subject <- as.factor(d.s$subject) #14 eliminated
levels(d.s$subject)
levels(d.s$subject) <- sprintf("S%03d", 31:58) # S031 ... S058
levels(d.s$subject)

#TO-DO: recode participants
#something like "seq_along(participant) for i in participant <- 36+i
```
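The recoding TO-DO could be handled by a small helper that maps whatever subject numbers are present to consecutive labels, optionally shifted by an offset for the second group. This is a sketch on made-up IDs; `recode_subject` and `raw_id` are hypothetical names, not part of the pipeline above.

```r
# Hypothetical raw subject numbers with gaps (e.g., excluded participants)
raw_id <- c(1, 1, 2, 5, 5, 7)

# Map each distinct raw number to a consecutive index, then add an offset
recode_subject <- function(id, offset = 0) {
  index <- match(id, sort(unique(id)))
  factor(sprintf("S%03d", index + offset))
}

recode_subject(raw_id)               # values: S001 S001 S002 S003 S003 S004
recode_subject(raw_id, offset = 30)  # values: S031 S031 S032 S033 S033 S034
```

Unlike assigning a hand-typed vector of levels, this does not depend on knowing the number of remaining participants in advance.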
Merge the dfs together
```{r}
d <- rbind(d.p, d.s) #merge the two dfs
rm(d.p, d.s, err_primero, err_sexto) #clean the workspace
```


Redefining variable types and specifying contrasts
```{r}

#Categorical variables as factors

#tie vs. non-tie: sum-to-zero contrast
d$tie <- as.factor(d$tie)
(levels(d$tie) <- c("non-tie", "tie"))
(contrasts(d$tie) <- contr.sum(2))

#grade 1 vs. grade 6: sum-to-zero-contrast
d$group <- as.factor(d$group)
(levels(d$group) <- c("grade 1", "grade 6"))
(contrasts(d$group) <- contr.sum(2))


# #TO-DO: for when I want to compare the small, medium, and large sums
# 
# #Remove the "extra" level first in order to specify the contrasts??
# 
# 
# #sum_degree sequential differences contrast
# d$sum_degree <- as.factor(d$sum_degree)
# levels(d$sum_degree)
# (levels(d$sum_degree) <- c("extra", "small", "medium", "large"))
# 
# #tie_degree sum-to-zero contrast
# d$tie_degree <- as.factor(d$tie_degree)
# levels(d$tie_degree)
# (levels(d$tie_degree) <- c("extra", "small", "large"))


#Continuous variables: scale and center
# need to eliminate outliers first


# levels(d$subject)

d$item <- as.factor(d$item)
levels(d$item)
```
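To make the sum-to-zero coding concrete, the following sketch fits a one-factor model on invented numbers and compares the coefficients with the cell means. The `toy` data are simulated for illustration and have deliberately unequal group sizes.

```r
# Toy data with unequal cell sizes (invented values, for illustration only)
toy <- data.frame(
  group = factor(rep(c("grade 1", "grade 6"), times = c(6, 4))),
  RT    = c(2100, 2300, 2000, 2400, 2200, 2200, 900, 1100, 1000, 1000)
)
contrasts(toy$group) <- contr.sum(2)   # grade 1 = +1, grade 6 = -1
contrasts(toy$group)

fit <- lm(RT ~ group, data = toy)
cell_means <- tapply(toy$RT, toy$group, mean)

# Intercept = unweighted mean of the two cell means (1600);
# slope = half the difference, grade 1 minus grade 6 (600)
coef(fit)
mean(cell_means)
diff(rev(cell_means)) / 2
```

With this coding the intercept is the grand mean of the cell means rather than the mean of a reference group, which is what makes lower-order terms interpretable as main effects when interactions are in the model.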

Accuracy summary
```{r, results = "asis"}
#errors per group
t.1.1 <- d %>%
  group_by(group) %>% 
  summarize(accuracy = (1-mean(ACC))*100) %>% 
  print() %>% 
  knitr::kable(caption = "Table 1. Overall accuracy for each group", digits = 2) %>% 
  kableExtra::kable_styling(full_width = FALSE)

#tie vs. non-tie per group
t.1.2 <- d %>%
  group_by(group, tie) %>% 
  summarize(accuracy = (1-mean(ACC))*100) %>% 
  print()

#sum n+0
t.2.1 <- d %>%
  filter(sum_n0 == 1) %>% 
  group_by(group) %>% 
  summarize(accuracy = (1-mean(ACC))*100) %>% 
  print()

#sum n+1
t.2.2 <- d %>%
  filter(sum_n1 == 1) %>% 
  group_by(group) %>% 
  summarize(accuracy = (1-mean(ACC))*100) %>% 
  print()

#small, medium and large sums per group
t.2.3 <- d %>% 
  filter(sum_degree != 0) %>% 
  group_by(group, sum_degree) %>% 
  summarize(accuracy = (1-mean(ACC))*100) %>% 
  print()

#small tie vs. large tie per group
t.2.4 <- d %>% 
  filter(tie_degree != 0) %>% 
  group_by(group, tie_degree) %>% 
  summarize(accuracy = (1-mean(ACC))*100) %>% 
  print()
```


Eliminate outliers

Filter 1: as in Uittenhove et al. (2016)
- filter out values more than 3 standard deviations above or below the median - better for children?

```{r}
d.errors_with <- d %>% filter(ACC == 1)
d.errors_without <- d %>% filter(ACC != 1)
```
Errors accounted for `r round(nrow(d.errors_with) / nrow(d) * 100, 2)`% of all trials.

```{r}
#coding outliers in error-free data
d.errors_without <- d.errors_without %>% group_by(subject) %>% 
  mutate(outliers = if_else(RT > median(RT) + 3*sd(RT) | RT < median(RT) - 3*sd(RT), 1, 0))

d.outliers_with <- d.errors_without %>% filter(outliers == 1)
d.outliers_without <- d.errors_without %>% filter(outliers != 1)
```
Outliers accounted for `r round(nrow(d.outliers_with) / nrow(d.errors_without) * 100, 2)`% of the correct trials.
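The per-participant rule (median plus or minus 3 standard deviations) can be illustrated in base R on invented data. The `toy` data frame below is simulated: two hypothetical participants with 20 ordinary RTs and one extreme RT each.

```r
# Simulated RTs: 20 regular values plus one extreme value per participant
base_rt <- c(seq(900, 1100, length.out = 20), 9000)
toy <- data.frame(subject = rep(c("A", "B"), each = 21),
                  RT = c(base_rt, 2 * base_rt))

# Per-subject median and SD, repeated on every row of that subject
med <- ave(toy$RT, toy$subject, FUN = median)
s   <- ave(toy$RT, toy$subject, FUN = sd)

# Flag RTs more than 3 SDs away from the subject's median
toy$outlier <- as.integer(abs(toy$RT - med) > 3 * s)
table(toy$subject, toy$outlier)
```

Because the SD is itself inflated by extreme values, a single extreme RT may go unflagged when a participant has few trials. A robust spread estimate such as `mad()` might be worth considering for the children's data, although that is a suggestion and not what Uittenhove et al. (2016) report.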

Data without 0-problems and 1-problems
```{r}
#data without sum_n0 and sum_n1
d.m1 <- d.outliers_without %>% filter(sum_n0 == 0 & sum_n1 == 0)
```

Filter out RTs < 200 or < 400?? (200 for perception, 400/450 with articulation?)
```{r}
fast_data <- d.m1 %>% filter(RT<200)
```


TO-DO: a summary using the summary function...
```{r}

```


```{r}
rm(d.errors_with,
   d.errors_without,
   d.outliers_with)

```


RT summary
```{r, results = "asis"}
#RTs per group
t.1 <- d.outliers_without %>%
  group_by(group) %>%
  summarize("Mean response times" = mean(RT),
            "Median response times" = median(RT)) %>% print()

# %>%
#   knitr::kable(caption = "Table 1. Overall accuracy for each group", digits = 2) %>%
#   kableExtra::kable_styling(full_width = FALSE)

#sum n+0
t.2.1 <- d.outliers_without %>%
  filter(sum_n0 == 1) %>%
  group_by(group) %>%
  summarize("Mean response times" = mean(RT),
            "Median response times" = median(RT)) %>%   print()

#sum n+1
t.2.2 <- d.outliers_without %>%
  filter(sum_n1 == 1) %>%
  group_by(group) %>%
  summarize("Mean response times" = mean(RT),
            "Median response times" = median(RT)) %>%   print()

#tie vs. non-tie per group
t.3 <- d.outliers_without %>%
  filter(sum_n0 == 0 & sum_n1 == 0) %>%
  group_by(group, tie) %>%
  summarize("Mean response times" = mean(RT),
            "Median response times" = median(RT)) %>% print()

#small, medium and large sums per group
t.4.1 <- d.outliers_without %>%
  #filter(sum_n0 == 0 & sum_n1 == 0) %>%
  filter(sum_degree != 0) %>%  #eliminates sum_n0 and sum_n1
  group_by(group, sum_degree) %>%
  summarize("Mean response times" = mean(RT),
            "Median response times" = median(RT)) %>%   print()

#small tie vs. large tie per group
t.4.2 <- d.outliers_without %>%
  #filter(sum_n0 == 0 & sum_n1 == 0) %>%
  filter(tie_degree != 0) %>% #eliminates sum_n0 and sum_n1
  group_by(group, tie_degree) %>%
  summarize("Mean response times" = mean(RT),
            "Median response times" = median(RT)) %>%   print()
```

Visualize RTs
```{r}
ggplot(d.outliers_without, aes(x = group, y = RT)) + 
  geom_boxplot()

ggplot(data = filter(d.outliers_without, group == "grade 1"), aes(x = RT)) + 
  geom_histogram()

ggplot(data = filter(d.outliers_without, group == "grade 6"), aes(x = RT)) + 
  geom_histogram()

```

Bagnoud et al. (2021)
```{r}

```


Uittenhove et al. (2016)
Problem: they do not report strategies
```{r}

```


Correct the response times for naming (as in Uittenhove et al., 2016),
p. 293:

"Average individual RTs were subsequently corrected according to sensitivity of the voice key by subtracting to these RTs the deviation to the mean of the naming time corresponding to the answer."
```{r}

```
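One reading of the quoted correction is sketched below on invented numbers. It assumes that the "deviation to the mean" is the mean naming time for a given answer minus the mean across answers; since the paper is not explicit on this point, treat that as an interpretation rather than the published procedure. The data frames `naming` and `sums` and their columns are hypothetical.

```r
# Hypothetical naming data: voice-key latency when simply saying each answer
naming <- data.frame(
  answer    = c(6, 6, 7, 7, 12, 12),
  naming_RT = c(480, 520, 540, 560, 600, 600)
)
# Hypothetical addition RTs, one per answer
sums <- data.frame(answer = c(6, 7, 12), RT = c(1500, 1800, 2600))

# Mean naming time per answer and its deviation from the mean across answers
# (assumption: the reference is the mean of the per-answer means)
naming_mean <- tapply(naming$naming_RT, naming$answer, mean)
deviation   <- naming_mean - mean(naming_mean)

# Subtract the deviation that corresponds to each trial's answer
sums$RT_corrected <- sums$RT - as.numeric(deviation[as.character(sums$answer)])
sums
```

Answers that trigger the voice key late (here 12) are shifted down and answers that trigger it early (here 6) are shifted up, leaving the overall mean RT unchanged in this balanced toy case.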

Covariables
- load WM scores (direct and inverse) + Raven scores
```{r}

```

Summary of the analyses in Bagnoud et al. (2021):

Accuracy analyzed with correlations... in the Supplementary Materials

Response times:

Tie vs. non-tie problems (ANOVAs)
- problem type(tie vs. non-tie) + group(grade 1 vs. grade 6) + problem type:group
"For Grade 1 children, we conducted an ANOVA with age group (beginning or end) as a between-factor variable and problem type (tie or non-tie) as a within-factor variable."
 

Non-tie problems (regressions)
- small and medium non-tie problems (large problems excluded because not solved by first graders): RT ~ op.1 + op.2 | op.1*op.2 | (op.1 + op.2)^2 | op.1 | op.2 | op.max | op.min | op.1^2 + op.2^2 
- groups collapsed
- groups separated:
"When we considered each age group separately, the sum of the problem was still the best significant predictor for each group"
- ...and describing the slopes
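The predictor comparison described above can be sketched as a set of single-predictor regressions ranked by R². The example below builds the 24 small and medium non-tie items and uses simulated RTs (generated to grow with the sum), so the ranking it prints says nothing about the real data. The column names in `pred` are invented labels for the candidate predictors.

```r
# All small and medium non-tie problems (operands 2-9, sum <= 10)
items <- expand.grid(op.1 = 2:9, op.2 = 2:9)
items <- items[items$op.1 != items$op.2 & items$op.1 + items$op.2 <= 10, ]

# Candidate structural predictors from the list above
pred <- with(items, data.frame(
  sum = op.1 + op.2, product = op.1 * op.2, sum_sq = (op.1 + op.2)^2,
  op.1 = op.1, op.2 = op.2, op_max = pmax(op.1, op.2),
  op_min = pmin(op.1, op.2), sq_sum = op.1^2 + op.2^2))

# Simulated (not real) item RTs that increase with the sum
set.seed(2)
RT <- 1000 + 150 * pred$sum + rnorm(nrow(pred), sd = 100)

# R^2 of each single-predictor regression, best first
r2 <- sapply(pred, function(x) summary(lm(RT ~ x))$r.squared)
sort(r2, decreasing = TRUE)
```

Several of these predictors are strongly correlated over such a small item set, so differences in R² between them should be read with caution.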


"Even though the sum of the problem significantly predicted solution times for each age group, a close look at solution time distributions according to the problem sum showed that, already from Grade 1, solution times did not linearly increase with the sum of the problems"
- include square and cubic term

Small non-tie problems
- sum for each age group
- ?no interactions evaluated?

Medium non-tie problems
- group*sum(8 vs. 9 vs. 10)
- ?why not 7
- group*sum_7(sum_small_7 vs. sum_medium_7)
- ?why do they do it?


1-Problems
- no statistics run

Large non-tie
- no statistics run

Tie-problems
"it is possible to consider only the sum and the sum of the operands squared (i.e., [O1 + O2]²)"
- predictors: sum and (op.1 + op.2)^2
- but that is not true: op.1^2 + op.2^2 could also be a predictor
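For a tie n+n, the sum is 2n, the squared sum is 4n², the sum of squares is 2n², and the product is n². All quadratic candidates are therefore exact multiples of one another, which a quick check confirms. The `tie_items` data frame is a name made up for this sketch.

```r
n <- 1:9                                   # ties n + n
tie_items <- data.frame(sum     = 2 * n,
                        sum_sq  = (n + n)^2,
                        sq_sum  = n^2 + n^2,
                        product = n * n)

# Every quadratic predictor is a fixed multiple of n^2
all(tie_items$sum_sq == 2 * tie_items$sq_sum)    # TRUE
all(tie_items$sum_sq == 4 * tie_items$product)   # TRUE
cor(tie_items$sum_sq, tie_items$sq_sum)          # 1, up to rounding
```

So for ties, op.1^2 + op.2^2 is a valid predictor, but it cannot be distinguished statistically from (op.1 + op.2)^2 or op.1*op.2: any one of them carries the same information.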

Small tie problems
- sum and sum squared good predictor for Grade 1
- no predictor for Grade 3 to adulthood

Large tie problems
- plots: commenting on 9+9
- no statistics run


Visualize the data
```{r}
# zero-padded labels ("02", "04", ..., "18") so that the sums sort correctly
d.m1$sum.f <- factor(sprintf("%02d", d.m1$sum))


# visualizing non-ties
# ggplot() +
#   geom_jitter(data = filter(d.m1, group == "grade 1" & tie == "non-tie" ), aes(x = sum.f, y = RT, col = "red", alpha = 0.2)) +
#   geom_jitter(data = filter(d.m1, group == "grade 6" & tie == "non-tie"), aes(x = sum.f, y = RT, col = "blue", alpha = 0.2))

ggplot() +
  geom_boxplot(data = filter(d.m1, group == "grade 1" & tie == "non-tie"), aes(x = sum.f, y = RT), colour = "red", alpha = 0.2) + # constants go outside aes()
  geom_boxplot(data = filter(d.m1, group == "grade 6" & tie == "non-tie"), aes(x = sum.f, y = RT), colour = "blue", alpha = 0.2)

# visualizing ties
# ggplot() +
#   geom_jitter(data = filter(d.m1, group == "grade 1" & tie == "tie" ), aes(x = sum.f, y = RT, col = "red", alpha = 0.2)) +
#   geom_jitter(data = filter(d.m1, group == "grade 6" & tie == "tie"), aes(x = sum.f, y = RT, col = "blue", alpha = 0.2))

ggplot() +
  geom_boxplot(data = filter(d.m1, group == "grade 1" & tie == "tie"), aes(x = sum.f, y = RT), colour = "red", alpha = 0.2) + # constants go outside aes()
  geom_boxplot(data = filter(d.m1, group == "grade 6" & tie == "tie"), aes(x = sum.f, y = RT), colour = "blue", alpha = 0.2)

# visualizing ties vs non-ties in grade 1 children
# g.1.1 <- ggplot() +
#   geom_jitter(data = filter(d.m1, group == "grade 1" & tie == "tie" ), aes(x = sum.f, y = RT, col = "red", alpha = 0.2)) +
#   geom_jitter(data = filter(d.m1, group == "grade 1" & tie == "non-tie"), aes(x = sum.f, y = RT, col = "blue", alpha = 0.1)) +
#  scale_y_continuous(limits = c(0,35000))

g.1.1 <- ggplot() +
  geom_boxplot(data = filter(d.m1, group == "grade 1" & tie == "tie"), aes(x = sum.f, y = RT), colour = "red", alpha = 0.2) + # constants go outside aes()
  geom_boxplot(data = filter(d.m1, group == "grade 1" & tie == "non-tie"), aes(x = sum.f, y = RT), colour = "blue", alpha = 0.1) +
 scale_y_continuous(limits = c(0,35000))


# visualizing ties vs non-ties in grade 6 children
# ggplot() +
#   geom_jitter(data = filter(d.m1, group == "grade 6" & tie == "tie" ), aes(x = sum.f, y = RT, col = "red", alpha = 0.2)) +
#   geom_jitter(data = filter(d.m1, group == "grade 6" & tie == "non-tie"), aes(x = sum.f, y = RT, col = "blue", alpha = 0.1)) +
#   scale_y_continuous(limits = c(0,35000))

g.6.1 <- ggplot() +
  geom_boxplot(data = filter(d.m1, group == "grade 6" & tie == "tie"), aes(x = sum.f, y = RT), colour = "red", alpha = 0.2) + # constants go outside aes()
  geom_boxplot(data = filter(d.m1, group == "grade 6" & tie == "non-tie"), aes(x = sum.f, y = RT), colour = "blue", alpha = 0.1) +
  scale_y_continuous(limits = c(0,35000))


# grid.arrange(g.1.1, g.6.1, nrow = 1)

```


Visualize the data on a logarithmic scale
```{r}

# visualizing non-ties
# ggplot() +
#   geom_jitter(data = filter(d.m1, group == "grade 1" & tie == "non-tie" ), aes(x = sum.f, y = log(RT), col = "red", alpha = 0.2)) +
#   geom_jitter(data = filter(d.m1, group == "grade 6" & tie == "non-tie"), aes(x = sum.f, y = log(RT), col = "blue", alpha = 0.2))

ggplot() +
  geom_boxplot(data = filter(d.m1, group == "grade 1" & tie == "non-tie"), aes(x = sum.f, y = log(RT)), colour = "red", alpha = 0.2) + # constants go outside aes()
  geom_boxplot(data = filter(d.m1, group == "grade 6" & tie == "non-tie"), aes(x = sum.f, y = log(RT)), colour = "blue", alpha = 0.2)

# visualizing ties
# ggplot() +
#   geom_jitter(data = filter(d.m1, group == "grade 1" & tie == "tie" ), aes(x = sum.f, y = log(RT), col = "red", alpha = 0.2)) +
#   geom_jitter(data = filter(d.m1, group == "grade 6" & tie == "tie"), aes(x = sum.f, y = log(RT), col = "blue", alpha = 0.2))

ggplot() +
  geom_boxplot(data = filter(d.m1, group == "grade 1" & tie == "tie"), aes(x = sum.f, y = log(RT)), colour = "red", alpha = 0.2) + # constants go outside aes()
  geom_boxplot(data = filter(d.m1, group == "grade 6" & tie == "tie"), aes(x = sum.f, y = log(RT)), colour = "blue", alpha = 0.2)

# visualizing ties vs non-ties in grade 1 children
# ggplot() +
#   geom_jitter(data = filter(d.m1, group == "grade 1" & tie == "tie" ), aes(x = sum.f, y = log(RT), col = "red", alpha = 0.2)) +
#   geom_jitter(data = filter(d.m1, group == "grade 1" & tie == "non-tie"), aes(x = sum.f, y = log(RT), col = "blue", alpha = 0.01)) +
#   scale_y_continuous(limits = c(2,11))

# visualizing ties vs non-ties in grade 1 children
ggplot() +
  geom_boxplot(data = filter(d.m1, group == "grade 1" & tie == "tie"), aes(x = sum.f, y = log(RT)), colour = "red", alpha = 0.2) + # constants go outside aes()
  geom_boxplot(data = filter(d.m1, group == "grade 1" & tie == "non-tie"), aes(x = sum.f, y = log(RT)), colour = "blue", alpha = 0.2) +
  scale_y_continuous(limits = c(2,11))


# visualizing ties vs non-ties in grade 6 children
# ggplot() +
#   geom_jitter(data = filter(d.m1, group == "grade 6" & tie == "tie" ), aes(x = sum.f, y = log(RT), col = "red", alpha = 0.2)) +
#   geom_jitter(data = filter(d.m1, group == "grade 6" & tie == "non-tie"), aes(x = sum.f, y = log(RT), col = "blue", alpha = 0.2)) +
#   scale_y_continuous(limits = c(2,11))

ggplot() +
  geom_boxplot(data = filter(d.m1, group == "grade 6" & tie == "tie"), aes(x = sum.f, y = log(RT)), colour = "red", alpha = 0.2) + # constants go outside aes()
  geom_boxplot(data = filter(d.m1, group == "grade 6" & tie == "non-tie"), aes(x = sum.f, y = log(RT)), colour = "blue", alpha = 0.2) +
  scale_y_continuous(limits = c(2,11))

# grid.arrange(g.1, g.6, nrow = 1)

```
TO-DO: small boxplots for each sum


My take on it:
```{r}
d.m1$sum.s <- as.numeric(scale(d.m1$sum)) # scale() returns a one-column matrix; keep a plain numeric vector
mean(d.m1$sum.s)
sd(d.m1$sum.s)
```
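Before fitting the models, it may help to see how lme4 random-effects terms translate into variance and covariance parameters. The sketch below uses the `sleepstudy` data that ships with lme4 instead of the study data, and `m_cor` and `m_unc` are names chosen for this example.

```r
library(lme4)

# Correlated intercepts and slopes: 2 variances + 1 covariance per subject term
m_cor <- lmer(Reaction ~ Days + (1 + Days | Subject),
              data = sleepstudy, REML = FALSE)

# Uncorrelated: intercept and slope in separate terms, no covariance parameter
m_unc <- lmer(Reaction ~ Days + (1 | Subject) + (0 + Days | Subject),
              data = sleepstudy, REML = FALSE)

# One row per variance or covariance parameter, plus the residual variance
vc_cor <- as.data.frame(VarCorr(m_cor))
vc_unc <- as.data.frame(VarCorr(m_unc))
nrow(vc_cor)   # 4
nrow(vc_unc)   # 3
```

Two points follow for the models in this notebook. Repeating the intercept in several terms for the same grouping factor, as in `(1 + a | g) + (1 + b | g)`, asks for more than one by-subject intercept variance, which is a likely source of convergence trouble. Also, when the slope variable is a factor such as `tie`, `(0 + tie | subject)` expands to one column per level rather than a single slope, so the split-term approach is usually applied to a numeric coding of the factor.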


```{r}
m1.rt <- lmer(RT ~ group + tie + sum.s +                                           #main effects
                   group:tie +                                                     #first order interaction
                   group:sum.s +                                                   #first order interactions
                   tie:sum.s +                                                     #first order interactions
                   group:tie:sum.s +                                               #second order interactions, critical comparison
                (1 + sum.s|subject) +
                (1 + tie|subject) +
                (1 + sum.s:tie|subject) +
                (1|item),
              data = d.m1, REML = FALSE
               )      
# doesn't converge

library(afex)

all_fit(m1.rt)

#(1 + sum|subject):

   # 1|subject                   by-participant varying intercepts
   # sum|subject                 by-participant varying sum slopes
   # 1:sum|subject               correlation of varying intercepts and varying sum slopes

# (1 + tie|subject) +

   # 1|subject                   by-participant varying intercepts
   # tie|subject                 by-participant varying tie slopes
   # 1:tie|subject               correlation of varying intercepts and varying tie slopes

# (1 + sum:tie|subject) +

   # 1|subject                   by-participant varying intercepts
   # sum:tie|subject             by participant varying sum:tie interaction slopes
   # 1:sum:tie|subject           correlation of varying intercepts and varying sum:tie interaction slopes # the interaction of tie vs. non-tie problems and problem size can depend on the intercept

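#could I rewrite it like this? (kept as comments: these lines are not valid standalone R)
# (1|subject)
# (0 + sum|subject)
# (0 + 1:sum|subject)
# (0 + tie|subject)
# (0 + 1:tie|subject)
# (0 + sum:tie|subject)
# (0 + 1:sum:tie|subject)
# note: correlations are not separate formula terms; they are estimated among all
# effects inside one (... | subject) term and are dropped by splitting the terms


# try computing the scaled gradients
# lmerControl etc.


# try other optimizers
# increase the number of iterations
#remove derivative calculations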

m1.1.rt <- lmer(RT ~ group + tie + sum.s +                                           #main effects
                   group:tie +                                                     #first order interaction
                   group:sum.s +                                                   #first order interactions
                   tie:sum.s +                                                     #first order interactions
                   group:tie:sum.s +                                               #second order interactions, critical comparison
                (1 + sum.s|subject) +
                (1 + tie|subject) +
                (1 + sum.s:tie|subject) +
                (1|item),
              data = d.m1, REML = FALSE, lmerControl(calc.derivs = FALSE)
               )     

all_fit(m1.1.rt)
#doesn't converge


m1.2.rt <- lmer(RT ~ group + tie + sum.s +                                         #main effects
                   group:tie +                                                     #first order interaction
                   group:sum.s +                                                   #first order interactions
                   tie:sum.s +                                                     #first order interactions
                   group:tie:sum.s +                                               #second order interactions, critical comparison
                (1 + sum.s|subject) +
                (1 + tie|subject) +
                (1 + sum.s:tie|subject) +
                (1|item),
              data = d.m1, REML = FALSE, lmerControl(optimizer = "bobyqa", calc.derivs = FALSE)
               )  
summary(m1.2.rt)


m2.rt <- lmer(RT ~ group + tie + sum.s +                                           #main effects
                   group:tie +                                                     #first order interaction
                   group:sum.s +                                                   #first order interactions
                   tie:sum.s +                                                     #first order interactions
                   group:tie:sum.s +                                               #second order interactions, critical comparison
                (1|subject) +
                (0 + sum.s|subject) +
                (0 + tie|subject) +
                (0 + sum.s:tie|subject) +
                (1|item),
              data = d.m1, REML = FALSE, lmerControl(optimizer = "bobyqa", calc.derivs = FALSE)
               )

m3.rt <- lmer(RT ~ group + tie + sum.s +                                           #main effects
                   group:tie +                                                     #first order interaction
                   group:sum.s +                                                   #first order interactions
                   tie:sum.s +                                                     #first order interactions
                   group:tie:sum.s +                                               #second order interactions, critical comparison
                (1 + sum.s + tie + sum.s:tie|subject) +
                (1|item),
              data = d.m1, REML = FALSE, lmerControl(optimizer = "bobyqa", calc.derivs = FALSE)
               )
#is this the same?
m3.1.rt <- lmer(RT ~ group + tie + sum.s +                                           #main effects
                   group:tie +                                                     #first order interaction
                   group:sum.s +                                                   #first order interactions
                   tie:sum.s +                                                     #first order interactions
                   group:tie:sum.s +                                               #second order interactions, critical comparison
                (1 + sum.s*tie|subject) +
                (1|item),
              data = d.m1, REML = FALSE, lmerControl(optimizer = "bobyqa", calc.derivs = FALSE)
               )

summary(m3.rt)
summary(m3.1.rt)
#they are identical

m3.1.1.rt <- lmer(RT ~ group + tie + sum.s +                                           #main effects
                   group:tie +                                                     #first order interaction
                   group:sum.s +                                                   #first order interactions
                   tie:sum.s +                                                     #first order interactions
                   group:tie:sum.s +                                               #second order interactions, critical comparison
                (1 + sum.s+tie|subject) +
                (1|item),
              data = d.m1, REML = FALSE, lmerControl(optimizer = "bobyqa", calc.derivs = FALSE)
               )

# ?anova
anova(m3.1.rt, m3.1.1.rt)
#the more complex model: m3.1.rt is better

m3.1.2.rt <- lmer(RT ~ group + tie + sum.s +                                           #main effects
                   group:tie +                                                     #first order interaction
                   group:sum.s +                                                   #first order interactions
                   tie:sum.s +                                                     #first order interactions
                   group:tie:sum.s +                                               #second order interactions, critical comparison
                (sum.s*tie|subject) +
                (1|item),
              data = d.m1, REML = FALSE, lmerControl(optimizer = "bobyqa", calc.derivs = FALSE)
               )

summary(m3.1.2.rt)
#another identical model: the random intercept is implicit in (sum.s*tie|subject)

#would it make sense to add op.1 or op.2 to the item random effects (1|item)?

# m4.rt <- lmer(RT ~ group + tie + sum.s +                                           #main effects
#                    group:tie +                                                     #first order interaction
#                    group:sum.s +                                                   #first order interactions
#                    tie:sum.s +                                                     #first order interactions
#                    group:tie:sum.s +                                               #second order interactions, critical comparison
#                 (1 + sum.s*tie|subject) +
#                 (1 + op.1*op.2|item),
#               data = d.m1, REML = FALSE, lmerControl(optimizer = "bobyqa", calc.derivs = FALSE)
#                )
# #not run; it may not make sense: if each item is a single problem, op.1 and op.2 are constant within it

```
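
The random-effects terms `(1 + sum.s + tie + sum.s:tie|subject)`, `(1 + sum.s*tie|subject)` and `(sum.s*tie|subject)` give identical fits because R expands `*` into main effects plus interaction and adds the intercept by default. A minimal base-R sketch of that expansion follows; it only inspects the formula algebra with `terms()`, fits no model, and the helper name `expand_terms` is made up for this illustration.

```r
# Expand the slope part of each random-effects term with base R's terms().
# Only the formula algebra is checked; no data or fitted model is involved.
expand_terms <- function(f) {
  tt <- terms(f)
  list(labels = attr(tt, "term.labels"), intercept = attr(tt, "intercept"))
}

full     <- expand_terms(~ 1 + sum.s + tie + sum.s:tie)  # as in m3.rt
star     <- expand_terms(~ 1 + sum.s * tie)              # as in m3.1.rt
implicit <- expand_terms(~ sum.s * tie)                  # as in m3.1.2.rt
additive <- expand_terms(~ 1 + sum.s + tie)              # as in m3.1.1.rt

identical(full, star)      # same terms and same intercept
identical(star, implicit)  # the intercept is implicit
additive$labels            # lacks the sum.s:tie slope
```

Only the additive structure differs, which is why `anova(m3.1.rt, m3.1.1.rt)` is the one comparison that tests anything.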

Model m3.1.rt
```{r}
coef(m3.1.rt)

hist(residuals(m3.1.rt))
qqnorm(residuals(m3.1.rt))
plot(m3.1.rt)


# Fitting sum.s squared: better not

# m5.rt <- lmer(RT ~ group + tie + sum.s + I(sum.s^2) +                              #main effects
#                    group:tie +                                                     #first order interaction
#                    group:sum.s +                                                   #first order interactions
#                    group:I(sum.s^2) +
#                    tie:sum.s +                                                     #first order interactions
#                    tie:I(sum.s^2) +
#                    group:tie:sum.s +                                               #second order interactions, critical comparison
#                    group:tie:I(sum.s^2) +  
#                 (1 + sum.s*tie|subject) +
#                 (1|item),
#               data = d.m1, REML = FALSE, lmerControl(optimizer = "bobyqa", calc.derivs = FALSE)
#                )
# 
# m5.1.rt <- lmer(RT ~ group + tie + sum.s + I(sum.s^2) +                              #main effects
#                    group:tie +                                                     #first order interaction
#                    group:sum.s +                                                   #first order interactions
#                    group:I(sum.s^2) +
#                    tie:sum.s +                                                     #first order interactions
#                    tie:I(sum.s^2) +
#                    group:tie:sum.s +                                               #second order interactions, critical comparison
#                    group:tie:I(sum.s^2) +  
#                 (1 + sum.s*tie|subject) +
#                 (1|item),
#               data = d.m1, REML = FALSE, lmerControl(optimizer = "bobyqa", calc.derivs = FALSE, optCtrl = list(maxfun = 1e9))
#                )
# 
# # m5.2.rt <- lmer(RT ~ group*tie*(sum.s + I(sum.s^2) + I(sum.s^3) + I(sum.s^4)) +
# #                 (1 + sum.s*tie|subject) +
# #                 (1|item),
# #               data = d.m1, REML = FALSE, lmerControl(optimizer = "bobyqa", calc.derivs = FALSE, optCtrl = list(maxfun = 1e9))
# #                )
# 
# m5.3.rt <- lmer(RT ~ group*tie*sum.s + I(sum.s^2) + I(sum.s^3) + I(sum.s^4) +
#                 (1 + sum.s*tie|subject) +
#                 (1|item),
#               data = d.m1, REML = FALSE, lmerControl(optimizer = "bobyqa", calc.derivs = FALSE, optCtrl = list(maxfun = 1e9))
#                )
# 
# summary(m5.rt)


m6.1.rt <- lmer(log(RT) ~ group + tie + sum.s +                                    #main effects
                   group:tie +                                                     #first order interaction
                   group:sum.s +                                                   #first order interactions
                   tie:sum.s +                                                     #first order interactions
                   group:tie:sum.s +                                               #second order interactions, critical comparison
                (1 + sum.s*tie|subject) +
                (1|item),
              data = d.m1, REML = FALSE, lmerControl(optimizer = "bobyqa", calc.derivs = FALSE)
               )

summary(m6.1.rt)
hist(residuals(m6.1.rt))
qqnorm(residuals(m6.1.rt))
plot(m6.1.rt)


```

- TO-DO: remove data points with residuals > 2
- TO-DO: visualise the model predictions; ggpredict only shows me simple effects
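
One possible way to handle the residual-trimming item is sketched below. It assumes the cutoff of 2 refers to residuals in standard-deviation units, which the note does not state. The helper `trim_by_residuals` is a hypothetical name, and `lm()` on the built-in `cars` data stands in for the mixed model so that the block runs on its own.

```r
# Drop observations whose standardized residual exceeds a cutoff.
# `model` is any fit with a residuals() method; `data` must hold exactly the
# rows the model was fitted on, in the same order (no rows dropped for NAs).
trim_by_residuals <- function(model, data, cutoff = 2) {
  z <- as.vector(scale(residuals(model)))   # residuals in SD units
  stopifnot(length(z) == nrow(data))
  data[abs(z) <= cutoff, , drop = FALSE]
}

# Stand-in demonstration: lm() and the built-in `cars` data
fit <- lm(dist ~ speed, data = cars)
cars_trimmed <- trim_by_residuals(fit, cars, cutoff = 2)
refit <- update(fit, data = cars_trimmed)   # refit on the trimmed data

nrow(cars) - nrow(cars_trimmed)             # number of rows removed
```

With the models in this file the call would presumably be `trim_by_residuals(m6.1.rt, d.m1)`, followed by a refit on the returned data frame, provided `d.m1` has no missing values in the model variables.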

```{r}

summary(m3.1.rt)

m6.2.rt <- lmer(log(RT) ~ group + tie + sum.s +                                    #main effects
                   group:tie +                                                     #first order interaction
                   group:sum.s +                                                   #first order interactions
                   tie:sum.s +                                                     #first order interactions
                   group:tie:sum.s +                                               #second order interactions, critical comparison
                (1 + sum.s+tie|subject) +
                (1|item),
              data = d.m1, REML = FALSE, lmerControl(optimizer = "bobyqa", calc.derivs = FALSE)
               )

anova(m6.1.rt, m6.2.rt)
#m6.1 more complex model preferable

plot(ggpredict(m6.1.rt, terms = c("sum.s", "tie", "group"))) #sum.s on the x axis, one line per tie level, one panel per group


?ggPredict()
# These plots probably do not make sense:
# ggplot(data = filter(d.m1, group == "grade 1"), aes(y = log(RT), x = sum.s, color = tie)) +
#   geom_point() +
#   stat_smooth(method = "lm", se = TRUE)
# 
# ggplot(data = filter(d.m1, group == "grade 6"), aes(y = log(RT), x = sum.s, color = tie)) +
#   geom_point() +
#   stat_smooth(method = "lm", se = TRUE)

m6.1.3.rt <- lmer(log(RT) ~ 1 + group*tie*sum.s +
                (1 + sum.s*tie|subject) +
                (1|item),
              data = d.m1, REML = FALSE, lmerControl(optimizer = "bobyqa", calc.derivs = FALSE)
               )

summary(m6.1.3.rt)

anova(m6.1.rt, m6.1.3.rt) #it is the same model


plot(ggpredict(m6.1.3.rt, terms = c("sum.s", "tie", "group"))) #without terms, ggpredict plots one predictor at a time and the interactions are not shown

```
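
To see the three-way interaction without relying on a plotting package, the predictions can be computed over a grid of all predictor combinations. The sketch below uses simulated data and `lm()` as a stand-in; the column names mirror `d.m1` but every value is made up. For an `lmer` fit, `predict(m6.1.rt, newdata = grid, re.form = NA)` should give the corresponding population-level predictions.

```r
set.seed(1)
# Simulated stand-in data; names mirror d.m1, values are invented
sim <- expand.grid(group = c("grade 1", "grade 6"),
                   tie   = c("non-tie", "tie"),
                   sum.s = seq(-1.5, 1.5, length.out = 8),
                   rep   = 1:5)
sim$logRT <- 7 - 0.8 * (sim$group == "grade 6") +
  0.3 * sim$sum.s * (sim$tie == "non-tie") +
  rnorm(nrow(sim), sd = 0.2)

fit <- lm(logRT ~ group * tie * sum.s, data = sim)   # lm() stands in for lmer()

# Every combination of the three predictors
grid <- expand.grid(group = unique(sim$group),
                    tie   = unique(sim$tie),
                    sum.s = seq(min(sim$sum.s), max(sim$sum.s), length.out = 20))
grid$pred <- predict(fit, newdata = grid)

# One panel per group, one line type per tie level
op <- par(mfrow = c(1, 2))
for (g in levels(grid$group)) {
  sub <- grid[grid$group == g, ]
  plot(pred ~ sum.s, data = sub, type = "n", main = g, ylab = "predicted log(RT)")
  for (k in levels(sub$tie)) {
    lines(pred ~ sum.s, data = sub[sub$tie == k, ], lty = match(k, levels(sub$tie)))
  }
}
par(op)
```

Diverging slopes between the two lines within a panel correspond to `tie:sum.s`; a different amount of divergence across panels corresponds to `group:tie:sum.s`.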


```{r}
#mistakes I have made
#I did not visualise the data first
#I tried out random-effects structures haphazardly


```


Prune the random effect structure
```{r}

```

Testing higher order polynomials
```{r}
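#Add the squared term
#fitted with lmer(), whose formula takes the (slope|group) random-effects terms used below
#assumption: sum and sum_sq are columns of d.m1 (no data set was specified for this model)

m2.rt <- lmer(RT ~ group + tie + sum + sum_sq +                                 #main effects
                  group:tie +                                                   #first order interaction
                  group:sum + group:sum_sq +                                    #first order interactions
                  tie:sum + tie:sum_sq +                                        #first order interactions
                  group:tie:sum + group:tie:sum_sq +                            #second order interactions
               (1 + sum|subject/group) + 
               (1 + tie|subject) + 
               (0 + sum:tie|subject),
              data = d.m1, REML = FALSE
               )      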

#Add the cubic term
#assumption: sum, sum_sq and sum_cub are columns of d.m1 (no data set was specified for this model)

m3.rt <- lmer(RT ~ group + tie + sum + sum_sq + sum_cub +                       #main effects
                  group:tie +                                                   #first order interaction
                  group:sum + group:sum_sq + group:sum_cub +                    #first order interactions
                  tie:sum + tie:sum_sq + tie:sum_cub +                          #first order interactions
                  group:tie:sum + group:tie:sum_sq + group:tie:sum_cub +        #second order interactions
               (1 + sum|subject/group) + 
               (1 + tie|subject) + 
               (0 + sum:tie|subject),
              data = d.m1, REML = FALSE
               )     


#Add quartic term? Maybe


#random effect structure:
#per participant: sum, sum_sq, sum_cub, tie:sum, tie:s

#data: without sum_n0 and sum_n1
```
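
Raw powers of the same predictor are strongly correlated with each other, which can make polynomial mixed models harder to fit. The sketch below compares raw powers with the orthogonal basis from `poly()`, using an invented stand-in for the standardized problem size rather than the study data.

```r
# Invented stand-in for a standardized problem size (sums 4 to 18)
sum.s <- as.vector(scale(4:18))

raw  <- cbind(lin = sum.s, quad = sum.s^2, cub = sum.s^3)  # raw powers
orth <- poly(sum.s, degree = 3)                            # orthogonal polynomial basis

round(cor(raw), 2)    # linear and cubic raw terms are strongly correlated
round(cor(orth), 2)   # off-diagonal correlations vanish
```

In a model formula, `poly(sum, 3)` would take the place of `sum + sum_sq + sum_cub`. The fitted values are the same, but the individual coefficients no longer read as raw linear, quadratic and cubic slopes.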


ACC old stuff:

```{r}
m1.acc <- glmer(ACC ~ group + tie + sum + group:tie +
                  (1|subject) + (1 + scale(op.1) + scale(op.2) + scale(sum)|item), #does it make sense to include these as random effects if they are not fixed effects?
                data = d, family = "binomial")

#did not converge

m2.acc <- glmer(ACC ~ tie + group + tie:group +
                  (1 + tie|subject) + (1|item), 
                data = d, family = "binomial")

summary(m2.acc)


#same as Bagnoud et al. (2021)

#data: only non-tie problems


#non-tie problems

#together

#by age group


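#d.small, d.medium and d.large are hypothetical names for the subsets of d with small, medium and large non-tie problems
#sum_degree is constant within each subset, so the size effect is carried by sum

#small non-tie problems for each age group
m3.small.acc <- glmer(ACC ~ sum + group + sum:group + 
                  (1 + sum|subject) + (1|item),
                data = d.small, family = "binomial")

#medium non-tie problems for each age group
m3.medium.acc <- glmer(ACC ~ sum + group + sum:group + 
                  (1 + sum|subject) + (1|item),
                data = d.medium, family = "binomial")

#large non-tie problems for each age group
m3.large.acc <- glmer(ACC ~ sum + group + sum:group + 
                  (1 + sum|subject) + (1|item),
                data = d.large, family = "binomial")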


#this seems pointless to me; why not test the size effect directly and include a quadratic term (or even a cubic term), or exclude 10 and use only a quadratic term
#this needs to be tested with a GAMM


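#compare small, medium, large problems
#d.nontie is a hypothetical name for the subset of d with non-tie problems only
m3.acc <- glmer(ACC ~ sum_degree + group + sum_degree:group + 
                  (1 + sum_degree|subject) + (1|item),
                data = d.nontie, family = "binomial")

#test the size effect within non-tie problems
m3.1.acc <- glmer(ACC ~ sum + group + sum:group + 
                  (1 + sum|subject) + (1|item),
                data = d.nontie, family = "binomial")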


#data: only tie problems

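#compare small tie problems and large tie problems
#d.tie is a hypothetical name for the subset of d with tie problems only
m4.acc <- glmer(ACC ~ tie_degree + group + tie_degree:group +
                  (1 + tie_degree|subject) + (1|item),
                data = d.tie, family = "binomial")

#test the size effect within tie problems
m4.1.acc <- glmer(ACC ~ sum + group + sum:group + 
                  (1 + sum|subject) + (1|item),
                data = d.tie, family = "binomial")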


#compare the sum of the problem
#TO-DO: code "sum" with sequential difference contrast

#Main effects
#sum: - if there is an overall problem size effect
#tie: - if tie problems are resolved faster than non-tie problems
#group: - if there are overall group differences


#sum:group - if the problem size effect differs between groups (NOT OF INTEREST TO US)

#tie:group - if the difference between tie and non-tie problems differs between groups
#sum:tie - if the problem size effect differs between tie and non-tie problems

#sum:tie:group - if the difference between the problem size effect in ties and non-ties differs between groups

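#the note gives only the fixed effects; the random intercepts and data = d are assumptions
m5.acc <- glmer(ACC ~ tie + sum + group + sum:group + sum:tie + sum:group:tie +
                  (1|subject) + (1|item),
                data = d, family = "binomial")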

#data: I would have to restrict myself to the sums 4, 6, 8, 10, 12, 14, 16, 18
#without n+0 and n+1 problems
```
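
The TO-DO about coding `sum` with sequential difference contrasts can be illustrated as follows. With this coding each coefficient is the difference between one level and the level just before it. The sketch builds the contrast matrix by hand in base R (the function name `seq_diff_contrasts` is made up; `MASS::contr.sdif` provides the same coding) and checks it with `lm()` on invented RTs, not on the study data.

```r
# Sequential-difference contrast matrix for n ordered levels:
# column j compares level j + 1 with level j.
seq_diff_contrasts <- function(n) {
  m <- matrix(0, n, n - 1)
  for (j in seq_len(n - 1)) {
    m[, j] <- ifelse(seq_len(n) <= j, -(n - j) / n, j / n)
  }
  m
}

set.seed(2)
# Invented RTs for the eight tie sums (4, 6, ..., 18)
toy <- data.frame(sum = factor(rep(seq(4, 18, by = 2), each = 10)))
toy$RT <- 1000 + 40 * as.integer(toy$sum) + rnorm(nrow(toy), sd = 50)

contrasts(toy$sum) <- seq_diff_contrasts(nlevels(toy$sum))
fit <- lm(RT ~ sum, data = toy)

cell_means <- tapply(toy$RT, toy$sum, mean)
# Each slope equals the difference between adjacent cell means
cbind(coef = coef(fit)[-1], diff = as.vector(diff(cell_means)))
```

The same `contrasts(...) <-` assignment would apply before fitting a `glmer` model, where the coefficients become adjacent-level differences on the logit scale.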