Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

estimate_contrasts() Fails to Accept Explicit Levels for Numeric Predictor with Marginaleffects Backend #415

Merged
merged 15 commits into from
Feb 21, 2025
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Type: Package
Package: modelbased
Title: Estimation of Model-Based Predictions, Contrasts and Means
Version: 0.9.0.29
Version: 0.9.0.30
Authors@R:
c(person(given = "Dominique",
family = "Makowski",
Expand Down
2 changes: 2 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,8 @@

* Fixed issues with contrasting slopes when `backend` was `"emmeans"`.

* Fixed issues in `estimate_contrasts()` when filtering numeric values in `by`.

* Fixed issue in `estimate_slopes()` for models from package *lme4*.

# modelbased 0.9.0
Expand Down
23 changes: 16 additions & 7 deletions R/get_marginalcontrasts.R
Original file line number Diff line number Diff line change
Expand Up @@ -154,11 +154,14 @@
...) {
# init
comparison_slopes <- by_filter <- contrast_filter <- by_token <- NULL

# make sure "by" is a valid column name, and no filter-directive,
# like "Species='setosa'". If `by` is also used for filtering, split and
# extract filter value for later - we have to filter rows manually after
# calculating contrasts. Furthermore, "clean" `by` argument (remove filter)
# save original `by`
original_by <- my_args$by

# make sure "by" is a valid column name, and no filter-directive, like
# "Species='setosa'". If `by` is also used for filtering, split and extract
# filter value for later - we have to filter rows manually after calculating
# contrasts (but only for `estimate = "average"`!). Furthermore, "clean" `by`
# argument (remove filter)
if (!is.null(my_args$by) && any(grepl("=", my_args$by, fixed = TRUE))) { # "[^0-9A-Za-z\\._]"
# find which element in `by` has a filter
filter_index <- grep("=", my_args$by, fixed = TRUE)
Expand Down Expand Up @@ -214,9 +217,9 @@
}

# convert comparison and by into a formula
if (!is.null(comparison)) {

Check warning on line 220 in R/get_marginalcontrasts.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=R/get_marginalcontrasts.R,line=220,col=7,[if_not_else_linter] Prefer `if (A) x else y` to the less-readable `if (!A) y else x` in a simple if/else statement.
# only proceed if we don't have custom comparisons
if (!.is_custom_comparison(comparison)) {

Check warning on line 222 in R/get_marginalcontrasts.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=R/get_marginalcontrasts.R,line=222,col=9,[if_not_else_linter] Prefer `if (A) x else y` to the less-readable `if (!A) y else x` in a simple if/else statement.
# if we have a formula as comparison, we convert it into strings in order
# to extract the information for "comparison" and "by", because we
# recombine the formula later - we always use variables in `by` as group
Expand Down Expand Up @@ -268,8 +271,14 @@
# remove "by" from "contrast"
my_args$contrast <- setdiff(my_args$contrast, my_args$by)

# add back token to `by`
if (!is.null(by_token)) {
# for `estimate = "average"`, we cannot create a data grid, thus we need to
# filter manually. However, for all other `estimate` options, we can simply
# use the data grid for filtering
if (!identical(estimate, "average") && !is.null(original_by)) {
my_args$by <- original_by
by_filter <- NULL
} else if (!is.null(by_token)) {
# add back token to `by`
for (i in names(by_token)) {
my_args$by[my_args$by == i] <- paste(i, by_token[[i]], sep = "=")
}
Expand Down Expand Up @@ -328,7 +337,7 @@
# for the focal terms terms
datagrid <- data.frame(expand.grid(lapply(datagrid[focal], unique)))
# this is the row-order we use in modelbased
datagrid$.rowid <- 1:nrow(datagrid)

Check warning on line 340 in R/get_marginalcontrasts.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=R/get_marginalcontrasts.R,line=340,col=22,[seq_linter] Use seq_len(nrow(...)) instead of 1:nrow(...), which is likely to be wrong in the empty edge case.
# this is the row-order in marginaleffects
datagrid <- datawizard::data_arrange(
datagrid,
Expand Down
62 changes: 31 additions & 31 deletions tests/testthat/_snaps/ordinal.md
Original file line number Diff line number Diff line change
Expand Up @@ -49,38 +49,38 @@
Output
Estimated Marginal Means

Sepal.Width | Response | Median | 95% CI | pd | ROPE | % in ROPE
Sepal.Width | Response | Median | 95% CI | pd | ROPE | % in ROPE
-----------------------------------------------------------------------------------
2.00 | setosa | 0.00 | [0.00, 0.01] | 100% | [-0.10, 0.10] | 100%
2.00 | versicolor | 0.86 | [0.65, 0.95] | 100% | [-0.10, 0.10] | 0%
2.00 | virginica | 0.14 | [0.05, 0.34] | 100% | [-0.10, 0.10] | 28.00%
2.27 | setosa | 0.00 | [0.00, 0.02] | 100% | [-0.10, 0.10] | 100%
2.27 | versicolor | 0.78 | [0.60, 0.90] | 100% | [-0.10, 0.10] | 0%
2.27 | virginica | 0.21 | [0.10, 0.40] | 100% | [-0.10, 0.10] | 0%
2.53 | setosa | 0.02 | [0.01, 0.06] | 100% | [-0.10, 0.10] | 100%
2.53 | versicolor | 0.66 | [0.52, 0.78] | 100% | [-0.10, 0.10] | 0%
2.53 | virginica | 0.32 | [0.20, 0.45] | 100% | [-0.10, 0.10] | 0%
2.80 | setosa | 0.09 | [0.04, 0.16] | 100% | [-0.10, 0.10] | 65.72%
2.80 | versicolor | 0.49 | [0.39, 0.59] | 100% | [-0.10, 0.10] | 0%
2.80 | virginica | 0.41 | [0.32, 0.52] | 100% | [-0.10, 0.10] | 0%
3.07 | setosa | 0.27 | [0.19, 0.36] | 100% | [-0.10, 0.10] | 0%
3.07 | versicolor | 0.30 | [0.21, 0.39] | 100% | [-0.10, 0.10] | 0%
3.07 | virginica | 0.43 | [0.33, 0.53] | 100% | [-0.10, 0.10] | 0%
3.33 | setosa | 0.57 | [0.47, 0.68] | 100% | [-0.10, 0.10] | 0%
3.33 | versicolor | 0.12 | [0.06, 0.21] | 100% | [-0.10, 0.10] | 29.76%
3.33 | virginica | 0.30 | [0.20, 0.40] | 100% | [-0.10, 0.10] | 0%
3.60 | setosa | 0.82 | [0.68, 0.91] | 100% | [-0.10, 0.10] | 0%
3.60 | versicolor | 0.03 | [0.01, 0.08] | 100% | [-0.10, 0.10] | 100%
3.60 | virginica | 0.14 | [0.07, 0.26] | 100% | [-0.10, 0.10] | 17.37%
3.87 | setosa | 0.94 | [0.83, 0.98] | 100% | [-0.10, 0.10] | 0%
3.87 | versicolor | 0.01 | [0.00, 0.03] | 100% | [-0.10, 0.10] | 100%
3.87 | virginica | 0.05 | [0.02, 0.15] | 100% | [-0.10, 0.10] | 89.59%
4.13 | setosa | 0.98 | [0.91, 1.00] | 100% | [-0.10, 0.10] | 0%
4.13 | versicolor | 0.00 | [0.00, 0.01] | 100% | [-0.10, 0.10] | 100%
4.13 | virginica | 0.02 | [0.00, 0.08] | 100% | [-0.10, 0.10] | 100%
4.40 | setosa | 0.99 | [0.95, 1.00] | 100% | [-0.10, 0.10] | 0%
4.40 | versicolor | 0.00 | [0.00, 0.00] | 100% | [-0.10, 0.10] | 100%
4.40 | virginica | 0.01 | [0.00, 0.04] | 100% | [-0.10, 0.10] | 100%
2.00 | setosa | 0.00 | [0.00, 0.01] | 100% | [-0.10, 0.10] | 100%
2.00 | versicolor | 0.86 | [0.65, 0.95] | 100% | [-0.10, 0.10] | 0%
2.00 | virginica | 0.14 | [0.05, 0.34] | 100% | [-0.10, 0.10] | 28.00%
2.27 | setosa | 0.00 | [0.00, 0.02] | 100% | [-0.10, 0.10] | 100%
2.27 | versicolor | 0.78 | [0.60, 0.90] | 100% | [-0.10, 0.10] | 0%
2.27 | virginica | 0.21 | [0.10, 0.40] | 100% | [-0.10, 0.10] | 0%
2.53 | setosa | 0.02 | [0.01, 0.06] | 100% | [-0.10, 0.10] | 100%
2.53 | versicolor | 0.66 | [0.52, 0.78] | 100% | [-0.10, 0.10] | 0%
2.53 | virginica | 0.32 | [0.20, 0.45] | 100% | [-0.10, 0.10] | 0%
2.80 | setosa | 0.09 | [0.04, 0.16] | 100% | [-0.10, 0.10] | 65.72%
2.80 | versicolor | 0.49 | [0.39, 0.59] | 100% | [-0.10, 0.10] | 0%
2.80 | virginica | 0.41 | [0.32, 0.52] | 100% | [-0.10, 0.10] | 0%
3.07 | setosa | 0.27 | [0.19, 0.36] | 100% | [-0.10, 0.10] | 0%
3.07 | versicolor | 0.30 | [0.21, 0.39] | 100% | [-0.10, 0.10] | 0%
3.07 | virginica | 0.43 | [0.33, 0.53] | 100% | [-0.10, 0.10] | 0%
3.33 | setosa | 0.57 | [0.47, 0.68] | 100% | [-0.10, 0.10] | 0%
3.33 | versicolor | 0.12 | [0.06, 0.21] | 100% | [-0.10, 0.10] | 29.76%
3.33 | virginica | 0.30 | [0.20, 0.40] | 100% | [-0.10, 0.10] | 0%
3.60 | setosa | 0.82 | [0.68, 0.91] | 100% | [-0.10, 0.10] | 0%
3.60 | versicolor | 0.03 | [0.01, 0.08] | 100% | [-0.10, 0.10] | 100%
3.60 | virginica | 0.14 | [0.07, 0.26] | 100% | [-0.10, 0.10] | 17.37%
3.87 | setosa | 0.94 | [0.83, 0.98] | 100% | [-0.10, 0.10] | 0%
3.87 | versicolor | 0.01 | [0.00, 0.03] | 100% | [-0.10, 0.10] | 100%
3.87 | virginica | 0.05 | [0.02, 0.15] | 100% | [-0.10, 0.10] | 89.59%
4.13 | setosa | 0.98 | [0.91, 1.00] | 100% | [-0.10, 0.10] | 0%
4.13 | versicolor | 0.00 | [0.00, 0.01] | 100% | [-0.10, 0.10] | 100%
4.13 | virginica | 0.02 | [0.00, 0.08] | 100% | [-0.10, 0.10] | 100%
4.40 | setosa | 0.99 | [0.95, 1.00] | 100% | [-0.10, 0.10] | 0%
4.40 | versicolor | 0.00 | [0.00, 0.00] | 100% | [-0.10, 0.10] | 100%
4.40 | virginica | 0.01 | [0.00, 0.04] | 100% | [-0.10, 0.10] | 100%

Variable predicted: Species
Predictors modulated: Sepal.Width
Expand Down
71 changes: 71 additions & 0 deletions tests/testthat/test-estimate_contrasts-average.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
skip_on_cran()
skip_if_not_installed("emmeans")
skip_if_not_installed("marginaleffects")
skip_on_os("mac")


test_that("estimate_contrast, filter by numeric values", {
skip_if_not_installed("lme4")
data(iris)
mod <- lm(Sepal.Length ~ Petal.Width * Species, data = iris)
out1 <- estimate_contrasts(mod, contrast = "Species=c('versicolor','setosa')", by = "Petal.Width", estimate = "average")

Check warning on line 11 in tests/testthat/test-estimate_contrasts-average.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=tests/testthat/test-estimate_contrasts-average.R,line=11,col=121,[line_length_linter] Lines should not be more than 120 characters. This line is 122 characters.
expect_identical(dim(out1), c(5L, 10L))
expect_equal(out1$Difference, c(0.13903, 0.06148, -0.01608, -0.09363, -0.17118), tolerance = 1e-4)

data(CO2)
mod <- suppressWarnings(lme4::lmer(uptake ~ conc * Plant + (1 | Type), data = CO2))
out1 <- estimate_contrasts(mod, contrast = "Plant", by = "conc", estimate = "average")
expect_identical(dim(out1), c(462L, 10L))

out1 <- estimate_contrasts(mod, contrast = "Plant=c('Qn1','Qn2','Qn3')", by = "conc", estimate = "average")
expect_identical(dim(out1), c(21L, 10L))

out1 <- estimate_contrasts(mod, contrast = "Plant=c('Qn1','Qn2','Qn3')", estimate = "average")
expect_identical(dim(out1), c(3L, 9L))
expect_equal(out1$Difference, c(1.92857, 4.38571, 2.45714), tolerance = 1e-4)

out <- estimate_contrasts(mod, contrast = "conc", by = "Plant", comparison = "b1=b2", estimate = "average")
expect_equal(out$Difference, -0.007061251, tolerance = 1e-4)
})


test_that("estimate_contrast, filterin in `by` and `contrast`", {
data(efc, package = "modelbased")
efc <- datawizard::to_factor(efc, c("c161sex", "c172code", "e16sex", "e42dep"))
levels(efc$c172code) <- c("low", "mid", "high")
m <- lm(neg_c_7 ~ barthtot + c172code * e42dep + c161sex, data = efc)

out <- estimate_contrasts(m, c("e42dep", "c172code"), estimate = "average")
expect_identical(dim(out), c(66L, 9L))

out <- estimate_contrasts(
m,
c("e42dep=c('independent','slightly dependent','moderately dependent')"),

Check warning on line 43 in tests/testthat/test-estimate_contrasts-average.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=tests/testthat/test-estimate_contrasts-average.R,line=43,col=5,[unnecessary_concatenation_linter] Remove unnecessary c() of a constant.
by = "c172code",
estimate = "average"
)
expect_identical(dim(out), c(9L, 10L))
expect_equal(
out$Difference,
c(
-0.56667, 0.87147, 1.43814, 1.30144, 3.00341, 1.70197, 2.78974,
3.11667, 0.32692
),
tolerance = 1e-4
)

out <- estimate_contrasts(
m,
"e42dep=c('independent','slightly dependent','moderately dependent')",
by = "c172code",
comparison = "b1=b4",
estimate = "average"
)
expect_equal(out$Difference, 1.507576, tolerance = 1e-4)

out <- estimate_contrasts(m, "e42dep", by = "c172code=c('low','mid')", estimate = "average")
expect_identical(dim(out), c(12L, 10L))

out <- estimate_contrasts(m, "e42dep=c('independent','slightly dependent')", by = "c172code=c('low','mid')", estimate = "average")

Check warning on line 69 in tests/testthat/test-estimate_contrasts-average.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=tests/testthat/test-estimate_contrasts-average.R,line=69,col=121,[line_length_linter] Lines should not be more than 120 characters. This line is 132 characters.
expect_identical(dim(out), c(2L, 10L))
})
87 changes: 87 additions & 0 deletions tests/testthat/test-estimate_contrasts.R
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@
estim <- suppressMessages(estimate_contrasts(model, by = "Species=c('versicolor', 'virginica')", backend = "emmeans"))
expect_identical(dim(estim), c(1L, 9L))

estim <- suppressMessages(estimate_contrasts(model, contrast = "Species=c('versicolor', 'virginica')", backend = "marginaleffects"))

Check warning on line 32 in tests/testthat/test-estimate_contrasts.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=tests/testthat/test-estimate_contrasts.R,line=32,col=121,[line_length_linter] Lines should not be more than 120 characters. This line is 134 characters.
expect_identical(dim(estim), c(1L, 9L))
})

Expand All @@ -50,7 +50,7 @@
expect_identical(dim(estim), c(3L, 10L))
estim <- suppressMessages(estimate_contrasts(model, contrast = "Species", backend = "marginaleffects"))
expect_identical(dim(estim), c(3L, 9L))
estim <- suppressMessages(estimate_contrasts(model, contrast = "Species", by = "fac='A'", backend = "marginaleffects"))

Check warning on line 53 in tests/testthat/test-estimate_contrasts.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=tests/testthat/test-estimate_contrasts.R,line=53,col=121,[line_length_linter] Lines should not be more than 120 characters. This line is 121 characters.
expect_identical(dim(estim), c(3L, 10L))
})

Expand Down Expand Up @@ -110,11 +110,11 @@
# estim <- suppressMessages(estimate_contrasts(model, by = "all", backend = "marginaleffects"))
# expect_identical(dim(estim), c(12L, 11L))

estim <- suppressMessages(estimate_contrasts(model, contrast = c("vs", "am"), by = "gear='5'", backend = "marginaleffects"))

Check warning on line 113 in tests/testthat/test-estimate_contrasts.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=tests/testthat/test-estimate_contrasts.R,line=113,col=121,[line_length_linter] Lines should not be more than 120 characters. This line is 126 characters.
expect_identical(dim(estim), c(6L, 10L))
expect_equal(estim$Difference, c(6.98333, 11.275, 18.25833, 4.29167, 11.275, 6.98333), tolerance = 1e-4)

estim <- suppressMessages(estimate_contrasts(model, contrast = c("vs", "am"), by = "gear", backend = "marginaleffects"))

Check warning on line 117 in tests/testthat/test-estimate_contrasts.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=tests/testthat/test-estimate_contrasts.R,line=117,col=121,[line_length_linter] Lines should not be more than 120 characters. This line is 122 characters.
expect_identical(dim(estim), c(18L, 10L))
expect_named(estim, c("Level1", "Level2", "gear", "Difference", "SE", "CI_low", "CI_high", "t", "df", "p"))
expect_equal(
Expand Down Expand Up @@ -810,3 +810,90 @@
)
)
})


test_that("estimate_contrast, filter by numeric values", {
skip_if_not_installed("lme4")
data(iris)
mod <- lm(Sepal.Length ~ Petal.Width * Species, data = iris)
out1 <- estimate_contrasts(mod, contrast = "Species=", by = "Petal.Width=c(1,2,3)", backend = "marginaleffects")
out2 <- estimate_contrasts(mod, contrast = "Species=", by = "Petal.Width=c(1,2,3)", backend = "emmeans")
expect_identical(dim(out1), c(9L, 10L))
expect_identical(dim(out2), c(9L, 10L))
expect_equal(
out1$Difference,
c(-0.23635, 0.2129, 0.44924, 0.25985, -0.06644, -0.32629, 0.75604, -0.34579, -1.10183),
tolerance = 1e-4
)
expect_equal(
out2$Difference,
c(0.23635, -0.25985, -0.75604, -0.2129, 0.06644, 0.34579, -0.44924, 0.32629, 1.10183),
tolerance = 1e-4
)

out1 <- estimate_contrasts(mod, contrast = "Species=c('versicolor','setosa')", by = "Petal.Width=c(1,2,3)", backend = "marginaleffects")
out2 <- estimate_contrasts(mod, contrast = "Species=c('versicolor','setosa')", by = "Petal.Width=c(1,2,3)", backend = "emmeans")
expect_identical(dim(out1), c(3L, 10L))
expect_identical(dim(out2), c(3L, 10L))
expect_equal(out1$Difference, -1 * out2$Difference, tolerance = 1e-4)

data(CO2)
mod <- suppressWarnings(lme4::lmer(uptake ~ conc * Plant + (1 | Type), data = CO2))
out1 <- estimate_contrasts(mod, contrast = "Plant", by = "conc=c(100,200)", backend = "marginaleffects")
out2 <- estimate_contrasts(mod, contrast = "Plant", by = "conc=c(100,200)", backend = "emmeans")
expect_identical(dim(out1), c(132L, 10L))
expect_identical(dim(out2), c(132L, 10L))

out1 <- estimate_contrasts(mod, contrast = "Plant=c('Qn1','Qn2','Qn3')", by = "conc=c(100,200)", backend = "marginaleffects")
out2 <- estimate_contrasts(mod, contrast = "Plant=c('Qn1','Qn2','Qn3')", by = "conc=c(100,200)", backend = "emmeans")
expect_identical(dim(out1), c(6L, 10L))
expect_identical(dim(out2), c(6L, 10L))
expect_equal(out1$Difference[c(1, 6)], -1 * out2$Difference[c(1, 6)], tolerance = 1e-4)

out1 <- estimate_contrasts(mod, contrast = "Plant=c('Qn1','Qn2','Qn3')", backend = "marginaleffects")
out2 <- estimate_contrasts(mod, contrast = "Plant=c('Qn1','Qn2','Qn3')", comparison = "b1=b2", backend = "marginaleffects")
expect_equal(out1$Difference[1], -1 * out2$Difference, tolerance = 1e-4)

out1 <- estimate_contrasts(mod, contrast = "conc", by = "Plant=c('Mc2','Mn1','Qn3')")
expect_equal(out1$Difference, c(0.01746, 0.01782, 0.00036), tolerance = 1e-3)

out <- estimate_contrasts(mod, contrast = "conc", by = "Plant=c('Mc2','Mn1','Qn3')", comparison = "b1=b2")
expect_equal(out$Difference, -0.01745914, tolerance = 1e-4)
})


test_that("estimate_contrast, filterin in `by` and `contrast`", {
data(efc, package = "modelbased")
efc <- datawizard::to_factor(efc, c("c161sex", "c172code", "e16sex", "e42dep"))
levels(efc$c172code) <- c("low", "mid", "high")
m <- lm(neg_c_7 ~ barthtot + c172code * e42dep + c161sex, data = efc)

out <- estimate_contrasts(m, c("e42dep", "c172code"))
expect_identical(dim(out), c(66L, 9L))

out <- estimate_contrasts(
m,
c("e42dep=c('independent','slightly dependent','moderately dependent')"),
by = "c172code"
)
expect_identical(dim(out), c(9L, 10L))
expect_equal(
out$Difference,
c(
-0.77851, 0.12142, 0.89993, 0.87674, 1.97996, 1.10322, 2.69591,
2.59613, -0.09978
),
tolerance = 1e-4
)

out <- estimate_contrasts(
m,
"e42dep=c('independent','slightly dependent','moderately dependent')",
by = "c172code",
comparison = "b1=b4"
)
expect_equal(out$Difference, 1.163197, tolerance = 1e-4)

out <- estimate_contrasts(m, "e42dep", by = "c172code=c('low','mid')")
expect_identical(dim(out), c(12L, 10L))
})
Loading