warnings/approximation of observation-level variance in .get_variance_distributional #877
Ben, I added a test (see file `test-r2_nakagawa_MuMIn.R` in #883) to check results; I will then revise the computation. One difference I found is the computation of the fixed effects variance. insight uses `var(as.vector(fixef(m) %*% t(getME(m, "X"))))`, while MuMIn uses `var(fitted(m))`. They are often very similar, but sometimes (e.g., for a Poisson mixed model with random slope) they can differ slightly. What would you say is the more accurate approach to extract the fixed effects variance?
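As a minimal sketch of the two approaches (the model formula and use of the `Salamanders` data with lme4 are my own illustration, not from the packages' internals): the matrix-product version stays on the link scale and uses only the fixed effects, while `fitted()` is on the response scale and includes the random effects for a GLMM, which plausibly explains small divergences.

```r
# Sketch only: contrast the two "fixed effects variance" computations
# on a Poisson GLMM (assumes lme4 and glmmTMB are installed).
library(lme4)

m <- glmer(count ~ spp + (1 | site), family = poisson(),
           data = glmmTMB::Salamanders)

# insight-style: variance of the fixed-effects linear predictor (link scale)
vf_insight <- var(as.vector(fixef(m) %*% t(getME(m, "X"))))

# MuMIn-style: variance of the fitted values (response scale,
# includes random effects for a GLMM)
vf_mumin <- var(fitted(m))

c(insight = vf_insight, MuMIn = vf_mumin)
```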
What is not yet well tested are zero-inflated and hurdle models from glmmTMB. I'm not sure, though, against which functions/values the test results could be validated?
I think I need more context; what is the "fixed effects variance" supposed to mean / how is it used? Are we talking about the variance on the link/linear-predictor scale or on the data/response scale?
Fixed effects variance: see Eq. 2.4 in https://royalsocietypublishing.org/doi/10.1098/rsif.2017.0213
OK. As far as I can tell quickly, it does seem that …
I think I need some statistical advice, or input on how modelling packages behave. I'm referring to Suppl. 2 of Nakagawa: on page 5, "quasi-Poisson" models are described, however, package glmmadmb is used. Negative binomial models start on page 9. The difference is the log-approximation. I think I would now use:

```r
log(1 + omegaN / lambda)              # quasi-poisson and poisson
log(1 + (1 / lambda) + (1 / thetaN))  # neg-binomial
# where lambda is "mu" in insight
# and omegaN is "vv", while thetaN is "sig"
```

What do you think? The affected code starts at line 656 in fa1ac63, and the Nakagawa code to compute the log-approximation is at lines 764 to 773 in fa1ac63.
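To get a feel for how far apart the two approximations can be, here is a quick numeric sketch; the values of `lambda`, `omegaN`, and `thetaN` are made up for illustration, not taken from a fitted model.

```r
# Hypothetical values only (not from insight or a real model fit)
lambda <- 2.5   # "mu" in insight
omegaN <- 1.3   # "vv" (overdispersion) in insight
thetaN <- 4.0   # "sig" (NB dispersion) in insight

log(1 + omegaN / lambda)              # quasi-poisson / poisson form
log(1 + (1 / lambda) + (1 / thetaN))  # negative-binomial form
```

With these inputs the two forms give noticeably different observation-level variances, which is why routing a family to the wrong branch matters.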
Ok, here's a comparison of the Nakagawa code, performance, and MuMIn. There are minimal differences between Nakagawa and performance, due to a slightly different residual variance; I debugged step by step and compared values, but I'm not sure how this occurs. I'll try to figure it out. The main take-away is: the code suggested by Nakagawa seems not to be correctly implemented in MuMIn for negative binomial. It's probably due to the confusion in the example I've shown above regarding what glmmadmb uses.

**Models**

```r
library(performance)
library(insight)
library(glmmTMB)

glmmTMBr <- glmmTMB::glmmTMB(
  count ~ (1 | site),
  family = glmmTMB::nbinom1(),
  data = Salamanders, REML = TRUE
)
glmmTMBf <- glmmTMB::glmmTMB(
  count ~ mined + spp + (1 | site),
  family = glmmTMB::nbinom1(),
  data = Salamanders, REML = TRUE
)
```

**Code by hand from Nakagawa suppl.**

```r
# Calculation of the variance in fitted values
VarF <- var(as.vector(get_modelmatrix(glmmTMBf) %*% fixef(glmmTMBf)$cond))
# this is "mu" in insight
lambda <- as.numeric(exp(fixef(glmmTMBr)$cond + 0.5 * (as.numeric(VarCorr(glmmTMBr)$cond[1]))))
# this is "sig" in insight
thetaF <- sigma(glmmTMBf) # note that theta is called alpha in glmmadmb
# this is what ".variance_distributional()" returns
VarOdF <- 1 / lambda + 1 / thetaF # the delta method
VarOlF <- log(1 + (1 / lambda) + (1 / thetaF)) # log-normal approximation
VarOtF <- trigamma((1 / lambda + 1 / thetaF)^(-1)) # trigamma function
# R2[GLMM(m)] - marginal R2[GLMM]
R2glmmM <- VarF / (VarF + sum(as.numeric(VarCorr(glmmTMBf)$cond)) + VarOlF)
# R2[GLMM(c)] - conditional R2[GLMM] for full model
R2glmmC <- (VarF + sum(as.numeric(VarCorr(glmmTMBf)$cond))) / (VarF + sum(as.numeric(VarCorr(glmmTMBf)$cond)) + VarOlF)
```

**Comparison lognormal**

```r
# R2 based on Suppl. of Nakagawa et al. 2017, lognormal
c(R2glmmM = R2glmmM, R2glmmC = R2glmmC)
#>   R2glmmM   R2glmmC
#> 0.5860120 0.6931856

# current implementation - not sure *where* this small deviation comes from
# maybe it's the NULL model?
performance::r2_nakagawa(glmmTMBf, null_model = glmmTMBr)
#> # R2 for Mixed Models
#>
#>   Conditional R2: 0.717
#>      Marginal R2: 0.606

# What MuMIn calculates
suppressWarnings(MuMIn::r.squaredGLMM(glmmTMBf)[2, ])
#>       R2m       R2c
#> 0.5172092 0.6117996
```

**Comparison delta**

```r
R2glmmM <- VarF / (VarF + sum(as.numeric(VarCorr(glmmTMBf)$cond)) + VarOdF)
R2glmmC <- (VarF + sum(as.numeric(VarCorr(glmmTMBf)$cond))) / (VarF + sum(as.numeric(VarCorr(glmmTMBf)$cond)) + VarOdF)
# R2 based on Suppl. of Nakagawa et al. 2017, delta
c(R2glmmM = R2glmmM, R2glmmC = R2glmmC)
#>   R2glmmM   R2glmmC
#> 0.5119224 0.6055460

performance::r2_nakagawa(glmmTMBf, null_model = glmmTMBr, approximation = "delta")
#> # R2 for Mixed Models
#>
#>   Conditional R2: 0.642
#>      Marginal R2: 0.543

# What MuMIn calculates
suppressWarnings(MuMIn::r.squaredGLMM(glmmTMBf)[1, ])
#>       R2m       R2c
#> 0.3989273 0.4718856
```

**Reason for minor deviations of performance from Nakagawa**

```r
# The small deviation between performance and Nakagawa seems to be the
# residual variance, though I'm not sure where it differs
vars <- get_variance(glmmTMBf, null_model = glmmTMBr)
all.equal(vars$var.fixed, VarF)
#> [1] TRUE
all.equal(sum(vars$var.intercept), sum(as.numeric(VarCorr(glmmTMBf)$cond)))
#> [1] TRUE
# minor difference - need to check
all.equal(vars$var.residual, VarOlF)
#> [1] "Mean relative difference: 0.1186565"
```

Created on 2024-06-13 with reprex v2.1.0

This is the code to replicate MuMIn:

```r
switch(faminfo$family,
  nbinom2 = ,
  `negative binomial` = (1 / mu) + (1 / sig),
  nbinom = ,
  nbinom1 = ,
  poisson = ,
  quasipoisson = vv / mu,
  vv / mu^2
)
```

This is the code to replicate Nakagawa. Note that `nbinom1` moved up:

```r
switch(faminfo$family,
  nbinom1 = ,
  nbinom2 = ,
  `negative binomial` = (1 / mu) + (1 / sig),
  nbinom = ,
  poisson = ,
  quasipoisson = vv / mu,
  vv / mu^2
)
```
Ok, I found the difference. It's in the internal function `.get_sigma.glmmTMB()`:

```r
.get_sigma.glmmTMB <- function(x, ...) {
  if (stats::family(x)$family == "nbinom1") {
    add_value <- 1
  } else {
    add_value <- 0
  }
  stats::sigma(x) + add_value
}
```

I did this to be in line with, and get the same results as, MuMIn. However, when I use

```r
.get_sigma.glmmTMB <- function(x, ...) {
  stats::sigma(x)
}
```

I get identical results to Nakagawa.
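To see how much that `+ 1` matters downstream, here is a sketch with made-up numbers (neither `lambda` nor `sig_raw` comes from a fitted model) showing how the two variants of the dispersion parameter feed into the log-normal residual variance:

```r
# Hypothetical numbers, to show how the "+ 1" in .get_sigma.glmmTMB()
# propagates into the log-normal observation-level variance for nbinom1
lambda  <- 2.0   # "mu"
sig_raw <- 1.5   # pretend this is what stats::sigma() returns

log(1 + 1 / lambda + 1 / sig_raw)        # sigma used as-is (Nakagawa-style)
log(1 + 1 / lambda + 1 / (sig_raw + 1))  # sigma + 1 (MuMIn-aligned)
```

The two residual variances differ, which then shifts both the marginal and conditional R2, matching the direction of the deviations reported above.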
Merged #883 (warnings/approximation of observation-level variance in `.get_variance_distributional`). Fixes #877.
@easystats/core-team FYI, I have largely revised |
See https://stackoverflow.com/a/78557174/190277

There's a warning thrown when computing (ultimately) `performance::r2_nakagawa()` if `exp(null.fixef)` is less than 6, which has something to do with the validity of the log-normal approximation from Nakagawa et al. 2017. This test supposedly comes from me, but I think I stole it from somewhere else (see the linked question for details). In the meantime, MuMIn has much improved code for these approximations (including exact, delta, log-normal, and trigamma approximations), which should possibly be used instead of what's there now ...