Problem with pglmm Bayes=TRUE model syntax #73
Comments
It seems that @rdinnager is working on this. @Flaiba if you don't need random terms, I think the |
Hi @Flaiba , Thanks for the detailed bug report. I believe the problem here is that your formula is not getting translated into an
Let me know if the above does not work. |
Hi daijiang and rdinnager, On the other hand, I think the problem is as you said, rdinnager. Unfortunately, I used your code, but I still get the same error: "Error in formula.character(object, env = baseenv()) : The response variable is not detected." :( Thank you both very much. Packages like this are essential for building better models, and the chance to ask you for help is an unbeatable opportunity to keep learning. Thanks. |
Hi @Flaiba , Thanks for giving that a try. After the error occurs, could you run |
Hi! First of all, I apologize for not attaching the phylogenetic tree.
M2=pglmm(cbind(Neoplasia, knownDeaths-Neoplasia) ~ log(BodyMass)+log(lifeexp)+ (1|OLRE), data = data, family = "zeroinflated.binomial",cov_ranef = list(sp = phy),add.obs.re = FALSE,bayes=TRUE,verbose=TRUE)
data$fails <- data$knownDeaths - data$Neoplasia
Error in formula.character(object, env = baseenv()) :
and finally, I tried using the ratio Neoplasia/knownDeaths (CMR), but then the problem is with INLA:
Error in inla.inlaprogram.has.crashed() :
Thanks again. I hope that with the phylogenetic tree you can replicate this problem and finally find a solution. |
fixes #73, issue with translating formula
Hi @Flaiba. I've now made a patch which I think should fix your issue. To try it just install the development version of |
I am very grateful for your help. Both are very helpful. I can run this model perfectly.
However, can I use this space to ask two things related to this model? First, I saw in this article (https://daijiang.github.io/phyr/articles/phyr_example_empirical.html) for the package that models can be checked using the DHARMa package. For this model, perhaps after this update, I cannot check the model with DHARMa. simulationOut<- simulateResiduals(fittedModel = M1, plot = FALSE)
Additionally, and I promise this is the last question. Thanks, thanks a lot for your time |
Hmm.. I'm not surprised the zero inflated model didn't work with
We intend to add more user-friendly options for doing this sort of thing in the future, but at the moment it might require manually extracting useful quantities from the model object. Supporting How did you determine that the categorical variable was 'significant'? Were you using the maximum likelihood or Bayesian methods for this? |
In the meantime, you may want to look into the |
Sorry for the delay in responding; I have been away for the last few days. It would be great if model validation worked with DHARMa. In the same way, if phyr objects could be used in other packages like emmeans, that would help users gravitate towards these tools. Regardless of the improvements that may be implemented in the future, I consider this package an excellent tool for building complex models, and a better approach than the options available so far. Once again, thanks for the development and for the help provided to end users. I will follow your suggestions. |
Hi, I am following up on this issue, and I am wondering if the zero-inflated model can now be examined with DHARMa. I am fitting a PGLMM using the Bayesian approach with zero-inflated data and am having two issues.
Thanks for your help, |
Hi @jenmunoz. Unfortunately I still haven't had a chance to look into getting the zero-inflated model working with DHARMa, but it would definitely be good to have this functionality. Thank you also for the bug report for plot_bayes() with the zero-inflated model. I won't have time to look at this issue this week, but I can get to it next week. To expedite the process, it would be very helpful if you could make a minimal reproducible example for this issue (perhaps with simulated data). |
Hi @rdinnager, The issue with DHARMa not working seems to apply to all pglmm models with bayes=TRUE. For instance, if I try to just create posterior predictive simulations and feed them to createDHARMa, it does not work because the function phyr::simulate.communityPGLMM is not working for bayes=TRUE models. Thanks for your reply, here is a reproducible example.
# DATA
data_df<-read.csv("7.df_data_example.csv", na.strings =c("","NA")) %>% na.omit()
# STRUCTURE
ectos_df$species_jetz<-as.factor(ectos_df$species_jetz)
# MODEL
pglmm_bayes <-phyr::pglmm(total_parasites~sociality+scale(elevation)+(1|species_jetz__)+(1|Powder.lvl),
# ASSUMPTIONS CHECK ( )
simulationOutput<-DHARMa::simulateResiduals(fittedModel= pglmm_bayes , plot= TRUE, re.form = NULL ) #quantreg=T #PLOT
Error in Caused by error in
# Create new data ( simulate)
simulate.communityPGLMM(pglmm_bayes, nsim = 10, seed = NULL, re.form = NULL)
Links to data |
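For readers who want to see what "create simulations and feed them to createDHARMa" looks like in general, here is a minimal sketch. It is not a fix for the phyr issue: a plain Poisson `glm` on simulated data stands in for the Bayesian PGLMM, and the simulations are drawn from its fitted means rather than from a posterior. All variable names are illustrative, and the DHARMa call is skipped if the package is not installed.

```r
# Assumption: a Poisson glm is used as a stand-in for the fitted model.
set.seed(1)
n <- 50
x <- rnorm(n)
observed <- rpois(n, exp(0.5 + 0.3 * x))
fit <- glm(observed ~ x, family = poisson)

# Simulate new responses from the fitted means: one column per simulation,
# one row per observation (the layout createDHARMa expects).
mu <- fitted(fit)
nsim <- 250
sims <- replicate(nsim, rpois(n, mu))

if (requireNamespace("DHARMa", quietly = TRUE)) {
  res <- DHARMa::createDHARMa(
    simulatedResponse = sims,
    observedResponse = observed,
    fittedPredictedResponse = mu,
    integerResponse = TRUE
  )
  plot(res)
}
```

For a model fitted with bayes=TRUE, the columns of `sims` would have to be posterior predictive draws, which is the step reported above as failing in simulate.communityPGLMM.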
Thank you for the example and the additional investigation of the bug. Much appreciated. I will try and get to a fix as soon as I can. |
hello phyr team.
I am interested in replicating the analysis carried out by Vincze et al. 2022 (https://doi.org/10.1038/s41586-021-04224-5) with other explanatory variables, using the phyr package, since it might allow a better analysis. For my question, I used the data frame that these authors make available (https://github.com/OrsolyaVincze/VinczeEtal2021Nature/blob/main/SupplementaryData.xls).
Vincze et al. (2022) studied a simple measure of cancer mortality risk (CMR) in mammals in relation to variables such as body size and lifespan. Additionally, it was necessary to incorporate phylogenetic information because the species analyzed are not independent.
The data frame contains:
Variable Response (CMR) = the ratio between the number of cancer-related deaths (Neoplasia) and the total number of individuals whose postmortem pathological records were entered in the database (knownDeaths).
As a first approach, we thought of the following model:
pglmm(cbind(Neoplasia, knownDeaths-Neoplasia) ~ log(BodyMass)+log(lifeexp), data = data, family = "binomial",cov_ranef = list(sp = phy),add.obs.re = FALSE)
However, when trying to make this model, we find some inconveniences.
First of all, I would like to confirm my suspicion: does the phyr library not allow models without any random variable? I have not found any library that lets me fit a binomial model for proportions that accounts for the phylogenetic information without including a random variable. I ask because a binomial model for proportions may not be overdispersed, in which case I would not need a mixed model for these data.
However, it is known that binomial distribution models tend to suffer from overdispersion, so we decided to incorporate a random effects variable (OLRE) to take into account overdispersion. In this way, we can use the library without any inconvenience, using the following model:
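As an illustration of what an observation-level random effect is, the sketch below builds one on a small made-up data frame (the values are hypothetical, not taken from the attached data): OLRE is simply a factor with one level per row, so that each observation gets its own random intercept.

```r
# Hypothetical toy data; column names follow the ones used in this issue.
toy <- data.frame(
  Neoplasia   = c(0, 2, 5, 0),
  knownDeaths = c(20, 35, 50, 22)
)

# One factor level per observation.
toy$OLRE <- factor(seq_len(nrow(toy)))
str(toy)
```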
M1=pglmm(cbind(Neoplasia, knownDeaths-Neoplasia) ~ log(BodyMass)+log(lifeexp)+ (1|OLRE), data = data, family = "binomial",cov_ranef = list(sp = phy),add.obs.re = FALSE)
However, this model fits poorly and shows overdispersion caused by an excess of zeros.
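A rough way to see what "an excess of zeros" means is to compare the number of zeros observed with the number a fitted binomial model would produce. The sketch below does this by hand in base R on simulated data with deliberately injected zeros. The data, the intercept-only `glm`, and all names are assumptions for illustration, and the check is only similar in spirit to DHARMa's testZeroInflation.

```r
set.seed(42)
n <- 200
trials <- sample(5:40, n, replace = TRUE)
y <- rbinom(n, trials, 0.1)
y[runif(n) < 0.3] <- 0                      # inject extra (structural) zeros

# Plain binomial fit that ignores the zero inflation.
fit <- glm(cbind(y, trials - y) ~ 1, family = binomial)
p_hat <- fitted(fit)

# Zeros expected under the fitted model versus zeros actually observed.
sim_zeros <- replicate(1000, sum(rbinom(n, trials, p_hat) == 0))
obs_zeros <- sum(y == 0)
ratio <- obs_zeros / mean(sim_zeros)
ratio   # values well above 1 suggest more zeros than the model expects
```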
So, as a next step, we decided to implement zero-inflated models that this library would allow us to incorporate, using the following model:
M2=pglmm(cbind(Neoplasia, knownDeaths-Neoplasia) ~ log(BodyMass)+log(lifeexp)+ (1|OLRE), data = data, family = "zeroinflated.binomial",cov_ranef = list(sp = phy),add.obs.re = FALSE,bayes=TRUE,verbose=TRUE), but.....
I considered entering the response variable as the proportion (CMR), but according to INLA the response has to be an integer, and my problem is that I have no way to define the weights or the number of trials (Ntrials) for that proportion.
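For context, in a plain binomial GLM the two ways of writing the response carry the same information: a two-column matrix of successes and failures, or the proportion together with the number of trials as weights. The sketch below checks this with base R's `glm` on simulated data (toy values and the `logMass` name are illustrative). It says nothing about how pglmm passes the trials to INLA; it only shows why the `cbind()` form already contains the trial counts.

```r
set.seed(7)
n <- 30
toy <- data.frame(
  knownDeaths = sample(20:200, n, replace = TRUE),
  logMass     = rnorm(n)
)
toy$Neoplasia <- rbinom(n, toy$knownDeaths, plogis(-3 + 0.4 * toy$logMass))
toy$CMR <- toy$Neoplasia / toy$knownDeaths

# Response as (successes, failures).
fit_counts <- glm(cbind(Neoplasia, knownDeaths - Neoplasia) ~ logMass,
                  family = binomial, data = toy)

# Response as a proportion, with the number of trials as weights.
fit_prop <- glm(CMR ~ logMass, family = binomial,
                weights = knownDeaths, data = toy)

all.equal(coef(fit_counts), coef(fit_prop))
```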
I admit that I have problems with the syntax of my model in relation to the response variable.
Could someone help me with the syntax to be able to run the model I need?
Thank you very much for your time and help.
Nicolas, thanks, thanks, thankssssssss........
Attached the database and the script used
`library(ggplot2)
library(ape)
library(car)
library(phytools)
library(phylolm) # phyloglm
library(phyr)
library(DHARMa)
library(dplyr)
str(data)
data$OLRE=as.factor(data$OLRE)
data$Species=as.factor(data$Species)
data$order=as.factor(data$order)
# Species-specific body mass
data$BodyMass <- (data$MaleMeanMass + data$FemaleMeanMass)/2
# phylogeny
phy <- read.nexus("consensus_phylogeny.tre") # consensus vertlife tree
phy <- bind.tip(phy, "Cervus_canadensis", where = which(phy$tip.label=="Cervus_elaphus"),
edge.length=0.5, position = 0.5)
phy <- bind.tip(phy, "Gazella_marica", where = which(phy$tip.label=="Gazella_subgutturosa"),
edge.length=0.5, position = 0.5)
# first model
M0=pglmm(cbind(Neoplasia, Nodeadbycancer) ~ log(BodyMass), data = data, family = 'binomial',cov_ranef = list(sp = phy))
# Message: "No random terms specified, use lm or glm instead". Is it possible to run a model without random effects?
M1=pglmm(cbind(Neoplasia, knownDeaths-Neoplasia) ~ log(BodyMass)+log(lifeexp)+ (1|OLRE), data = data, family = "binomial",cov_ranef = list(sp = phy),add.obs.re = FALSE)
summary(M1)
simulationOut<- simulateResiduals(fittedModel = M1, n = 250,refit = FALSE)
plot(simulationOut) # bad fit
testZeroInflation(simulationOut) # overdispersion by zeros
testOverdispersion(simulationOut) # overdispersion by not zeros is good
M2=pglmm(cbind(Neoplasia, knownDeaths-Neoplasia) ~ log(BodyMass)+log(lifeexp)+ (1|OLRE), data = data, family = "zeroinflated.binomial",cov_ranef = list(sp = phy),add.obs.re = FALSE,bayes=TRUE)
`
dataIZ.txt