Feature Request: Crossed random effects for multilevel data #57

Open
jmgirard opened this issue Apr 13, 2022 · 2 comments
Labels
feature (a feature request or enhancement)

Comments

@jmgirard

In tidymodels/TMwR#288, I suggested that {tidyposterior} detect multilevel data (e.g., from rsample::group_vfold_cv()) and add random effects for the grouping/cluster variable to the model comparison analysis. This would usually be a cross-classified model with random effects (e.g., intercepts) for both resample and group. Accounting for this extra level of dependency in the data is important for properly estimating uncertainty.
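
The cross-classified structure described above can be written out as a rough sketch. This is an illustrative assumption, not a model that {tidyposterior} is confirmed to fit: the symbols and indices are introduced here only for illustration, and the sketch presumes that a performance statistic is available for each model within each resample and each group.

```latex
y_{mrg} = \beta_m + u_r + v_g + \varepsilon_{mrg}, \qquad
u_r \sim \mathcal{N}(0, \sigma_u^2), \quad
v_g \sim \mathcal{N}(0, \sigma_v^2), \quad
\varepsilon_{mrg} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

Here $y_{mrg}$ would be the statistic for model $m$ in resample $r$ and group $g$, $\beta_m$ the mean performance of model $m$, $u_r$ a random intercept for the resample, and $v_g$ a random intercept for the group. Dropping $v_g$ leaves any group-level dependency in the residual term.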

I will work on posting some example code below (but am running low on time today).

@juliasilge added the feature label on Apr 13, 2022
@topepo
Member

topepo commented Nov 1, 2023

Can you provide an example? Without our resampling tools, I'm not sure how the resampling statistics will have a hierarchical structure to them.

There is a formula argument where you can pass something custom.

@jmgirard
Author

jmgirard commented Nov 2, 2023

The two main examples would be longitudinal and clustered data. If you have multiple observations per person and are trying to make observation-level predictions, then your estimates of resampling statistic uncertainty will be biased to the extent that predictive performance is more similar within persons (e.g., some people are "easier" to predict than others). Similarly, if you have multiple students from the same classroom and are trying to make student-level predictions, then your estimates of resampling statistic uncertainty will be biased to the extent that predictive performance is more similar within classrooms (e.g., some classrooms are "easier" to predict than others).
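
The point about biased uncertainty can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions, written in plain Python rather than with {tidyposterior} or {rsample}: the function names, the number of persons and observations, and the variance settings are all hypothetical choices made for illustration. Each person gets a shared offset (how "easy" they are to predict), and the standard error of the mean prediction error is computed twice: once treating all observations as independent, and once from per-person means.

```python
import random
import statistics


def simulate_errors(n_persons=30, n_obs=20, person_sd=1.0, obs_sd=0.5, seed=1):
    """Return {person_id: [error, ...]} where each person shares one offset.

    All defaults are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    data = {}
    for person in range(n_persons):
        # Person-level offset: some people are "easier" to predict than others.
        offset = rng.gauss(0.0, person_sd)
        data[person] = [offset + rng.gauss(0.0, obs_sd) for _ in range(n_obs)]
    return data


def naive_se(data):
    """Standard error of the mean, treating every observation as independent."""
    values = [value for errors in data.values() for value in errors]
    return statistics.stdev(values) / len(values) ** 0.5


def cluster_se(data):
    """Standard error of the mean based on per-person means (balanced clusters)."""
    means = [statistics.fmean(errors) for errors in data.values()]
    return statistics.stdev(means) / len(means) ** 0.5


errors = simulate_errors()
print(f"naive SE:   {naive_se(errors):.3f}")
print(f"cluster SE: {cluster_se(errors):.3f}")
```

When the person-level spread is large relative to the observation-level noise, as assumed here, the naive standard error comes out smaller than the cluster-aware one, which is the kind of overconfidence a random effect for the grouping variable is meant to correct.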
