Add incremental design matrix building #192

Open
zmbc opened this issue Oct 28, 2024 · 1 comment

zmbc commented Oct 28, 2024

Apologies if this already exists, but I could not find an equivalent of patsy's incr_dbuilder (more info). This would be useful for several reasons, one of which is parallelizing the actual building across cores.

matthewwardrop (Owner) commented

Hi @zmbc! Unfortunately, formulaic does not support incremental building, but in principle it would not be too difficult to implement. The formulaic equivalent of patsy's DesignInfo is the ModelSpec. If you know the categories in advance, you could construct the ModelSpec yourself.

You could also construct your ModelSpec on a subset of the data that is guaranteed to have all categories represented, and then just apply the ModelSpec in parallel. Since ModelSpec is guaranteed to be serializable, it is easy to pass around between processes, etc.

The only thing I think is missing is the initial scan of the data to make sure that formulaic detects all categories... is that right?
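
For illustration, such a scan could be done outside formulaic in one streaming pass. The sketch below is plain Python, not part of formulaic's API; `collect_categories` is a hypothetical helper, and each chunk is assumed to be an iterable of row dictionaries:

```python
from typing import Dict, Hashable, Iterable, List, Mapping, Sequence


def collect_categories(
    chunks: Iterable[Iterable[Mapping[str, Hashable]]],
    columns: Sequence[str],
) -> Dict[str, List]:
    """Return the sorted union of values seen in each column across chunks."""
    seen = {col: set() for col in columns}
    for chunk in chunks:  # only one chunk needs to be in memory at a time
        for row in chunk:
            for col in columns:
                seen[col].add(row[col])
    return {col: sorted(values) for col, values in seen.items()}


# Hypothetical chunks; no single chunk contains every category.
chunks = [
    [{"group": "a", "x": 0.1}, {"group": "b", "x": 0.2}],
    [{"group": "c", "x": 0.3}, {"group": "a", "x": 0.4}],
]
categories = collect_categories(chunks, ["group"])
```

The resulting category lists could then be used to build a small frame with every level represented, from which the ModelSpec is constructed as described above.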
