
Optimize GHC for larger datasets #510

Closed

juliohm opened this issue Feb 5, 2025 · 1 comment

Comments

juliohm (Member) commented Feb 5, 2025

The current implementation uses sparse matrices to handle large datasets with a low range parameter, but preliminary steps still use dense matrices. We need to optimize memory usage and then decide on a maximum number of samples above which we switch to a pipeline with sub-sampling and nearest-neighbor interpolation.
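For context, a back-of-envelope sketch of the memory gap between the dense and sparse representations; the sample count n and neighbor count k below are illustrative, not values from the package:

```julia
# Back-of-envelope memory estimate for an n×n pairwise matrix used in
# preliminary steps (all sizes below are hypothetical).
n = 10^5                   # number of samples
dense_bytes = n^2 * 8      # dense n×n Float64 matrix ≈ 80 GB

# With a compactly supported kernel of range λ, only pairs within the
# range interact, so a sparse matrix stores roughly n*k nonzero entries.
k = 50                     # assumed average number of neighbors within range
sparse_bytes = n * k * 16  # ≈ 8 bytes per value + ≈ 8 bytes per index ≈ 80 MB

println((dense_GiB = dense_bytes / 2^30, sparse_MiB = sparse_bytes / 2^20))
```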

juliohm (Member, Author) commented Feb 6, 2025

Fixed in GeoStatsTransforms.jl v0.10.2 with the introduction of a new nmax option. The option can be used to sub-sample the input geotable and avoid unbounded memory consumption. The clustering result is then interpolated back to the original domain with nearest neighbors.
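A minimal usage sketch, assuming the GHC(k, λ) signature from GeoStatsTransforms.jl; the geotable and the parameter values (k = 5, λ = 1.0, nmax = 1000) are illustrative, with only the nmax keyword itself coming from the fix above:

```julia
using GeoStats  # reexports GeoStatsTransforms

# hypothetical geotable: one million samples on a regular grid
gtb = georef((z = rand(10^6),), CartesianGrid(1000, 1000))

# GHC with k = 5 clusters and range λ = 1.0; nmax caps the number of
# samples used for clustering, and the labels are extended back to the
# full domain with nearest-neighbor interpolation (per the fix above)
ctb = gtb |> GHC(5, 1.0, nmax = 1000)
```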

juliohm closed this as completed Feb 6, 2025