
Optimize GHC for larger datasets #510

Closed

juliohm opened this issue Feb 5, 2025 · 1 comment

Comments

juliohm (Member) commented Feb 5, 2025

The current implementation uses sparse matrices to handle large datasets with a low range parameter, but preliminary steps still use dense matrices. We need to optimize memory usage and then decide on a maximum number of samples above which we switch to a pipeline with sub-sampling and nearest-neighbor interpolation.
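For context, a back-of-envelope sketch of the memory gap between the dense and sparse representations; the sample count n and neighbor count k below are illustrative, not values from the package:

```julia
# Back-of-envelope memory estimate for an n×n pairwise matrix used in
# preliminary steps (all sizes below are hypothetical).
n = 10^5                   # number of samples
dense_bytes = n^2 * 8      # dense n×n Float64 matrix ≈ 80 GB

# With a compactly supported kernel of range λ, only pairs within the
# range interact, so a sparse matrix stores roughly n*k nonzero entries.
k = 50                     # assumed average number of neighbors within range
sparse_bytes = n * k * 16  # ≈ 8 bytes per value + ≈ 8 bytes per index ≈ 80 MB

println((dense_GiB = dense_bytes / 2^30, sparse_MiB = sparse_bytes / 2^20))
```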

juliohm (Member, Author) commented Feb 6, 2025

Fixed in GeoStatsTransforms.jl v0.10.2 with the introduction of a new nmax option. The option can be used to sub-sample the input geotable and avoid unbounded memory consumption. The clustering result is then interpolated back to the original domain with nearest neighbors.
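A minimal usage sketch, assuming the GHC(k, λ) signature from GeoStatsTransforms.jl; the geotable and the parameter values (k = 5, λ = 1.0, nmax = 1000) are illustrative, with only the nmax keyword itself coming from the fix above:

```julia
using GeoStats  # reexports GeoStatsTransforms

# hypothetical geotable: one million samples on a regular grid
gtb = georef((z = rand(10^6),), CartesianGrid(1000, 1000))

# GHC with k = 5 clusters and range λ = 1.0; nmax caps the number of
# samples used for clustering, and the labels are extended back to the
# full domain with nearest-neighbor interpolation (per the fix above)
ctb = gtb |> GHC(5, 1.0, nmax = 1000)
```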

juliohm closed this as completed Feb 6, 2025