Implement GPU Acceleration & Performance Cues for UTCI Polynomial Calculation #382

Open
noamjgal opened this issue Feb 19, 2025 · 1 comment

@noamjgal

I'm running UTCI calculations on large point clouds (30M+ vertices). As someone new to both this codebase and open-source contribution in general, I've experienced a performance progression that many users might encounter:

  1. Started with the base implementation (utci.py), calling it in a list comprehension, which took ~10 minutes for my dataset
  2. Discovered and switched to the NumPy-vectorized version (map/utci.py), reducing runtime to ~10 seconds
  3. Now proposing a GPU-accelerated implementation that could potentially reduce this to ~1 second (see the sketch after this list)
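Roughly, the difference between steps 1 and 2 looks like the sketch below (`poly_stub` is a hypothetical stand-in; the real UTCI approximation is a much larger multi-variable polynomial with fixed published coefficients):

```python
# Minimal sketch of the step-1 vs. step-2 pattern. `poly_stub` is a
# hypothetical stand-in, not the real UTCI polynomial.
import numpy as np

def poly_stub(ta, tr, vel, rh):
    """Toy polynomial in the four UTCI inputs (placeholder coefficients)."""
    d_tr = tr - ta
    return ta + 0.3 * d_tr - 0.5 * vel + 0.001 * rh * ta

n = 1_000_000  # scale to 30M+ to reproduce the timings above
ta = np.random.uniform(10.0, 35.0, n)
tr = ta + 5.0
vel = np.full(n, 1.5)
rh = np.full(n, 50.0)

# Step 1 pattern: one Python-level call per vertex -- minutes at 30M points.
slow = [poly_stub(a, r, v, h) for a, r, v, h in zip(ta, tr, vel, rh)]

# Step 2 pattern: the same arithmetic as whole-array ufunc operations -- seconds.
fast = poly_stub(ta, tr, vel, rh)
```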

This learning curve leads me to suggest both further performance improvements and better discoverability of the optimized implementations.

Proposed approach

  • Add GPU acceleration as an optional feature supporting CUDA and MPS, with CPU fallback (device selection sketched below)
  • Vectorize polynomial evaluation using PyTorch
  • Preserve the exact polynomial coefficients and calculation logic across implementations
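A rough sketch of the shape I have in mind follows. The function names are placeholders rather than existing ladybug-comfort API, and the polynomial body is elided; it would be ported verbatim from map/utci.py:

```python
# Hedged sketch of the proposed GPU path. Names (`_select_device`,
# `utci_torch`) are placeholders, not existing ladybug-comfort API.
import numpy as np
import torch

def _select_device():
    """Prefer CUDA, then Apple MPS, then fall back to CPU."""
    if torch.cuda.is_available():
        return torch.device('cuda')
    if torch.backends.mps.is_available():
        return torch.device('mps')
    return torch.device('cpu')

def utci_torch(ta, tr, vel, rh):
    """Evaluate the UTCI polynomial on the best available device.

    Inputs and output are NumPy arrays. The polynomial terms are elided
    here -- they would use the exact coefficients from map/utci.py.
    """
    device = _select_device()
    # MPS does not support float64, so drop to float32 there (this is
    # one of the numerical-consistency considerations below).
    dtype = torch.float32 if device.type == 'mps' else torch.float64
    ta_t = torch.as_tensor(ta, dtype=dtype, device=device)
    tr_t = torch.as_tensor(tr, dtype=dtype, device=device)
    vel_t = torch.as_tensor(vel, dtype=dtype, device=device)
    rh_t = torch.as_tensor(rh, dtype=dtype, device=device)
    d_tr = tr_t - ta_t
    utci = ta_t + 0.0 * (d_tr + vel_t + rh_t)  # placeholder for the real terms
    return utci.cpu().numpy()  # single device-to-host transfer at the end
```

Keeping all four inputs on the device and doing a single transfer back at the end is what should make this pay off at 30M+ vertices.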

Implementation strategy

The codebase would maintain all three implementations to support different use cases (a possible dispatch pattern is sketched after the list):

  1. Base Python: Simple, no dependencies, good for small datasets
  2. NumPy-vectorized: Excellent for large datasets
  3. GPU-accelerated: Optimal for very large datasets
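One way this could surface to users is a single entry point that dispatches on dataset size and available dependencies. This is a hypothetical sketch: `utci_scalar` and `utci_numpy` stand in for the existing utci.py and map/utci.py functions, `utci_torch` is the GPU sketch above, and the thresholds are illustrative, not benchmarked:

```python
# Hypothetical dispatcher over the three tiers; thresholds would come
# from the benchmarking proposed under "Performance thresholds" below.
def utci_auto(ta, tr, vel, rh):
    n = len(ta)
    if n < 10_000:
        # Small datasets: plain Python, no extra dependencies needed.
        return [utci_scalar(a, r, v, h) for a, r, v, h in zip(ta, tr, vel, rh)]
    use_gpu = False
    if n >= 1_000_000:
        # CPU<->GPU transfer overhead only pays off at scale.
        try:
            import torch  # noqa: F401
            use_gpu = True
        except ImportError:
            use_gpu = False
    if use_gpu:
        return utci_torch(ta, tr, vel, rh)   # GPU sketch above
    return utci_numpy(ta, tr, vel, rh)       # NumPy-vectorized map/utci.py path
```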

Key considerations

  • Package dependencies: Adding PyTorch as an optional dependency
  • Performance thresholds: Document dataset sizes where each implementation becomes optimal
  • Memory overhead: Evaluate CPU-GPU transfer costs for different dataset sizes
  • Testing: Ensure numerical consistency across all implementations (see the test sketch after this list)
  • Documentation: Clear guidance on choosing the appropriate implementation based on use case
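For the testing bullet, a consistency check could look something like this (a pytest-style sketch; `utci_numpy` and `utci_torch` are the stand-in names from above, the input ranges are my understanding of the polynomial's approximate validity ranges, and the tolerances would need to be loosened for float32 on MPS):

```python
import numpy as np

def test_torch_matches_numpy():
    """The GPU path must reproduce the NumPy path within tight tolerances."""
    rng = np.random.default_rng(42)
    n = 10_000
    ta = rng.uniform(-40.0, 45.0, n)       # air temperature, degC
    tr = ta + rng.uniform(-25.0, 65.0, n)  # MRT offset from air temperature
    vel = rng.uniform(0.5, 17.0, n)        # wind speed, m/s
    rh = rng.uniform(5.0, 100.0, n)        # relative humidity, %
    expected = utci_numpy(ta, tr, vel, rh)  # existing map/utci.py path
    actual = utci_torch(ta, tr, vel, rh)    # proposed GPU path
    np.testing.assert_allclose(actual, expected, rtol=1e-10, atol=1e-8)
```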

As this is my first potential contribution, I would greatly appreciate guidance.

@chriswmackey (Member)

Thanks for the proposal, @noamjgal.

My first thought when I read the title was that you were proposing the use of GPU-based ray-tracing for the computation of shortwave solar MRT for outdoor comfort mapping. For that, we have some workflows we recommend with Accelerad. But I see now that you are proposing running the UTCI calculation itself using GPU (assuming that you already have MRT and the other inputs ready to go).

I haven't personally run into this as a bottleneck yet given all of the other parts of my typical UTCI mapping workflows. But, if GPU acceleration for UTCI calculation is something that will make your life better, then we would welcome the contribution.

My only thought is whether this might be better implemented as an extension package for ladybug-comfort rather than as an [extra] on the base ladybug-comfort package, like the numpy integration. @mikkelkp did the implementation of the numpy capability, so I think he's better-equipped to advise on implementation details like that.

As long as the implementation does not interfere with loading the base comfort function modules or the parameter, collection, and chart subpackages in all of the Python environments we use them in (cPython 3.7-3.12 and IronPython 2.7), I support any implementation that is made. You can knock yourselves out adding more complex dependencies to the map and cli subpackages, though I would appreciate graceful handling of things if the deps aren't found.
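To illustrate, the kind of graceful handling meant here is a guarded import along these lines (a sketch; `utci_torch` is the placeholder name from the proposal above):

```python
# Guarded optional import so the map/cli subpackages still load when
# torch isn't installed. `utci_torch` is a placeholder name.
try:
    import torch
except ImportError:  # torch is optional (an [extra] or extension package)
    torch = None

def utci_torch(ta, tr, vel, rh):
    """GPU-accelerated UTCI; raises a clear error if torch is missing."""
    if torch is None:
        raise ImportError(
            'GPU-accelerated UTCI requires pytorch. '
            'Install it with: pip install torch'
        )
    ...  # device selection and polynomial evaluation go here
```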

@chriswmackey added the "new development" and "wish" labels Feb 19, 2025