Improve performance v0.11.0 #1015

Draft: wants to merge 2 commits into main from 1014-improve-performance-v0_11_0

Conversation

@datejada (Member) commented on Feb 5, 2025

  • Use JuMP.add_to_expression! and dataframe subsetting when creating the constraints over clustered years (inter-temporal constraints) to improve efficiency (a rough sketch of the difference is below).
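A minimal sketch of why this helps, assuming a toy model (the function and variable names are illustrative, not the code in this PR): += builds a brand-new affine expression on every iteration, while JuMP.add_to_expression! mutates the existing one in place.

```julia
using JuMP

# Accumulating with `+=` allocates a new AffExpr at every iteration.
function accumulate_with_plus(vars)
    expr = AffExpr(0.0)
    for v in vars
        expr += 2.0 * v
    end
    return expr
end

# add_to_expression! mutates the same AffExpr in place, avoiding those allocations.
function accumulate_in_place(vars)
    expr = AffExpr(0.0)
    for v in vars
        JuMP.add_to_expression!(expr, 2.0, v)
    end
    return expr
end

model = Model()
@variable(model, x[1:10_000])
@time accumulate_with_plus(x)   # many allocations, grows with the number of terms
@time accumulate_in_place(x)    # same expression, far fewer allocations
```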

Related issues

Closes #1014

Checklist

  • I am following the contributing guidelines

  • Tests are passing

  • Lint workflow is passing

  • Docs were updated and workflow is passing

@datejada added the "benchmark PR only - Run benchmark on PR" label on Feb 5, 2025
github-actions bot (Contributor) commented on Feb 5, 2025

Benchmark Results

Benchmark in progress...

codecov bot commented on Feb 5, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.24%. Comparing base (d7cba2a) to head (c99898e).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1015      +/-   ##
==========================================
+ Coverage   95.22%   95.24%   +0.01%     
==========================================
  Files          29       29              
  Lines        1152     1156       +4     
==========================================
+ Hits         1097     1101       +4     
  Misses         55       55              

☔ View full report in Codecov by Sentry.

@datejada force-pushed the 1014-improve-performance-v0_11_0 branch from 48bfb7a to f3cda08 on February 5, 2025 11:47
@abelsiqueira (Member) commented:

I don't know what the error in the benchmark is, but it doesn't seem related to the cancellation due to exceeded time. Does the benchmark run locally?

@datejada force-pushed the 1014-improve-performance-v0_11_0 branch from f3cda08 to c99898e on February 5, 2025 14:24
@datejada (Member, Author) commented on Feb 5, 2025

> I don't know what the error in the benchmark is, but it doesn't seem related to the cancellation due to exceeded time. Does the benchmark run locally?

I forgot to update the duration for the incoming inflows. I fixed it, and it runs locally now.

The results in v0.10.4 are here: #1013 (comment)

I expect this benchmark to take around the same time, which is good since we achieve more or less the same performance we got in that version, but... 20 minutes to create the model is too much :/ Especially since the hourly problem for the whole year (a bigger problem) takes around 5 minutes to create. The bottleneck is the function add_expression_terms_over_clustered_year_constraints!, which is too slow... I wonder what else we can do to improve its performance. Any ideas?

@datejada requested a review from abelsiqueira on February 5, 2025 14:34
@abelsiqueira (Member) commented:

That's the counterpart of https://github.com/TulipaEnergy/TulipaEnergyModel.jl/pull/987/files#diff-43e9d15ce9a457e188412a614efc15efee4860e85525d1b47ef2892074e90d43, right? Hopefully something similar to what was done there will work, i.e., change the loops to use DuckDB tables and explicitly mark some parts with types. I used Cthulhu and @profview while working on that PR to help figure out where to change things.
However, it will be more complicated here since there is also a df_map, so it's not immediately clear how to change things. I can start looking into this to figure out more.
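For reference, a rough sketch of that profiling workflow; the arguments passed to add_expression_terms_over_clustered_year_constraints! below are placeholders, not the function's real signature.

```julia
using ProfileView  # provides @profview (also bundled with the VS Code Julia extension)
using Cthulhu      # provides @descend for interactive inspection of inferred types

# Flame graph showing where time is actually spent.
# NOTE: the arguments are placeholders, not the function's real signature.
@profview add_expression_terms_over_clustered_year_constraints!(model, constraints)

# Descend through the call tree to spot type instabilities,
# i.e., the parts worth annotating with explicit types.
@descend add_expression_terms_over_clustered_year_constraints!(model, constraints)
```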

@datejada (Member, Author) commented on Feb 5, 2025

> JuMP.add_to_expression!

So, to recap (for myself and to keep things clear), we have identified two bottlenecks:

  1. The +=, which is solved using JuMP.add_to_expression!. Yesterday, Joaquim and Oscar commented in the JuMP energy models benchmark meeting, explaining why += is slow and reinforcing that we should improve efficiency using JuMP.add_to_expression!.
  2. The filtering/subsetting of the dataframes, which might improve with a DuckDB version, as in the representative period's counterpart.

The first point leaves us at a similar performance level to v0.10.4 (which is still slow for the use cases we are solving now).

The second point is challenging, but it feels like the way to go (a rough sketch of the idea is below). I will try to think about it, too, but if you have time between the multi-year refactor and this, it would be appreciated.

Thanks!
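For concreteness, a minimal sketch of what pushing the subsetting into DuckDB could look like; the table name var_flow and its columns are invented for illustration and are not TulipaEnergyModel's actual schema.

```julia
using DataFrames
using DuckDB

# Hypothetical stand-in for one of the model's dataframes.
df = DataFrame(
    asset = ["wind", "wind", "ccgt"],
    year  = [2030, 2050, 2030],
    value = [1.0, 2.0, 3.0],
)

con = DBInterface.connect(DuckDB.DB)
DuckDB.register_data_frame(con, df, "var_flow")

# Instead of filtering/subsetting the dataframe inside a Julia loop,
# push the subsetting down into a single SQL query over the registered table.
subset_2030 = DataFrame(
    DBInterface.execute(con, "SELECT asset, year, value FROM var_flow WHERE year = 2030"),
)
```

The idea is that one query over a registered table replaces many per-group dataframe filters inside the Julia loop, letting DuckDB do the subsetting.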

@abelsiqueira (Member) commented:

  1. Yes.
  2. Hopefully, yes. Either way, part of the refactor involves this function, so changing it to DuckDB will be necessary. The rep_period counterpart was done more with the objective of refactoring, and some investigation led to it being a bit faster. That might be the case here as well.

I will try to spend some time on it
