Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EAP efficiency #265

Open
eclare108213 opened this issue Nov 30, 2018 · 14 comments
Open

EAP efficiency #265

eclare108213 opened this issue Nov 30, 2018 · 14 comments
Labels

Comments

@eclare108213
Copy link
Contributor

EAP is running significantly slower than EVP in v6, even after efficiency improvements in #257 (originally #255). In v5.1.2, EAP TimeLoop was slower than EVP by roughly a factor of 2, now it is a factor of 3. The dynamics itself is showing nearly a factor of 10. This appears to be due to boundary updates, but it looks like both codes have the same calls. Based on the timers unrelated to EAP, the variation due to machine load is around +/-5%. These are 5-year gx1 QC configurations with the halo masks turned on.

EVP
conrad_intel_smoke_gx1_64x1_medium_qc.qc_evp

Timer 1: Total 5838.90 seconds
Timer 2: TimeLoop 5837.01 seconds
Timer 3: Dynamics 1501.73 seconds
Timer 4: Advection 1205.95 seconds
Timer 5: Column 2458.13 seconds
Timer 6: Thermo 2133.05 seconds
Timer 7: Shortwave 590.82 seconds
Timer 8: Ridging 325.01 seconds
Timer 9: Coupling 867.73 seconds
Timer 10: ReadWrite 647.31 seconds
Timer 11: Diags 744.80 seconds
Timer 12: History 467.37 seconds
Timer 13: Bound 3368.04 seconds
Timer 14: BGC 0.00 seconds

EAP
conrad_intel_smoke_gx1_64x1_medium_qc.qc_eap

Timer 1: Total 15796.37 seconds
Timer 2: TimeLoop 15788.08 seconds
Timer 3: Dynamics 11479.76 seconds
Timer 4: Advection 1171.17 seconds
Timer 5: Column 2402.69 seconds
Timer 6: Thermo 2136.43 seconds
Timer 7: Shortwave 583.50 seconds
Timer 8: Ridging 274.81 seconds
Timer 9: Coupling 913.23 seconds
Timer 10: ReadWrite 708.97 seconds
Timer 11: Diags 746.42 seconds
Timer 12: History 510.62 seconds
Timer 13: Bound 12242.84 seconds
Timer 14: BGC 0.00 seconds

@TillRasmussen
Copy link
Contributor

We can probably do similar improvements that was done with the evp scheme. Currently this is not top priority at DMI but the task should not be too big.

@apcraig
Copy link
Contributor

apcraig commented Dec 6, 2018

I have looked at RASM, comparing the current "colpkg" version we're running in production and the latest CICE6 version and see a significant slowdown there as well.

PFS.a93_a93_rx1.GNC2all 6.7 s/day => 10.7 s/day
PFS.a93_a93_rx1.GNC2skl 11.8 s/day => 16.2 s/day
PFS.w5a_a94.RBR 14.7 s/day => 24.4 s/day

Ice timer output for PFS.a93_a93_rx1.GNC2all is as follows for the colpkg and cice6 versions,

colpkg:
Timer 1: Total 46.10 seconds
Timer 2: TimeLoop 31.17 seconds
Timer 3: Dynamics 15.70 seconds
Timer 4: Advection 4.54 seconds
Timer 5: dEdd 0.00 seconds
Timer 6: Column 34.17 seconds
Timer 7: Thermo 9.91 seconds
Timer 8: Shortwave 62.39 seconds
Timer 9: Meltponds 0.00 seconds
Timer 10: Ridging 57.49 seconds
Timer 11: Cat Conv 0.00 seconds
Timer 12: Coupling 29.01 seconds
Timer 13: ReadWrite 0.00 seconds
Timer 14: Diags 1.21 seconds
Timer 15: History 0.85 seconds
Timer 16: Bound 27.78 seconds

cice6:
Timer 1: Total 74.40 seconds
Timer 2: TimeLoop 51.26 seconds
Timer 3: Dynamics 27.34 seconds
Timer 4: Advection 12.09 seconds
Timer 5: Column 11.49 seconds
Timer 6: Thermo 10.86 seconds
Timer 7: Shortwave 5.54 seconds
Timer 8: Ridging 0.78 seconds
Timer 9: Coupling 0.10 seconds
Timer 10: ReadWrite 0.00 seconds
Timer 11: Diags 1.80 seconds
Timer 12: History 0.81 seconds
Timer 13: Bound 47.23 seconds

The timer output in the two cases might not be apples and apples and there are obviously issues with internal load imbalance with regard to understanding where time is really spent, but it's the data we have at this point.

It is odd that the standalone model did not show a lot of performance degradation as we updated the implementation, but on aggregate, maybe there is.

@eclare108213
Copy link
Contributor Author

I don't think the timers in the colpkg run here are correct. For instance, Timer 6 (Column) should include all the times in Timers 7-11 (with overlap), and yet shortwave and ridging are each larger than the total column time. I think this problem was fixed in #98, so we'd need to compare timings after that or else go all the way back to v5.1.2. It would be good to get a handle on how the timings have changed since v5, and why.

@TillRasmussen
Copy link
Contributor

Is there a reason why stepa is only called on every 10th iteration?
Is it too expensive computational or is there a physical reasoning?

@eclare108213
Copy link
Contributor Author

I don't know. I'll ping Danny F's group and see if he/they can track it down.

@hheorton
Copy link

hheorton commented Sep 16, 2021

Hi everyone, I worked on updating the EAP rheology, originally developed/implemented by @mimi1981

I believe that the decision to update the structure tensor for only once in every 10 dynamics iterations was made by David and Michel.

I believe this decision was made for two reasons.

1 - The structure tensor will change at a much slower rate than the dynamics terms within the elastic solver due to the constraints set on the rate of change of structure tensor components (hard set parameters). Therefore the changes to the structure tensor need not be solved for at such a high time resolution.
2- They believed that having the structure tensor held constant for every 10 iterations will allow easier convergence for the rest of the ice dynamics.

I never investigated this part of the solver. However, I did read all the code many times, and the dynamics part of the code is much more computationally heavy than the structure tensor, so solving both at every time step won't add relatively that much computational time. Alternatively, the tensor solution time may be able to be reduced even further (every 20 dynamical steps for example).

It is of course possible that this difference in time steps may add strange numerical oscillations to the solutions, though I never noticed anything like that.

@TillRasmussen
Copy link
Contributor

Within the update_stress_rdg there seem to be many if statements. Some of these can be avoided by removing dead code.
Will we ever use interpolate_stress_rdg?
x, y, Tany_1 and Tany_2 are not used unless it is . This removes some if statements.

@hheorton
Copy link

Can you direct me to the code and branch you're reading?

@TillRasmussen
Copy link
Contributor

@hheorton Sorry this slipped my attention.
In the main branch of the current version of CICE
CICE/cicecore/cicedyn/dynamics/ice_dyn_eap.F90
subroutine update_stress_rdg
line ~1706 if (interpolate_stress_rdg) then
interpolate_stress_rdg is hard coded to false.

@hheorton
Copy link

@TillRasmussen , thank you for the update. DO you need me to check anything here? Or are you happy to go ahead and undo the hard coding and allow the interpolate option?

@TillRasmussen
Copy link
Contributor

TillRasmussen commented Jan 12, 2023

@hheorton I think that my main question is whether the interpolation is ever of need. If not the part of the code should be removed. I dont mind adding the namelist parameter.

@hheorton
Copy link

@TillRasmussen, well I always used it as it is much more accurate, particularly for high stress situations. Though I'm probably biased as I wrote the code! I'm not sure who else is actively using the EAP rheology at the moment so can't comment on current needs. It might be of particular use when trouble shooting model implementation as by interpolating and thus getting smoother solutions to the stress tensor one removes one possible source of numerical error.

@TillRasmussen
Copy link
Contributor

TillRasmussen commented Jan 13, 2023

The EAP is being used here and there. I will make a namelist parameter.
One motive for me is that I and a colleague is doing some refactorization of the EVP and I would like to do the same for the EAP. This includes removal of if statements. I may be back with more questions about other if statements

@hheorton
Copy link

Happy to have a look at the if statements. From my memory many of them are there to avoid 1/0 errors and atan2(0,0) situations in order to make the code work for various different compilers. These may be better achieved using different methods.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants