Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

revisit logic for choosing number of threads in various FINUFFT tasks #439

Merged
merged 3 commits into from
May 13, 2024

Conversation

ahbarnett
Copy link
Collaborator

@ahbarnett ahbarnett commented May 8, 2024

material changes are

  • remove capping at omp_get_max_nthreads(), based on Marco & team discussion 5/7/24
  • warning user if they do the above; and if they try nthr>1 in OMP=OFF build
  • bin-sort max thread number now properly overridden by spopts.nthreads, and (with higher priority) by spopts.sort_threads (which also overrides the drop-to-single-thread heuristic)
  • no thread caps in spreadinterp; except if OMP=OFF now enforce nthr=1.

Here's detailed logic notes (also in gdoc):

OMP=OFF: (single-threaded). Always enforce nthr=1; warning raised (makeplan stage only) if user sets opts.nthreads>1.

OMP on: user can set any nonzero nthr; warning given if showwarn=1 & exceeds omp_max_threads().

At the makeplan stage, opts.nthreads is rewritten with actual # threads used (by fft, etc), thus it is never zero in downstream calls (setpts, exec). Spread_opts.nthreads is copied from opts.nthreads at this stage, so it is never 0.

FFT : uses opts.nthreads.

Spreadsorted and interpsorted: use spopts.nthreads (default 0 giving max_omp, but in FINUFFT context it is never zero; this would only occur in stand-alone testers for spreadinterp); unless OMP=OFF then use 1.

Bin sort (in spreadinterp.cpp): logic is bit trickier than spreadsorted or interpsorted. spopts.sort_threads (default 0 set in setup_spreader): if >0 acts as a high-priority override of sorting # threads (but overridden by OMP=OFF, in which case single-thread always used).

Otherwise, maxnthr is set by spopts.nthreads (or if 0, which never occurs in FINUFFT context, by querying omp_max…()), and then a heuristic is used to do single-thread or use maxnthr.

(Warnings are absent in spreadinterp.cpp since we don’t have a spread_opts.showwarn to switch them off, and the FINUFFT user will have been warned already). 

ahbarnett added 3 commits May 7, 2024 20:33
… ifndef _OPENMP hard nthr=1 override; warnings at the finufft level only; binsort controlled by spopts.sort_threads matching docs
….nthreads (which also does not override the heuristic, merely overrides omp as the max possible)
@ahbarnett ahbarnett requested a review from lu1and10 May 8, 2024 01:29
@ahbarnett
Copy link
Collaborator Author

ahbarnett commented May 8, 2024

@DiamonDinoia and @dfm you both might want to chime in on this. Dan, it removes the cap you'd kindly reminded us about. Reason being that sometimes you want to go above OMP-claimed max avail. Default behavior is same as before, of course. Should also be (even more!) impossible to crash when OMP=OFF than PR #431

Sufficiently complicated that maybe a flowchart needed? But logic is "hierarchical" in sense that spreadinterp (with spread_opts) forms a module that does something sensible. FINUFFT is a user of that module.

Copy link
Member

@lu1and10 lu1and10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic looks good to me, complicated but clear.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also respect opts.nthreads for this omp loop?it's not crucial, just being ocd

#pragma omp parallel for num_threads(maxnthr) schedule(static,1000000)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second this, just for consistency. It's the right time to do it

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the heuristic sort_nthr = (10*M>N) ? maxnthr : 1;
should not apply to the simple multithreaded assignment of the identity permutation here, so I stand by what I had.

@ahbarnett ahbarnett requested review from blackwer and lu1and10 May 8, 2024 19:45
@DiamonDinoia
Copy link
Collaborator

I agree with the logic.

I think this check should be in a function somewhere as there is code duplication between the two places.

Copy link
Member

@blackwer blackwer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Approving, but Libin's suggestion would be nice to implement now.

@dfm
Copy link
Contributor

dfm commented May 9, 2024

This looks very sensible to me!

@ahbarnett ahbarnett merged commit 79de084 into master May 13, 2024
16 checks passed
@ahbarnett ahbarnett deleted the nthr_again branch June 5, 2024 00:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants