Documentation says yogrt_remaining can only be called on rank 0 #15

Open
tepperly opened this issue Jul 31, 2023 · 3 comments

Comments

@tepperly
Member

The documentation says that yogrt_remaining() can only be called on process 0. Presumably process means rank in an MPI context?

(This call may ONLY be used by process rank zero of a parallel job.)

It also says that it will return -1 when called by any process (rank?) not equal to 0.
Returns -1 when called from any task other than task zero.

On lassen, it appears that yogrt_remaining() returns reasonable positive values when called from rank 1.

@tepperly
Member Author

I am not really asking you to remove functionality. We're just trying to write tests in LiDO based on the documented behavior. Perhaps the best possible resolution would be a preprocessor macro (e.g., LIBYOGRT_WORKS_ON_ALL_RANKS) indicating whether yogrt_remaining() works on all ranks or only rank 0. This would allow us to adapt our tests to know whether to expect positive values on ranks other than zero.
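
To make the proposal concrete, here is a rough sketch of how a test could branch on such a macro. LIBYOGRT_WORKS_ON_ALL_RANKS is only the name suggested above and does not exist in libyogrt today; the helper `remaining_matches_expectation` is hypothetical, and the sketch assumes a non-negative return value means "reasonable" while -1 is the documented value for tasks other than zero.

```c
#include <stdbool.h>

/*
 * Hypothetical test helper. LIBYOGRT_WORKS_ON_ALL_RANKS is the macro
 * proposed in this issue; libyogrt does not define it today.
 *
 * rank      - the rank of the calling process
 * remaining - the value that yogrt_remaining() returned on that rank
 */
static bool remaining_matches_expectation(int rank, int remaining)
{
#ifdef LIBYOGRT_WORKS_ON_ALL_RANKS
    /* Every rank is expected to see a usable (non-negative) value. */
    (void)rank;
    return remaining >= 0;
#else
    /* Documented behavior: only rank 0 gets a value, all others get -1. */
    if (rank == 0) {
        return remaining >= 0;
    }
    return remaining == -1;
#endif
}
```

With the macro undefined, a positive value on rank 1 (what we see on lassen) is flagged as a mismatch against the documentation, which is the case our tests currently cannot classify.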

@morrone
Member

morrone commented Jul 31, 2023

lassen is using the lsf backend for remaining time. We could perhaps fix that backend to correctly return -1 on non-zero tasks. According to IBM documentation, there may be an environment variable LS_JOBPID that will tell us the process ID.

@morrone
Member

morrone commented Jul 31, 2023

Presumably process means rank in an MPI context?

Technically, it is process rank 0 as determined by the job launcher. MPI bootstrap processes (almost?) always use this number to assign ranks in MPI_COMM_WORLD. libyogrt is not an MPI application, so it only knows what the launcher tells it, typically through environment variables. So it would be entirely possible for an alternate MPI communicator to not contain that process at all.
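
As an illustration of "what the launcher tells it", a non-MPI library could discover its rank roughly like this. This is a sketch only, not libyogrt's actual code: the helpers `parse_rank` and `launcher_rank` are hypothetical, and the list of variables (set by Slurm, PMIx, Open MPI, and PMI-based launchers respectively) is an assumption about which launchers are in play, not a statement of what any libyogrt backend reads.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a non-negative decimal rank; return -1 if the value is missing
 * or is not a valid rank. Hypothetical helper, not part of libyogrt. */
static int parse_rank(const char *value)
{
    if (value == NULL || *value == '\0') {
        return -1;
    }
    char *end = NULL;
    errno = 0;
    long n = strtol(value, &end, 10);
    if (errno != 0 || *end != '\0' || n < 0 || n > INT_MAX) {
        return -1;
    }
    return (int)n;
}

/* Return the rank assigned by the job launcher, or -1 if none of the
 * candidate variables is set. The candidate list is an assumption. */
static int launcher_rank(void)
{
    static const char *const candidates[] = {
        "SLURM_PROCID",          /* Slurm srun */
        "PMIX_RANK",             /* PMIx-based launchers */
        "OMPI_COMM_WORLD_RANK",  /* Open MPI mpirun */
        "PMI_RANK",              /* PMI-based launchers */
    };
    for (size_t i = 0; i < sizeof candidates / sizeof candidates[0]; i++) {
        int rank = parse_rank(getenv(candidates[i]));
        if (rank >= 0) {
            return rank;
        }
    }
    return -1;
}
```

Note that nothing here touches an MPI communicator, which is why the number only coincides with an MPI rank when the MPI bootstrap chooses to reuse it.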
