De-couple file upload/download from scheduler communication methods #5084
Comments
I think it would generally be a good idea to conceptually separate these two tasks. What is your estimate as to how much refactoring and restructuring that would require? Also, it might be worthwhile to concretize this in an AEP.
For the fireworks scheduler, the simple solution for this would be to expose the At the moment, the scheduler has to download the submission script from the remote, parse it, and submit the job to the launchpad server. Below is the line for the current implementation:
Well, let's say for now, I think it's plausible, but certainly not trivial. At least a few weeks of labour, so not likely to happen in the short-term 😬
Thanks @chrisjsewell for starting this issue (and @zhubonan for your presentation on the aiida-fireworks-scheduler!). Here are a few thoughts:
So, here is what I think is a simple way to implement this (at least to solve the issue of @zhubonan where, during the submission, he needs to SSH to the scheduler just to fetch back the script, and he has to parse it back):
If you agree that this is a good approach, and @zhubonan confirms that this would solve his problem (move his code from |
@giovannipizzi Thanks, I think what you suggest should work well! This would also make the plugin code more concise, as there is no need to do a round-trip of generating a "fake" script during upload and parsing it back in for submit. I would like to add that the fireworks scheduler still uses the attached transport object to get the computer and username as identifiers, so one won't get jobs of other machines or other accounts on the same machine (it is possible that the user has two accounts on the same machine, created as two separate E.g. these two lines below should be kept: aiida-core/aiida/engine/daemon/execmanager.py Lines 328 to 329 in 287d138
If we want the scheduler to work without the |
Note I'll probably be looking to do this, in conjunction with https://github.com/aiidateam/aiida-firecrest
I'm adding a comment to remember, when redesigning this, to take (at least) all the following use cases into account:
A few things learnt from aiida-hyperqueue (@mbercx edit this comment if I'm forgetting something), see also aiidateam/aiida-hyperqueue#2
Currently, aiida uses the same Transport plugin (e.g. direct or ssh) for uploading/downloading files to/from the Computer, as it does for communicating with a Scheduler via command executions (e.g. to poll for completed jobs).
There are potential use cases, though, where we may want separate methods to achieve these tasks. One example is that SLURM now has a REST API (https://slurm.schedmd.com/rest.html), so you may want to upload files with SSH and then use the REST API to control SLURM.
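To make the proposed separation concrete, here is a minimal sketch of what two independent interfaces could look like. All names (`FileTransfer`, `SchedulerClient`, `upload_and_submit`, and the in-memory stand-ins) are hypothetical and are not part of the aiida-core API; the in-memory classes only stand in for, say, an SSH-based file transfer and a REST-based scheduler client.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol


class FileTransfer(Protocol):
    """Hypothetical interface: moves files to/from the computer (e.g. over SSH)."""

    def put(self, remote_path: str, content: bytes) -> None: ...

    def get(self, remote_path: str) -> bytes: ...


class SchedulerClient(Protocol):
    """Hypothetical interface: talks to the scheduler (e.g. over a REST API)."""

    def submit(self, workdir: str, script_name: str) -> str: ...

    def status(self, job_id: str) -> str: ...


@dataclass
class InMemoryTransfer:
    """Stand-in for a real file transfer backend; stores files in a dict."""

    files: Dict[str, bytes] = field(default_factory=dict)

    def put(self, remote_path: str, content: bytes) -> None:
        self.files[remote_path] = content

    def get(self, remote_path: str) -> bytes:
        return self.files[remote_path]


@dataclass
class InMemoryScheduler:
    """Stand-in for a real scheduler backend; tracks job states in a dict."""

    jobs: Dict[str, str] = field(default_factory=dict)

    def submit(self, workdir: str, script_name: str) -> str:
        job_id = str(len(self.jobs) + 1)
        self.jobs[job_id] = "QUEUED"
        return job_id

    def status(self, job_id: str) -> str:
        return self.jobs[job_id]


def upload_and_submit(
    transfer: FileTransfer,
    scheduler: SchedulerClient,
    workdir: str,
    script: bytes,
) -> str:
    """Upload via one channel, submit via another; neither knows about the other."""
    transfer.put(f"{workdir}/submit.sh", script)
    return scheduler.submit(workdir, "submit.sh")
```

With this shape, the calling code depends only on the two interfaces, so the file-transfer backend and the scheduler backend could be swapped independently of each other.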
This also came out of the meeting with @zhubonan, regarding a Fireworks scheduler
cc also @csadorf @giovannipizzi, correct me if I'm wrong in my takeaway?