Add leader fallback for worker file imports #5189
This looks pretty good, but I think one of the new functions is missing a docstring, and I'm not sure about the clarity of some of the variable/function names.
src/toil/cwl/cwltoil.py
Outdated
)

# files with a associated filesize that are valid to be imported on workers
valid_files_to_data = dict()
This isn't really mapping to file data (bits in the file), right? This is metadata consisting of file size and... something.
Maybe this should be renamed and have a comment explaining what's actually in the values? Or an inline type hint and not just auto-typing?
Co-authored-by: Adam Novak <[email protected]>
src/toil/cwl/cwltoil.py
Outdated
files_to_data = get_file_sizes(
    filenames, toil._jobStore, include_remote_files=options.reference_inputs
)

# Mapping of files to metadata for files that will be imported on the worker
# This will consist of files that we were able to get a file size for
worker_files_to_data: dict[str, FileMetadata] = dict()
# Mapping of files to metadata for files that will be imported on the leader
# This will consist of files that we were not able to get a file size for
leader_files_to_data = dict()
I still don't think we want to name these variables `_to_data`. They don't contain file data.
We could maybe change `files_to_data`, `leader_files_to_data`, `worker_files_to_data` to instead be `metadata`, `leader_metadata`, `worker_metadata`?
I think metadata makes sense
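For illustration, here is a rough sketch of how the split might read with the proposed names. It is not the code in this PR: the `FileMetadata` class below is a hypothetical stand-in (the real type's fields may differ), and `split_by_known_size` is an invented helper name. It only assumes what the comments in the diff say, namely that files with a known size go to the workers and files without one fall back to the leader.

```python
from typing import NamedTuple, Optional


class FileMetadata(NamedTuple):
    """Hypothetical stand-in for the real metadata type; fields are assumed."""

    source: str
    size: Optional[int]  # None when no file size could be determined


def split_by_known_size(
    metadata: dict[str, FileMetadata],
) -> tuple[dict[str, FileMetadata], dict[str, FileMetadata]]:
    """Partition file metadata into (worker_metadata, leader_metadata)."""
    # Files whose size is known can be imported on workers.
    worker_metadata: dict[str, FileMetadata] = {}
    # Files whose size is unknown fall back to being imported on the leader.
    leader_metadata: dict[str, FileMetadata] = {}
    for filename, file_metadata in metadata.items():
        if file_metadata.size is not None:
            worker_metadata[filename] = file_metadata
        else:
            leader_metadata[filename] = file_metadata
    return worker_metadata, leader_metadata
```

With explicit type hints on both dictionaries, the names say "metadata" and the annotations say what the values hold, which covers both the naming and the typing concerns raised above.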
Resolves #5135