Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Provenance File Inconsistency #317

Open
lucpeterson opened this issue Jun 28, 2021 · 1 comment
Open

[BUG] Provenance File Inconsistency #317

lucpeterson opened this issue Jun 28, 2021 · 1 comment
Assignees
Labels
bug Something isn't working

Comments

@lucpeterson
Copy link
Member

🐛 Bug Report

Describe the bug
The provenance files can become inconsistent, since they are loaded/dumped via yaml. In particular, quotes around strings that exist in the original spec can be removed in the partial spec, which can then be misinterpreted when loaded for the final expanded spec.

I've put a simple example below (where somebody wants to name their queue "1" or "1.0") but I ran into this with the wall time syntax, b/c the "hours:minutes:seconds" if stripped of the quotes gets read in by pyYAML as base 60 and converted to seconds (ruamel doesn't do this, since it's been dropped from the yaml spec since 2009, but the bug is the same). This meant that the wall time was really broken because it was never getting passed in as the string hours:minutes:seconds and instead was always getting set as seconds, with potentially unknown consequences downstream.

To Reproduce
Steps to reproduce the behavior:

  1. Take any workflow with a task_queue (which expects a string)
  2. Rename the task queue to the "string" "1" or "1.0"
  3. merlin run --local --dry that workflow --> error
  4. compare the provenance specs in the merlin_info directory, "1" or "1.0" will have been changed to 1 or 1.0 and reloaded as an int or a float, not a string, which then breaks stuff.

Expected behavior
Strings should be preserved from orig -> partial -> expanded. Maybe we should consider using jinja or something to do a clean copy and replace of orig -> partial and then build the spec out of the partial, so we only have one read from yaml. Maybe a flow like

  1. Read the yaml file, record the variables
  2. Override any from the command line
  3. Resolve the internal variable dependencies (eg X1:"spam", X2:"cheese $(X1)" -> "cheese spam")
  4. Start from the original again and do a file copy with only the variables resolved
    I'm not quite sure what the current logic is

Please answer these questions to help us pinpoint the problem

  • Does the problem occur in merlin run --local mode, distributed mode or neither?

You can see it in --local

  • If a distributed problem, which backend and queue servers are you using? How are they configured?

N/A

  • On what machines/architectures are you running merlin? Is this bug on a specific machine or can you reproduce it elsewhere?

I can do this on my laptop

Please run merlin info and paste the results here:

Additional context
This is obviously a bit of a corner case, but it may also fix some complicated workflows where I've tried to use env variables to define parameters for make-samples and have gotten formatting messed up (eg my list "[0.0, 1.0]" gets modified to [0.0, 1.0] and then breaks the make-samples command. To get around it in production, I've had to define the list as ""[0.0, 1.0]"", which has been brittle. And now that I think about it, these are probably the same issue.

@lucpeterson lucpeterson added the bug Something isn't working label Jun 28, 2021
@ben-bay
Copy link
Contributor

ben-bay commented Jun 28, 2021

This reminds me of an error-checking feature I added to Maestro that never made it into Merlin because of the incompatibility. It's here. Basically it checked that yaml spec types were what we expected so it could catch errors early instead of letting the wrong variable type propagate into the code and break. Not directly related, but we should add this in anyway.

@AlexanderWinterLLNL I'm tagging you because I want you to see this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants