APPS-1271 Update improve manifest docs #354
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1441,24 +1441,51 @@ For an in depth discussion, please see [Missing Call Arguments](MissingCallArgum | |
|
||
# Manifests | ||
|
||
In extreme cases, running compiled workflows can fail due to DNAnexus platform limits on the total size of the input and output JSON documents of a job. An example is a task with many inputs/outputs that is called in scatter over a large collection. In such a case, you can enable manifest support at compile time with the `-useManifests` option. This option causes each generated applet or workflow to accept inputs as a manifest, and to produce outputs as a manifest. | ||
|
||
A manifest is a JSON document that contains all the inputs/outputs that would otherwise be passed directly to/from the applet. A manifest can be specified in one of two ways: via a JSON input, or via a File input (where the file must exist on the platform). | ||
In extreme cases, running compiled workflows can fail due to DNAnexus platform limits on the total size of the input and | ||
output JSON documents of a job. An example is a task with many inputs/outputs that is called in scatter over a large collection. | ||
In such a case, you can enable manifest support at compile time with the `-useManifests` option. | ||
This option causes each generated applet or workflow to accept inputs as an array of manifests, and to produce outputs as a single manifest. | ||
|
||
A manifest is a JSON document that contains all the inputs/outputs that would otherwise be passed directly to/from the | ||
|
||
workflow stage. A manifest can be specified in one of two ways: | ||
1. A `.json` input file (see [Manifest JSON](#manifest-json)) is the recommended way to provide inputs in the manifest format. | ||
Running `java -jar dxCompiler.jar -inputs mymanifest.json` produces `mymanifest.dx.json`, which can be passed to `dx run -f mymanifest.dx.json`. | ||
2. A platform file (`file-xxx`) with the content described in the [Intermediate manifest file inputs and outputs](#intermediate-manifest-file-inputs-and-outputs) | ||
section can be used to pass the manifest output of a stage of one workflow (including the `output` stage) directly as input | ||
to another workflow. This can also be useful when debugging individual stages of a failing workflow. | ||
|
||
## Manifest JSON | ||
|
||
When manifest support is enabled, each applet has an `input_mainfest___` input field of type `hash`, which means that it accepts a JSON document as a string. For example, given the following workflow: | ||
When manifest support is enabled, applet/workflow outputs which are passed from one stage to another (or to the final output | ||
We don't want to inform the user of what the
It's not as simple. It does have this field, as well as a bunch of others, which I could not find the documentation for. APPS-1309 ticket should address that.
Gotcha. Thanks!
||
stage) exist in the form of intermediate manifests. Here we describe the format of the intermediate manifest for informational purposes only. | ||
There is no need to use them as your workflow inputs, as the JSON manifest above is the recommended format. | ||
For example, given the following workflow: | ||
|
||
```wdl | ||
version 1.1 | ||
|
||
task t1 { | ||
input { | ||
File f | ||
} | ||
command <<< | ||
echo "t1: " >>out | ||
cat "~{f}" >>out | ||
>>> | ||
output { | ||
File t1_out = "out" | ||
} | ||
} | ||
|
||
workflow test { | ||
|
||
input { | ||
String s | ||
File f | ||
} | ||
... | ||
call t1 { input: f = f } | ||
output { | ||
Int i | ||
Pair[String, File] p | ||
File wf_out = t1.t1_out | ||
} | ||
} | ||
``` | ||
|
@@ -1470,62 +1497,62 @@ You would write the following manifest: | |
{ | ||
"test.input_manifest___": { | ||
|
||
"s": "hello", | ||
"f": "dx://file-xxx" | ||
"f": "dx://project-aaa:file-xxx" | ||
} | ||
} | ||
``` | ||
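The manifest above can also be assembled programmatically. The following is a minimal Python sketch, assuming the layout shown in the example (input values nested under a `<workflow>.input_manifest___` key). The helper names `build_manifest` and `dx_file_uri` are hypothetical and are not part of dxCompiler.

```python
import json


def dx_file_uri(file_id, project_id=None):
    """Format a dx:// URI, optionally qualified with a project ID.

    Hypothetical helper: it only mirrors the two URI shapes shown in the
    example ("dx://file-xxx" and "dx://project-aaa:file-xxx").
    """
    if project_id:
        return f"dx://{project_id}:{file_id}"
    return f"dx://{file_id}"


def build_manifest(workflow_name, inputs):
    """Nest plain input values under '<workflow>.input_manifest___'."""
    return {f"{workflow_name}.input_manifest___": dict(inputs)}


# Reproduce the example manifest for the `test` workflow.
manifest = build_manifest(
    "test",
    {"s": "hello", "f": dx_file_uri("file-xxx", "project-aaa")},
)
manifest_json = json.dumps(manifest, indent=2)
```

Writing `manifest_json` to `mymanifest.json` would give a file suitable for the `-inputs` option described here.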
|
||
When you compile the workflow, provide the manifest using the `-inputs` option, and it will be translated to: | ||
Compile the workflow `test` from above with the `-inputs mymanifest.json` option. A new file `mymanifest.dx.json` will be | ||
created with the following content. **NOTE:** `mymanifest.dx.json` is created by the compiler; you do not need to | ||
create or edit it manually. | ||
|
||
|
||
`mymanifest.dx.json` | ||
|
||
```json | ||
{ | ||
"input_manifest___": { | ||
"s": "hello", | ||
"f": { | ||
"$dnanexus_link": "file-xxx" | ||
} | ||
}, | ||
"input_manifest___files": [ | ||
{ | ||
"$dnanexus_link": "file-xxx" | ||
"encoded": false, | ||
"types": { | ||
"f": "File", | ||
"s": "String" | ||
}, | ||
"values": { | ||
"s": "hello", | ||
"f": "dx://project-aaa:file-xxx" | ||
} | ||
] | ||
} | ||
} | ||
``` | ||
|
||
Finally, run your workflow using the translated input file: | ||
The created `mymanifest.dx.json` should be used as an input file when running the workflow: | ||
```commandline | ||
dx run workflow-yyy -f mymanifest.dx.json | ||
``` | ||
|
||
`dx run workflow-yyy -f mymanifest.dx.json` | ||
|
||
|
||
## Manifest file | ||
#### Intermediate manifest file inputs and outputs | ||
|
||
Manifest files are less convenient to use as applet/workflow inputs because they must be uploaded to the platform. However, when manifest support is enabled, applet/workflow outputs are in the form of manifest files, so it is useful to understand the format. | ||
When manifest support is enabled, applet/workflow outputs which are passed from one stage to another (or to the final output | ||
stage) exist in the form of intermediate manifests. Here we describe the format of the intermediate manifest for informational purposes only. | ||
There is no need to use them as your workflow inputs, as the JSON manifest above is the recommended format. | ||
|
||
Given the above workflow, the manifest output would be: | ||
Given the above workflow, the manifest output from the `common` stage to the following stages (not shown) would be: | ||
|
||
```json | ||
{ | ||
"id": "test", | ||
"encoded": false, | ||
"id": "stage-common", | ||
"values": { | ||
"i": 1, | ||
"p": { | ||
"left": "hello", | ||
"right": { | ||
"$dnanexus_link": "file-xxx" | ||
} | ||
} | ||
"s": "hello", | ||
"f": "dx://project-aaa:file-xxx" | ||
} | ||
} | ||
``` | ||
|
||
The `id` field is optional but will always be populated in the output manfiests. The manifest may contain additional fields (`types` and `definitions`) that are only for internal use and can be ignored. | ||
|
||
To specify a manifest file as input to an applet or workflow, first upload the file to the platform and then pass it as input to the `input_manifest_files___` parameter: | ||
|
||
`dx run workflow-yyy -iinput_manifest_files___=file-zzz` | ||
|
||
Note that while `input_manifest_files___` is an array, you may only pass a single manifest file as input. | ||
The `id` field represents the ID of the stage which created the manifest output. It is optional but will always be | ||
populated in the output manifests. The manifest may contain additional `types` and `definitions` fields that are only | ||
for internal use and can be ignored. The workflow outputs are stored in the `values` field of the output manifest | ||
as a map whose keys are the names of the outputs declared in the `output` section of the WDL workflow. | ||
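As an illustration of reading such a manifest, here is a minimal Python sketch. It assumes the manifest file has already been downloaded and its text is available as a string. The function name `read_manifest_values` is hypothetical, and the sample document simply repeats the example shown here.

```python
import json


def read_manifest_values(manifest_text):
    """Return (stage_id, values) from an intermediate manifest document.

    `id` is treated as optional, as described in the text; `types` and
    `definitions`, if present, are ignored because they are internal.
    """
    manifest = json.loads(manifest_text)
    stage_id = manifest.get("id")  # None when the optional field is absent
    values = manifest["values"]    # map of output name -> value
    return stage_id, values


# Sample manifest copied from the example in this section.
example = """
{
  "encoded": false,
  "id": "stage-common",
  "values": {
    "s": "hello",
    "f": "dx://project-aaa:file-xxx"
  }
}
"""

stage_id, values = read_manifest_values(example)
```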
|
||
## Analysis outputs | ||
|
||
|
It's a minor thing but it's easier to read (now and in the future) if there are no newlines introduced.
Here and below wherever the breakline is added.
Actually, this new line here looks ugly to show the difference in versions. In the raw source it looks fine and there's no change in the rendered variant. But without this break line - the source looks like one stretched line and you have to scroll sidewise to read it. So I would really insist on having those break lines in the markdown docs from now on.
This is how it looks now:
dxCompiler/doc/ExpertOptions.md
Line 1444 in 455de1d
This is how the rest of the doc looks without new lines:
dxCompiler/doc/ExpertOptions.md
Line 1296 in 455de1d
To me second example is harder to read compared to my version
please click the links to actually see it in the raw .md