Merge pull request #1 from CancerCollaboratory/feature/cwl1
Feature/cwl1
agduncan94 authored Sep 14, 2016
2 parents 9c8fb9b + b0826c1 commit bf66d35
Showing 2 changed files with 148 additions and 167 deletions.
313 changes: 146 additions & 167 deletions Dockstore.cwl
@@ -2,225 +2,204 @@

class: CommandLineTool

description: |
Move annotations from one assembly to another

usage:
liftOver oldFile map.chain newFile unMapped
oldFile and newFile are in bed format by default, but can be in GFF and
maybe eventually others with the appropriate flags below.
The map.chain file has the old genome as the target and the new genome
as the query.

***********************************************************************
WARNING: liftOver was only designed to work between different
assemblies of the same organism. It may not do what you want
if you are lifting between different organisms. If there has
been a rearrangement in one of the species, the size of the
region being mapped may change dramatically after mapping.
***********************************************************************

dct:contributor:
foaf:name: Andy Yang
foaf:mbox: "mailto:ayang@oicr.on.ca"

foaf:mbox: mailto:ayang@oicr.on.ca
dct:creator:
"@id": "http://orcid.org/0000-0001-9102-5681"
foaf:name: "Andrey Kartashov"
foaf:mbox: "mailto:Andrey.Kartashov@cchmc.org"

dct:description: "Developed at Cincinnati Children’s Hospital Medical Center for the CWL consortium http://commonwl.org/ Original URL: https://github.com/common-workflow-language/workflows"

cwlVersion: draft-3
'@id': http://orcid.org/0000-0001-9102-5681
foaf:name: Andrey Kartashov
foaf:mbox: mailto:Andrey.Kartashov@cchmc.org
dct:description: 'Developed at Cincinnati Children’s Hospital Medical Center for the
CWL consortium http://commonwl.org/ Original URL: https://github.com/common-workflow-language/workflows'
cwlVersion: v1.0

requirements:
- class: DockerRequirement
dockerPull: "quay.io/cancercollaboratory/dockstore-tool-liftover"

- class: DockerRequirement
dockerPull: quay.io/cancercollaboratory/dockstore-tool-liftover:1.0
inputs:
- id: "#oldFile"
type: File
inputBinding:
position: 2

- id: "#mapChain"
type: File
description: |
The map.chain file has the old genome as the target and the new genome
as the query.
inputBinding:
position: 3

- id: "#newFile"
type: string
inputBinding:
position: 4

- id: "#unMapped"
type: string
inputBinding:
position: 5

- id: "#gff"
type: ["null",boolean]
description: |
File is in gff/gtf format. Note that the gff lines are converted
separately. It would be good to have a separate check after this
that the lines that make up a gene model still make a plausible gene
after liftOver
genePred:
type: boolean?
inputBinding:
position: 1
prefix: "-gff"

- id: "#genePred"
type: ["null",boolean]
description: |
prefix: -genePred
doc: |
File is in genePred format
multiple:
type: boolean?
inputBinding:
position: 1
prefix: "-genePred"

- id: "#sample"
type: ["null",boolean]
description: |
File is in sample format
prefix: -multiple
doc: |
Allow multiple output regions
tab:
type: boolean?
inputBinding:
position: 1
prefix: "-sample"

- id: "#bedPlus"
type: ["null",int]
description: |
=N - File is bed N+ format
prefix: -tab
ends:
type: int?
inputBinding:
separate: false
position: 1
prefix: "-bedPlus="

- id: "#positions"
type: ["null",boolean]
description: |
prefix: -ends=
doc: |
=N - Lift the first and last N bases of each record and combine the
result. This is useful for lifting large regions like BAC end pairs.
positions:
type: boolean?
inputBinding:
position: 1
prefix: -positions
doc: |
File is in browser "position" format
chainTable:
type: string?
inputBinding:
position: 1
prefix: "-positions"

- id: "#hasBin"
type: ["null",boolean]
description: |
File has bin value (used only with -bedPlus)
prefix: -chainTable
doc: |
Used with -multiple, format is db.tablename, to extend chains from net.
fudgeThick:
type: boolean?
inputBinding:
position: 1
prefix: "-hasBin"

- id: "#minMatch"
type: ["null",int]
description: |
-minMatch=0.N Minimum ratio of bases that must remap. Default 0.95
prefix: -fudgeThick
doc: |
(bed 12 or 12+ only) If thickStart/thickEnd is not mapped,
use the closest mapped base. Recommended if using
-minBlocks.
hasBin:
type: boolean?
inputBinding:
separate: false
position: 1
prefix: "-minMatch="

- id: "#tab"
type: ["null",boolean]
prefix: -hasBin
doc: |
File has bin value (used only with -bedPlus)
minChainQ:
type: int?
inputBinding:
position: 1
prefix: "-tab"

- id: "#pslT"
type: ["null",boolean]
description: |
File is in psl format, map target side only
prefix: -minChainQ
doc: |
Minimum chain size in target/query, when mapping
to multiple output regions (default 0, 0)
sample:
type: boolean?
inputBinding:
position: 1
prefix: "-pslT"

- id: "#ends"
type: ["null",int]
description: |
=N - Lift the first and last N bases of each record and combine the
result. This is useful for lifting large regions like BAC end pairs.
prefix: -sample
doc: |
File is in sample format
minChainT:
type: int?
inputBinding:
separate: false
position: 1
prefix: "-ends="

- id: "#minBlocks"
type: ["null",int]
description: |
.N Minimum ratio of alignment blocks or exons that must map
(default 1.00)
prefix: -minChainT
doc: |
Minimum chain size in target/query, when mapping
to multiple output regions (default 0, 0)
minSizeQ:
type: int?
inputBinding:
separate: false
position: 1
prefix: "-minBlocks="
prefix: -minSizeQ
doc: |
Min matching region size in query with -multiple.
oldFile:
type: File
inputBinding:
position: 2

- id: "#fudgeThick"
type: ["null",boolean]
description: |
(bed 12 or 12+ only) If thickStart/thickEnd is not mapped,
use the closest mapped base. Recommended if using
-minBlocks.
unMapped:
type: string
inputBinding:
position: 1
prefix: "-fudgeThick"
position: 5

- id: "#multiple"
type: ["null",boolean]
description: |
Allow multiple output regions
mapChain:
type: File
inputBinding:
position: 1
prefix: "-multiple"
position: 3

- id: "#minChainT"
type: ["null",int]
description: |
Minimum chain size in target/query, when mapping
to multiple output regions (default 0, 0)
doc: |
The map.chain file has the old genome as the target and the new genome
as the query.
pslT:
type: boolean?
inputBinding:
position: 1
prefix: "-minChainT"

- id: "#minChainQ"
type: ["null",int]
description: |
Minimum chain size in target/query, when mapping
to multiple output regions (default 0, 0)
prefix: -pslT
doc: |
File is in psl format, map target side only
minMatch:
type: float?
inputBinding:
separate: false
position: 1
prefix: "-minChainQ"

- id: "#minSizeQ"
type: ["null",int]
description: |
Min matching region size in query with -multiple.
prefix: -minMatch=
doc: |
-minMatch=0.N Minimum ratio of bases that must remap. Default 0.95
gff:
type: boolean?
inputBinding:
position: 1
prefix: "-minSizeQ"

- id: "#chainTable"
type: ["null",string]
description: |
Min matching region size in query with -multiple.
prefix: -gff
doc: |
File is in gff/gtf format. Note that the gff lines are converted
separately. It would be good to have a separate check after this
that the lines that make up a gene model still make a plausible gene
after liftOver
minBlocks:
type: float?
inputBinding:
separate: false
position: 1
prefix: "-chainTable"

prefix: -minBlocks=
doc: |
.N Minimum ratio of alignment blocks or exons that must map
(default 1.00)
bedPlus:
type: int?
inputBinding:
separate: false
position: 1
prefix: -bedPlus=
doc: |
=N - File is bed N+ format
newFile:
type: string
inputBinding:
position: 4

outputs:
- id: "#output"
output:
type: File
description: "The sorted file"
outputBinding:
glob: $(inputs.newFile)

- id: "#unMappedFile"
doc: The lifted-over annotation file (newFile)
unMappedFile:
type: File
description: "The sorted file"
outputBinding:
glob: $(inputs.unMapped)


baseCommand: ["liftOver"]
doc: The records that could not be mapped (unMapped)
baseCommand: [liftOver]
doc: |
Move annotations from one assembly to another

usage:
liftOver oldFile map.chain newFile unMapped
oldFile and newFile are in bed format by default, but can be in GFF and
maybe eventually others with the appropriate flags below.
The map.chain file has the old genome as the target and the new genome
as the query.

***********************************************************************
WARNING: liftOver was only designed to work between different
assemblies of the same organism. It may not do what you want
if you are lifting between different organisms. If there has
been a rearrangement in one of the species, the size of the
region being mapped may change dramatically after mapping.
***********************************************************************

2 changes: 2 additions & 0 deletions readme.md
@@ -12,3 +12,5 @@ cwltool (1.0.20160712154127)
schema-salad (1.14.20160708181155)
setuptools (25.1.6)
```

Tested on Dockstore version 1.0-rc.2
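
A job file along the following lines could be used to try the v1.0 tool description with cwltool. This is a minimal sketch and not part of the commit: the file names (`liftover-job.yml`, `input.bed`, `hg18ToHg19.over.chain`, `lifted.bed`, `unmapped.bed`) are assumptions chosen for illustration, and only the input ids are taken from `Dockstore.cwl`.

```yaml
# liftover-job.yml: hypothetical input object for Dockstore.cwl (cwlVersion v1.0)

# Required positional inputs (positions 2 to 5 in the tool description)
oldFile:
  class: File
  path: input.bed                # assumed: a local BED file to lift over
mapChain:
  class: File
  path: hg18ToHg19.over.chain    # assumed: a chain file for the two assemblies
newFile: lifted.bed              # name of the lifted-over output; matched by the "output" glob
unMapped: unmapped.bed           # name of the unmapped-records output; matched by "unMappedFile"

# Optional flags (position 1); inputs left out of the job file are not emitted
bedPlus: 6                       # prefix "-bedPlus=" with separate: false yields -bedPlus=6
tab: true                        # boolean inputs emit only their prefix, here -tab
```

Assuming the job file is saved next to the tool description, `cwltool Dockstore.cwl liftover-job.yml` should build a `liftOver` command in which the position 1 options (`-bedPlus=6`, `-tab`) come before the four positional arguments.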
