Failed to connect to Cromwell server; unsupported backend: sge #200

Open
ambeys opened this issue Nov 18, 2020 · 3 comments

ambeys commented Nov 18, 2020

Describe the bug

I am trying to run Caper for the ChIP-seq pipeline on transcription factor data. The dry run suggests that caper run is fine, but with either the example file or my own input.json it fails with "Failed to connect to Cromwell server", and cromwell.out reports an unsupported backend. In SGE mode a job starts properly and then fails, so I suspect a problem with the SGE configuration for Cromwell; we may need to add some options for it to work correctly on our cluster. I tried it on a Mac and a colleague tried it on Windows, and both of us hit the same error. I went through all the documentation and tried several fixes, but it still does not run. We have been stuck on this for three weeks.

OS/Platform

  • OS/Platform: [e.g. macOS Catalina (10.15.7), SGE]
  • Conda: loaded with module load anaconda2/4.3.1; conda --version reports conda 4.6.7
  • Pipeline version: [e.g. v1.6.0]
  • Caper version: [e.g. v1.1.0]

Caper configuration file

Paste contents of /home/ambey/.caper/default.conf.

backend=sge
sge-pe=openmp

# Hashing strategy for call-caching (3 choices)
# This parameter is for local (local/slurm/sge/pbs) backend only.
# This is important for call-caching,
# which means re-using outputs from previous/failed workflows.
# Cache will miss if different strategy is used.
# "file" method has been default for all old versions of Caper<1.0.
# "path+modtime" is a new default for Caper>=1.0,
#   file: use md5sum hash (slow).
#   path: use path.
#   path+modtime: use path and modification time.
local-hash-strat=path+modtime

# Local directory for localized files and Cromwell's intermediate files
# If not defined, Caper will make .caper_tmp/ on local-out-dir or CWD.
# /tmp is not recommended here since Caper store all localized data files
# on this directory (e.g. input FASTQs defined as URLs in input JSON).
local-loc-dir=/home/ambey/mnt/Genoma/amedina/ambey/CAPER/.caper_tmp
local_out_dir=/home/ambey/mnt/Genoma/amedina/ambey/CAPER/caper_out
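
The config comments above describe three hashing strategies for call-caching. As a rough illustration of how they differ, here is a minimal Python sketch. It is not Cromwell's or Caper's actual implementation; the function name `cache_key` and the way the key is built are assumptions made for this example only.

```python
import hashlib
import os


def cache_key(path: str, strategy: str) -> str:
    """Return an illustrative cache key for `path` (hypothetical helper).

    strategy is one of "file", "path", "path+modtime", mirroring the
    three choices named in the config comments.
    """
    abs_path = os.path.abspath(path)
    if strategy == "file":
        # Hash the file contents: robust to renames, but slow for big files.
        md5 = hashlib.md5()
        with open(abs_path, "rb") as fh:
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                md5.update(chunk)
        return md5.hexdigest()
    if strategy == "path":
        # Hash only the path: fast, but blind to content changes.
        return hashlib.md5(abs_path.encode()).hexdigest()
    if strategy == "path+modtime":
        # Hash path plus modification time: fast, and notices rewrites.
        stamp = f"{abs_path}:{os.stat(abs_path).st_mtime_ns}"
        return hashlib.md5(stamp.encode()).hexdigest()
    raise ValueError(f"unknown strategy: {strategy}")
```

With this sketch, two copies of the same file at different paths share a key only under "file", and rewriting a file in place changes its key under "file" and "path+modtime" but not under "path". That is why switching strategy between runs makes the cache miss.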

Input JSON file

I am using the example JSON file: chip-seq-pipeline2/example_input_json/ENCSR000DYI_subsampled_chr19_only.json
{
    "chip.pipeline_type" : "tf",
    "chip.genome_tsv" : "https://storage.googleapis.com/encode-pipeline-genome-data/genome_tsv/v3/hg38_chr19_chrM.tsv",
    "chip.fastqs_rep1_R1" : ["https://storage.googleapis.com/encode-pipeline-test-samples/encode-chip-seq-pipeline/ENCSR000DYI/fastq_subsampled/rep1.subsampled.25.fastq.gz"
    ],
    "chip.fastqs_rep2_R1" : ["https://storage.googleapis.com/encode-pipeline-test-samples/encode-chip-seq-pipeline/ENCSR000DYI/fastq_subsampled/rep2.subsampled.20.fastq.gz"
    ],
    "chip.ctl_fastqs_rep1_R1" : ["https://storage.googleapis.com/encode-pipeline-test-samples/encode-chip-seq-pipeline/ENCSR000DYI/fastq_subsampled/ctl1.subsampled.25.fastq.gz"
    ],
    "chip.ctl_fastqs_rep2_R1" : ["https://storage.googleapis.com/encode-pipeline-test-samples/encode-chip-seq-pipeline/ENCSR000DYI/fastq_subsampled/ctl2.subsampled.25.fastq.gz"
    ],
    "chip.paired_end" : false,
    "chip.title" : "ENCSR000DYI (subsampled 1/25, chr19_chrM only)",
    "chip.description" : "CEBPB ChIP-seq on human A549 produced by the Snyder lab"
}

Command used

$ qlogin
$ module load anaconda2/4.3.1
$ source activate encode-chip-seq-pipeline
$ caper init sge
# modified `/home/ambey/.caper/default.conf`
$ caper run /mnt/Genoma/amedina/ambey/chip-seq-pipeline2/chip.wdl -i /mnt/Genoma/amedina/ambey/chip-seq-pipeline2/example_input_json/ENCSR000DYI_subsampled_chr19_only.json

Troubleshooting result

$ caper troubleshoot

2020-11-18 15:05:33,428|caper.server_heartbeat|ERROR| Found a heartbeat file but it has been expired (> timeout). ~/.caper/default_server_heartbeat
Traceback (most recent call last):
  File "/cm/shared/apps/anaconda2/4.3.1/envs/encode-chip-seq-pipeline/bin/caper", line 13, in <module>
    main()
  File "/cm/shared/apps/anaconda2/4.3.1/envs/encode-chip-seq-pipeline/lib/python3.6/site-packages/caper/cli.py", line 504, in main
    client(parsed_args)
  File "/cm/shared/apps/anaconda2/4.3.1/envs/encode-chip-seq-pipeline/lib/python3.6/site-packages/caper/cli.py", line 269, in client
    subcmd_troubleshoot(c, args)
  File "/cm/shared/apps/anaconda2/4.3.1/envs/encode-chip-seq-pipeline/lib/python3.6/site-packages/caper/cli.py", line 454, in subcmd_troubleshoot
    wf_ids_or_labels=args.wf_id_or_label, embed_subworkflow=True
  File "/cm/shared/apps/anaconda2/4.3.1/envs/encode-chip-seq-pipeline/lib/python3.6/site-packages/caper/caper_client.py", line 129, in metadata
    embed_subworkflow=embed_subworkflow,
  File "/cm/shared/apps/anaconda2/4.3.1/envs/encode-chip-seq-pipeline/lib/python3.6/site-packages/caper/cromwell_rest_api.py", line 144, in get_metadata
    workflows = self.find(workflow_ids, labels)
  File "/cm/shared/apps/anaconda2/4.3.1/envs/encode-chip-seq-pipeline/lib/python3.6/site-packages/caper/cromwell_rest_api.py", line 226, in find
    CromwellRestAPI.ENDPOINT_WORKFLOWS, params=CromwellRestAPI.PARAMS_WORKFLOWS
  File "/cm/shared/apps/anaconda2/4.3.1/envs/encode-chip-seq-pipeline/lib/python3.6/site-packages/caper/cromwell_rest_api.py", line 299, in __request_get
    ) from None
Exception: Failed to connect to Cromwell server. req=GET, url=http://localhost:8000/api/workflows/v1/query
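
The traceback ends in a failed GET to http://localhost:8000, which is the kind of error to expect when nothing is listening on that port. A minimal sketch for checking this with the Python standard library follows; the function name `server_is_listening` is hypothetical, and the default host and port are simply the ones shown in the traceback URL.

```python
import socket


def server_is_listening(host: str = "localhost", port: int = 8000,
                        timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when no process accepts connections on that address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Listening on localhost:8000:", server_is_listening())
```

A False result here only says that no process accepts TCP connections on that port; it does not say anything about why the workflow itself failed.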

cromwell.out.txt

leepc12 (Contributor) commented Nov 18, 2020

Check if SGE Parallel Environment openmp exists.

$ qconf -sp

Post caper run's STDOUT.

caper troubleshoot is for server mode only.
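
One way to script the parallel environment check is sketched below. It assumes `qconf` is on PATH and that `qconf -spl` prints one PE name per line, which is typical of Grid Engine installations (`qconf -sp <name>` shows the settings of a single PE). The helper names `parse_pe_list` and `pe_exists` are hypothetical.

```python
import subprocess
from typing import List


def parse_pe_list(qconf_output: str) -> List[str]:
    """Split `qconf -spl` output into PE names, ignoring blank lines."""
    return [line.strip() for line in qconf_output.splitlines() if line.strip()]


def pe_exists(name: str) -> bool:
    """Return True if `name` is among the PEs reported by `qconf -spl`.

    Assumption: qconf is installed and on PATH (only true on an SGE host).
    """
    result = subprocess.run(
        ["qconf", "-spl"], capture_output=True, text=True, check=True
    )
    return name in parse_pe_list(result.stdout)


if __name__ == "__main__":
    # "openmp" is the value of sge-pe in the config posted above.
    print("openmp PE found:", pe_exists("openmp"))
```

The name passed to `pe_exists` should match the `sge-pe` value in the Caper configuration file exactly.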

leepc12 (Contributor) commented Nov 18, 2020

Also ignore unsupported warnings in cromwell.out.

ambeys (Author) commented Nov 19, 2020

$ qconf -sp

Yes, openmp is listed as a PE.

These were the last lines of the caper run output. The stdout file it points to does not exist.
2020-11-18 14:51:19,399|caper.cromwell|INFO| Workflow failed. Auto-troubleshooting...

  • Started troubleshooting workflow: id=2668d99e-f85e-4481-986e-52eb1f5e880a, status=Failed
  • Found failures JSON object.
    [
        {
            "message": "Workflow failed",
            "causedBy": [
                {
                    "causedBy": [],
                    "message": "Job chip.read_genome_tsv:NA:1 exited with return code -1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details."
                }
            ]
        }
    ]
  • Recursively finding failures in calls (tasks)...

==== NAME=chip.read_genome_tsv, STATUS=Failed, PARENT=
SHARD_IDX=-1, RC=-1, JOB_ID=15841
START=2020-11-18T20:50:59.365Z, END=2020-11-18T20:51:03.723Z
STDOUT=/home/ambey/mnt/Genoma/amedina/ambey/CAPER/caper_out/chip/2668d99e-f85e-4481-986e-52eb1f5e880a/call-read_genome_tsv/execution/stdout
STDERR=/home/ambey/mnt/Genoma/amedina/ambey/CAPER/caper_out/chip/2668d99e-f85e-4481-986e-52eb1f5e880a/call-read_genome_tsv/execution/stderr
2020-11-18 14:51:19,400|caper.nb_subproc_thread|ERROR| Subprocess failed. returncode=1
2020-11-18 14:51:19,400|caper.cli|ERROR| Check stdout/stderr in /mnt/Genoma/amedina/ambey/chip-seq-pipeline2/ExampleRun/cromwell.out
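
The failures object in this output is nested: the top-level "Workflow failed" entry wraps the actual cause in `causedBy`. A minimal sketch for pulling out the innermost messages is shown below. The function name `leaf_failure_messages` is hypothetical, and the JSON is copied from the log above.

```python
import json
from typing import Any, Dict, List

FAILURES_JSON = """
[
    {
        "message": "Workflow failed",
        "causedBy": [
            {
                "causedBy": [],
                "message": "Job chip.read_genome_tsv:NA:1 exited with return code -1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details."
            }
        ]
    }
]
"""


def leaf_failure_messages(failures: List[Dict[str, Any]]) -> List[str]:
    """Collect messages of failures that have no further `causedBy` entries."""
    messages: List[str] = []
    for failure in failures:
        causes = failure.get("causedBy") or []
        if causes:
            # Descend into nested causes; the wrapper message is skipped.
            messages.extend(leaf_failure_messages(causes))
        else:
            messages.append(failure.get("message", ""))
    return messages


if __name__ == "__main__":
    for msg in leaf_failure_messages(json.loads(FAILURES_JSON)):
        print(msg)
```

For this log the only leaf message names the task chip.read_genome_tsv, so the STDERR path listed for that task is the next place to look.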
