Hi ruffus team,

I'm using the drmaa wrapper to submit and run jobs on an SGE cluster, and I keep hitting the communication exception shown in the traceback below, which I've been working to resolve (related issue: aws/aws-parallelcluster#1592). Has the ruffus team encountered this error? If not, is there a resubmit/retry feature that is ready to use? Although it isn't explicitly documented, the run_job function appears to take a resubmit parameter.
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] File "/shared/amgenesis/helpers.py", line 126, in run
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] cmdline.run (options, logger=logger_proxy, multithread = options.jobs, exceptions_terminate_immediately = True)
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] File "/home/ec2-user/anaconda3/lib/python3.7/site-packages/ruffus/cmdline.py", line 834, in run
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] **appropriate_options)
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] File "/home/ec2-user/anaconda3/lib/python3.7/site-packages/ruffus/task.py", line 5424, in pipeline_run
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] raise job_errors
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] ruffus.ruffus_exceptions.RethrownJobError:
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] Original exception:
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] Exception #1
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] 'drmaa.errors.DrmCommunicationException(code 2: failed receiving gdi request response for mid=65535 (can't send response for this message id - protocol error).)' raised in ...
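In case it helps frame the question, the kind of retry behaviour I have in mind could be approximated outside ruffus with a generic wrapper like the one below. This is only a minimal sketch: `retry_call` and its parameters are hypothetical names of my own, not part of ruffus or drmaa, and it does not use the undocumented resubmit parameter, whose behaviour I have not verified.

```python
import time


def retry_call(func, *args, retry_on=(Exception,), max_attempts=3,
               base_delay=1.0, sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying if it raises one of `retry_on`.

    Hypothetical helper (not a ruffus or drmaa API). Waits base_delay,
    then 2 * base_delay, and so on between attempts.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error unchanged
            # Exponential backoff before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))


# Intended use on the cluster (assumes drmaa and ruffus are installed;
# the command string and session are placeholders):
#
#   import drmaa
#   from ruffus.drmaa_wrapper import run_job
#   retry_call(run_job, cmd_str="...", drmaa_session=session,
#              retry_on=(drmaa.errors.DrmCommunicationException,))
```

One caveat with this approach: if the communication error occurs after the scheduler has already accepted the job, a retry could submit the same job twice, so it is probably only safe for idempotent commands.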