BUG: dim(X) must have a positive length #203

Open
YogiOnBioinformatics opened this issue Dec 10, 2020 · 10 comments

Comments

@YogiOnBioinformatics

Describe the bug

The xcor task fails with the following error:

Error in apply(ac, 2, function(x) sum(x * avw)) :
  dim(X) must have a positive length

OS/Platform

  • OS/Platform: Debian GNU/Linux 8 (jessie)
  • Pipeline version: 1.3.6

Input JSON file

{
    "chip.title": "P2L7S6_H3K4me3_ChIP",
    "chip.description": "",
    "chip.pipeline_type": "histone",
    "chip.paired_end": false,
    "chip.ctl_paired_end": false,
    "chip.genome_tsv": "/some_path/hg38/hg38.tsv",
    "chip.fastqs_rep1_R1": [
        "/absolute_path/201007Fra_D20-3991_NA_1.fastq.gz"
    ],
    "chip.ctl_fastqs_rep1_R1": [
        "/absolute_path/201007Fra_D20-3986_NA_1.fastq.gz"
    ]
}

Troubleshooting result


Traceback (most recent call last):
  File "/software/chip-seq-pipeline/src/encode_task_xcor.py", line 156, in <module>
    main()
  File "/software/chip-seq-pipeline/src/encode_task_xcor.py", line 144, in main
    args.chip_seq_type, args.exclusion_range_min, args.exclusion_range_max)
  File "/software/chip-seq-pipeline/src/encode_task_xcor.py", line 105, in xcor
    run_shell_cmd(cmd1)
  File "/software/chip-seq-pipeline/src/encode_lib_common.py", line 319, in run_shell_cmd
    raise Exception(err_str)
Exception: PID=1926106, PGID=1926106, RC=1
STDERR=Loading required package: caTools
Error in apply(ac, 2, function(x) sum(x * avw)) :
  dim(X) must have a positive length
Calls: get.binding.characteristics -> lapply -> FUN -> apply
Execution halted
STDOUT=################
ChIP data: 201007Fra_D20-3991_NA_1.trim_50bp.filt.no_chrM.15M.tagAlign.gz
Control data: NA
strandshift(min): -500
strandshift(step): 5
strandshift(max) 1500
user-defined peak shift NA
exclusion(min): -500
exclusion(max): 100
num parallel nodes: 2
FDR threshold: 0.01
NumPeaks Threshold: NA
Output Directory: .
narrowPeak output file name: NA
regionPeak output file name: NA
Rdata filename: NA
plot pdf filename: 201007Fra_D20-3991_NA_1.trim_50bp.filt.no_chrM.15M.cc.plot.pdf
result filename: 201007Fra_D20-3991_NA_1.trim_50bp.filt.no_chrM.15M.cc.qc
Overwrite files?: TRUE

Decompressing ChIP file
Reading ChIP tagAlign/BAM file 201007Fra_D20-3991_NA_1.trim_50bp.filt.no_chrM.15M.tagAlign.gz
opened /pool/data/cromwell-aals/cromwell-executions/chip/b744d0a2-c764-40a4-a952-fc43d2d0a1ee/call-xcor/shard-0/tmp.2813919d/RtmpwmIuJ2/201007Fra_D20-3991_NA_1.trim_50bp.filt.no_chrM.15M.tagAlign1d63dc4f08563d
done. read 16 fragments
ChIP data read length 40
[1] TRUE
Calculating peak characteristics
ln: failed to access '*.cc.plot.pdf': No such file or directory
ln: failed to access '*.cc.plot.png': No such file or directory
ln: failed to access '*.cc.qc': No such file or directory
ln: failed to access '*.cc.fraglen.txt': No such file or directory
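
The log above reports "done. read 16 fragments", which suggests the input was nearly empty by the time it reached xcor. A minimal pre-flight check can be sketched in Python, assuming the tagAlign file is gzip-compressed text with one read per line. The function names and the min_reads threshold of 1000 are hypothetical and not part of the pipeline:

```python
import gzip

def count_reads(tagalign_gz_path):
    """Count non-empty lines (one read per line) in a gzipped tagAlign file."""
    with gzip.open(tagalign_gz_path, "rt") as handle:
        return sum(1 for line in handle if line.strip())

def has_enough_reads(tagalign_gz_path, min_reads=1000):
    """Return True if the file holds at least min_reads reads.

    min_reads is an arbitrary illustrative threshold, not a pipeline default.
    """
    return count_reads(tagalign_gz_path) >= min_reads
```

Running such a check on the filtered tagAlign before xcor would flag a near-empty sample early, instead of surfacing it as an R error.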

@YogiOnBioinformatics changed the title from "dim(X) must have a positive length" to "BUG: dim(X) must have a positive length" on Dec 10, 2020
@YogiOnBioinformatics
Author

@leepc12 @akundaje hope you both are well! Just wanted to follow up on this!

@YogiOnBioinformatics
Author

Super sorry to bother again @akundaje @leepc12 😄
Just wanted to see if you got around to this.

@leepc12
Contributor

leepc12 commented Jan 15, 2021

Sorry about the late response. Can you try with the latest pipeline?

@YogiOnBioinformatics
Author

I didn't end up using this sample.
We had other replicates.

If I run into this problem again, I'll try that.

@YogiOnBioinformatics
Author

YogiOnBioinformatics commented Mar 8, 2021

@leepc12 @akundaje

Just wanted to bring this back up.
I cannot switch to a new pipeline version, since we have a large number of samples already analyzed with this one.

It seems that the bug is in run_spp.R as part of encode_task_xcor.py.

Is there any way to manually calculate the fraglen AND disable the xcor task so that it stops failing?
Even if I manually calculate fraglen, it seems xcor would still run and hence fail again.

If I CANNOT disable xcor, how can I still manually calculate the fraglen?

@leepc12
Contributor

leepc12 commented Mar 9, 2021

Define chip.fraglen as an array (for each replicate) and disable xcor with chip.enable_xcor.

{
    "chip.fraglen" : [100, 120],
    "chip.enable_xcor" : false
}
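
One way to apply these two keys to an existing input JSON, sketched in Python. The helper name add_manual_fraglen is hypothetical; chip.fraglen and chip.enable_xcor are the keys given above, and whether a particular pipeline version honors chip.enable_xcor should be checked against its chip.wdl:

```python
import json

def add_manual_fraglen(input_json, fraglens):
    """Return a copy of the input JSON dict with manual fragment lengths set.

    fraglens holds one integer per replicate, in replicate order.
    """
    updated = dict(input_json)
    updated["chip.fraglen"] = [int(x) for x in fraglens]
    updated["chip.enable_xcor"] = False
    return updated

if __name__ == "__main__":
    # Illustrative input only; a real input JSON carries many more keys.
    original = {"chip.title": "P2L7S6_H3K4me3_ChIP", "chip.paired_end": False}
    print(json.dumps(add_manual_fraglen(original, [100]), indent=4))
```

The original dict is left untouched, so the unmodified input JSON can still be written out alongside the edited one.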

@YogiOnBioinformatics
Author

2 questions.

  1. As far as I remember, there is no enable_xcor option in this pipeline, at least not in v1.3.6, which is the version I am using. Am I missing something?
  2. How would I calculate fraglen easily? Would macs2 predictd work?
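
For orientation, fragment length is commonly estimated from strand cross-correlation: plus-strand read starts and minus-strand read ends tend to be offset by roughly the fragment length. A simplified Python sketch of that idea follows, assuming a gzipped tagAlign file in BED6 layout (chrom, start, end, name, score, strand). It counts co-occurring positions per shift instead of computing a normalized correlation, so it is not what run_spp.R computes, and the function names and the shift range are hypothetical:

```python
import gzip
from collections import Counter, defaultdict

def read_strand_positions(tagalign_gz_path):
    """Collect fragment-boundary positions per chromosome and strand."""
    plus = defaultdict(Counter)
    minus = defaultdict(Counter)
    with gzip.open(tagalign_gz_path, "rt") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 6:
                continue  # skip malformed or empty lines
            chrom, start, end, strand = fields[0], int(fields[1]), int(fields[2]), fields[5]
            if strand == "+":
                plus[chrom][start] += 1   # left boundary of the fragment
            elif strand == "-":
                minus[chrom][end] += 1    # exclusive right boundary of the fragment
    return plus, minus

def estimate_fraglen(plus, minus, min_shift=50, max_shift=500):
    """Return the shift with the highest plus/minus co-occurrence, or None."""
    best_shift, best_score = None, 0
    for shift in range(min_shift, max_shift + 1):
        score = 0
        for chrom, plus_counts in plus.items():
            minus_counts = minus.get(chrom)
            if not minus_counts:
                continue
            for pos, n in plus_counts.items():
                score += n * minus_counts.get(pos + shift, 0)
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift
```

The nested loops are slow in pure Python, so this is only practical on a small subsample of reads. With very few reads, such as the 16 fragments in the log above, no shift may score above zero and the function returns None.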

@leepc12
Contributor

leepc12 commented Mar 9, 2021

Define chip.fraglen first in your input JSON. You may need to modify chip.wdl.

Please delete these lines
https://github.com/ENCODE-DCC/chip-seq-pipeline2/blob/v1.3.6/chip.wdl#L582-L596
https://github.com/ENCODE-DCC/chip-seq-pipeline2/blob/v1.3.6/chip.wdl#L1146-L1147

Please modify the following line
https://github.com/ENCODE-DCC/chip-seq-pipeline2/blob/v1.3.6/chip.wdl#L601

to else 0

Please let me know if this works.

@YogiOnBioinformatics
Author

@leepc12 I figured out the issue I was having.
I did not end up implementing this for the following reason.

The FASTQ data I was analyzing was EXTREMELY small (a few KB in file size) due to a mistake from a collaborator.

If I run into this issue in the future with a normal FASTQ file, I will implement this and see if it works.

Thanks so much for your help!
Collaborator mistakes are the worst.

@leepc12
Contributor

leepc12 commented Mar 11, 2021

Yep, thanks. Please let me know whether the above implementation works; if it does, feel free to close the issue.
