We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cutadapt command in snakemake templates (for example cemba_data/mapping/Snakefile_template/mc.Snakefile) has a bug
rule trim_r1: input: "fastq/{cell_id}-R1.fq.gz" output: fq=temp("fastq/{cell_id}-R1.trimmed.fq.gz"), stats=temp("fastq/{cell_id}-R1.trimmed.stats.tsv") threads: 2 shell: "cutadapt --report=minimal -a {r1_adapter} {input} 2> {output.stats} | " "cutadapt --report=minimal -O 6 -q 20 -u {r1_left_cut} -u -{r1_right_cut} -m 30 " "-o {output.fq} - >> {output.stats}"
rule trim_r2: input: "fastq/{cell_id}-R2.fq.gz" output: fq=temp("fastq/{cell_id}-R2.trimmed.fq.gz"), stats=temp("fastq/{cell_id}-R2.trimmed.stats.tsv") threads: 2 shell: "cutadapt --report=minimal -a {r2_adapter} {input} 2> {output.stats} | " "cutadapt --report=minimal -O 6 -q 20 -u {r2_left_cut} -u -{r2_right_cut} -m 30 " "-o {output.fq} - >> {output.stats}"
"cutadapt --report=minimal -a {r1_adapter} {input} 2> {output.stats} | "
then output is piped to another instance of cutadapt:
"cutadapt --report=minimal -O 6 -q 20 -u {r2_left_cut} -u -{r2_right_cut} -m 30 "
that has -O 6 option for specifying minimal overlap of adaptor and read
But -O option is not taken into account since adaptor trimming happened already at the first execution of cutadapt
Thus reads are not trimmed from adaptor if there is 6 bases overlap but only if there is 3 bases overlap
start=$(date +%s) echo "Run cutadapt" /opt/conda/bin/cutadapt \ -a R1Adapter=~{r1_adapter} \ -A R2Adapter=~{r2_adapter} \ --report=minimal -O 6 -q 20 \ -u ~{r1_left_cut} \ -u -~{r1_right_cut} \ -U ~{r2_left_cut} \ -U -~{r2_right_cut} \ -Z \ -m ${min_read_length}:${min_read_length} \ --pair-filter 'both' \ -o ${sample_id}-R1_trimmed.fq.gz \ -p ${sample_id}-R2_trimmed.fq.gz \ ${sample_id}-R1_sorted.fq ${sample_id}-R2_sorted.fq \ > ${sample_id}.trimmed.stats.txt end=$(date +%s) elapsed=$((end - start)) echo "Elapsed time to run cutadapt: $elapsed seconds"
The text was updated successfully, but these errors were encountered:
No branches or pull requests
cutadapt command in snakemake templates (for example cemba_data/mapping/Snakefile_template/mc.Snakefile) has a bug
Trim reads
rule trim_r1:
input:
"fastq/{cell_id}-R1.fq.gz"
output:
fq=temp("fastq/{cell_id}-R1.trimmed.fq.gz"),
stats=temp("fastq/{cell_id}-R1.trimmed.stats.tsv")
threads:
2
shell:
"cutadapt --report=minimal -a {r1_adapter} {input} 2> {output.stats} | "
"cutadapt --report=minimal -O 6 -q 20 -u {r1_left_cut} -u -{r1_right_cut} -m 30 "
"-o {output.fq} - >> {output.stats}"
rule trim_r2:
input:
"fastq/{cell_id}-R2.fq.gz"
output:
fq=temp("fastq/{cell_id}-R2.trimmed.fq.gz"),
stats=temp("fastq/{cell_id}-R2.trimmed.stats.tsv")
threads:
2
shell:
"cutadapt --report=minimal -a {r2_adapter} {input} 2> {output.stats} | "
"cutadapt --report=minimal -O 6 -q 20 -u {r2_left_cut} -u -{r2_right_cut} -m 30 "
"-o {output.fq} - >> {output.stats}"
Adaptor trimming is defined here:
"cutadapt --report=minimal -a {r1_adapter} {input} 2> {output.stats} | "
then output is piped to another instance of cutadapt:
"cutadapt --report=minimal -O 6 -q 20 -u {r2_left_cut} -u -{r2_right_cut} -m 30 "
that has -O 6 option for specifying minimal overlap of adaptor and read
But -O option is not taken into account since adaptor trimming happened already at the first execution of cutadapt
Thus reads are not trimmed from adaptor if there is 6 bases overlap but only if there is 3 bases overlap
Broad institute version of this pipeline (https://broadinstitute.github.io/warp/docs/Pipelines/snm3C/README#snm3c-installation) does not have this problem:
The text was updated successfully, but these errors were encountered: