Merge pull request #44 from USDA-ARS-GBRU/ITSxpressV2_flankingregions_trim_gh_issue

Itsxpress v2 flankingregions trim gh issue
seina001 authored Mar 21, 2024
2 parents 0de1542 + bd5c0be commit 60c2803
Showing 8 changed files with 319 additions and 290 deletions.
10 changes: 9 additions & 1 deletion changelog.md
@@ -1,3 +1,11 @@
2.0.2 (2024-3-20)
------------------
- Fixed a bug where the 3' end of the ITS region was not being trimmed from both forward and reverse reads if the read extended past the ITS region. This was due to the trimming being done at the start of both forward and reverse reads and not at the end of each read. Thus, if the read overlapped the opposite end of the ITS region, part of the conserved region would still be found on the ends of the forward and reverse reads. This was fixed by trimming to just the ITS region for both forward and reverse reads. This bug did not affect the results of ASV calling with Dada2 because Dada2 ignored sequence beyond the ITS region. This fix will make the output more consistent with expectation.

- Fixed a bug for submodule logging, where submodules were not logging to the main log file. This was fixed by passing the log file to the submodules and having them write to the same log file. This issue was introduced in version 2.0.0.

- Added unit test to confirm that the 3' end of the ITS region is being trimmed from both forward and reverse reads.

2.0.1 (2023-11-07)
------------------
Fix single-end logic bug, which looked for a reverse read file even if single-end reads were provided because the single_end flag wasn't indicated by user.
@@ -89,4 +97,4 @@ New Features:
-----------------
- Fixed an indexing error causing ITS trimming to be off by 1 base.
- Fixed error when raising file not found exception
-- removed old readme
+- removed old readme
20 changes: 19 additions & 1 deletion itsxpress/Dedup.py
@@ -1,4 +1,5 @@
 import logging
+logger = logging.getLogger(__name__)
 import gzip
 import pyzstd as zstd
 import os
@@ -97,7 +98,24 @@ def _map_func(ziprecord):
     repseq = self.matchdict[record1.id]
     start, stop, tlen = itspos.get_position(repseq)
     r2start = tlen - stop
-    return record1[start:], record2[r2start:]
+    r2end = tlen - start #calculate end of R2
+    try:
+        if stop > tlen:
+            record1_return = record1[start:]
+        elif stop <= tlen:
+            record1_return = record1[start:stop]
+        else:
+            raise ValueError("An error occurred when trimming the forward read of {}".format(record1.id))
+        if r2end > tlen:
+            record2_return = record2[r2start:]
+        elif r2end <= tlen:
+            record2_return = record2[r2start:r2end]
+        else:
+            raise ValueError("An error occurred when trimming the reverse read of {}".format(record2.id))
+    except ValueError as e:
+        logging.exception(e)
+        raise e
+    return record1_return, record2_return

 def _split_gen(gen):
     gen_a, gen_b = tee(gen, 2)
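
The coordinate arithmetic in the hunk above can be illustrated with a minimal sketch. It assumes, as the variable names suggest, that `start` and `stop` are the ITS boundaries on the merged sequence of length `tlen`, and that the reverse read is the reverse complement of the merged sequence read from its 3' end. Plain strings stand in for the sequence records, and `revcomp`, `trim_pair`, and the toy sequences are hypothetical names used only for illustration; this is not the ITSxpress implementation.

```python
# Complement table for unambiguous DNA bases (illustrative only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def trim_pair(read1, read2, start, stop, tlen):
    """Trim both reads to the ITS region only.

    start, stop: ITS boundaries on the merged sequence (assumed 0-based, half-open).
    tlen: length of the merged sequence.
    """
    r2start = tlen - stop  # ITS 3' end, expressed in reverse-read coordinates
    r2end = tlen - start   # ITS 5' end, expressed in reverse-read coordinates
    # Python slicing clamps an end index that exceeds the read length,
    # so a read that stops inside the ITS region is returned up to its last base.
    return read1[start:stop], read2[r2start:r2end]


# Toy amplicon: 4-base flank, 10-base "ITS", 4-base flank.
left_flank, its, right_flank = "TTTT", "ACGTACGTAC", "GGGG"
merged = left_flank + its + right_flank
start, stop, tlen = len(left_flank), len(left_flank) + len(its), len(merged)

# Both reads are 16 bases long, so each runs 2 bases into the opposite flank.
read1 = merged[:16]
read2 = revcomp(merged)[:16]

# Pre-fix behaviour: only the start of each read is trimmed.
old_r1, old_r2 = read1[start:], read2[tlen - stop:]
# Post-fix behaviour: both ends are trimmed.
new_r1, new_r2 = trim_pair(read1, read2, start, stop, tlen)
```

In this toy case the start-only slices each keep two bases of the opposite flank, while `trim_pair` returns exactly the ITS sequence for the forward read and its reverse complement for the reverse read.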
1 change: 1 addition & 0 deletions itsxpress/ITSposition.py
@@ -1,4 +1,5 @@
 import logging
+logger = logging.getLogger(__name__)
 
 class ItsPosition:
     """Class for ITS positional information derived from hmmsearch domtable files.
1 change: 1 addition & 0 deletions itsxpress/SeqSample.py
@@ -1,5 +1,6 @@
 import os
 import logging
+logger = logging.getLogger(__name__)
 import tempfile
 import subprocess

1 change: 1 addition & 0 deletions itsxpress/SeqSamplePaired.py
@@ -4,6 +4,7 @@

 import subprocess
 import logging
+logger = logging.getLogger(__name__)
 import os
 import pyzstd as zstd
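
The `logger = logging.getLogger(__name__)` line added to each submodule relies on the standard library's logger hierarchy: a record emitted by a child logger propagates to handlers attached to its parent. The following is a minimal sketch of that mechanism, not the ITSxpress main-module configuration, which is not shown in this commit. The package name `itsxpress_demo`, the helper `configure_main_log`, and the log format are assumptions made for illustration.

```python
import logging
import os
import tempfile

# Hypothetical stand-in for a submodule; in a real module this would be
# logging.getLogger(__name__), which yields a dotted name such as "package.Dedup".
submodule_logger = logging.getLogger("itsxpress_demo.Dedup")


def submodule_work():
    """Emit a record from the submodule logger."""
    submodule_logger.info("trimming reads")


def configure_main_log(path):
    """Attach a single file handler to the package's top-level logger."""
    parent = logging.getLogger("itsxpress_demo")
    parent.setLevel(logging.INFO)
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
    parent.addHandler(handler)
    return handler


log_path = os.path.join(tempfile.mkdtemp(), "main.log")
handler = configure_main_log(log_path)

# The child logger has no handler of its own; its record propagates to the parent's.
submodule_work()
handler.close()

with open(log_path) as fh:
    log_text = fh.read()
```

With this arrangement the submodule never needs to know the log file path; only the entry point that configures the parent logger does.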

2 changes: 1 addition & 1 deletion itsxpress/plugin_setup.py
@@ -18,7 +18,7 @@

 plugin = Plugin(
     name='itsxpress',
-    version='2.0.1',
+    version='2.0.2',
     package='itsxpress',
     website='https://github.com/USDA-ARS-GBRU/q2_itsxpress '
             'ITSxpress: https://github.com/USDA-ARS-GBRU/itsxpress',
2 changes: 1 addition & 1 deletion meta.yaml
@@ -1,5 +1,5 @@
 {% set name = "itsxpress" %}
-{% set version = "2.0.1" %}
+{% set version = "2.0.2" %}
 {% set file_ext = "tar.gz" %}
 {% set hash_type = "sha256" %}
 {% set hash_value = "b5797107ee3f21cbaba0b9625aa931741babdee3eeb5a3218a8b8bc9783e2e72" %}
572 changes: 286 additions & 286 deletions tests/test_data/t2_r2.fq

Large diffs are not rendered by default.
