v3.3.3
- Fixed a warning related to the BLASTp `--seqidlist` parameter. For BLAST>=2.9, the TXT file with the sequence IDs is converted to binary format with `blastdb_aliastool`.
- The `Bio.Application` modules are deprecated and might be removed from future Biopython versions. Modified the function that calls MAFFT so that it uses the `subprocess` module instead of `Bio.Align.Applications.MafftCommandline`. Changed the Biopython version requirement to >=1.79.
- Added a `pyproject.toml` configuration file and simplified the instructions in `setup.py`. The use of `setup.py` as a command line tool is deprecated, and the `pyproject.toml` configuration file allows packages to be built and installed through the recommended method.
- Updated the Dockerfile to install chewBBACA with `python3 -m pip install .` instead of the deprecated `python setup.py install` command.
- Removed FASTA header integer conversion before running BLASTp. This was done to avoid a warning from BLAST related to sequence header length exceeding 50 characters.
- The seqids and coordinates of the CDSs closest to contig tips are stored in a dictionary during gene prediction to simplify LOTSC and PLOT5/3 determination (in many cases this reduces runtime by ~20%).
- Limited the number of values stored in memory while creating the `results_contigsInfo.tsv` and `results_alleles.tsv` output files to reduce memory usage.
- Data for the missing classes is now added to the FASTA and TSV files per locus, instead of storing the complete per-input data, to reduce memory usage.
- The data for novel alleles is saved to files to reduce memory usage.
- Fixed the in-frame stop codon count values displayed in the reports created by the SchemaEvaluator module.
- The `UniprotFinder` module now exits cleanly if the output directory already exists.
- Improved the information printed to stdout by the CreateSchema and AlleleCall modules, added comments, and changed variable names to better match the data being stored.
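
The sequence ID list conversion mentioned in the `--seqidlist` entry can be sketched as follows. This is a minimal illustration and not chewBBACA's actual code: the function names and file paths are hypothetical, and only the `-seqid_file_in` and `-seqid_file_out` options of `blastdb_aliastool` are assumed.

```python
import subprocess


def build_seqid_conversion_command(seqid_txt, seqid_bin):
    """Return the blastdb_aliastool call that converts a text ID list to binary.

    Hypothetical helper; the paths are supplied by the caller.
    """
    return [
        "blastdb_aliastool",
        "-seqid_file_in", seqid_txt,    # plain-text file, one sequence ID per line
        "-seqid_file_out", seqid_bin,   # binary file to pass to BLASTp afterwards
    ]


def convert_seqid_list(seqid_txt, seqid_bin):
    """Run the conversion and return the path to the binary ID list.

    Raises subprocess.CalledProcessError if blastdb_aliastool exits with an error.
    """
    command = build_seqid_conversion_command(seqid_txt, seqid_bin)
    subprocess.run(command, check=True, capture_output=True, text=True)
    return seqid_bin
```

Keeping the command construction separate from the call makes the arguments easy to inspect without BLAST installed.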
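
For the MAFFT entry, calling the aligner through `subprocess` instead of `MafftCommandline` could look roughly like the sketch below. The function names and the chosen options (`--auto`, `--thread`) are assumptions for illustration; the options chewBBACA actually passes to MAFFT are not stated in these notes.

```python
import subprocess


def build_mafft_command(input_fasta, threads=1):
    """Return an illustrative MAFFT command line (the options are assumptions)."""
    return ["mafft", "--auto", "--thread", str(threads), input_fasta]


def run_mafft(input_fasta, output_fasta, threads=1):
    """Align input_fasta with MAFFT and write the alignment to output_fasta.

    MAFFT prints the alignment to stdout, so stdout is redirected to the
    output file. Returns True if MAFFT exited with status 0.
    """
    command = build_mafft_command(input_fasta, threads)
    with open(output_fasta, "w") as outfile:
        proc = subprocess.run(command, stdout=outfile,
                              stderr=subprocess.PIPE, text=True, check=False)
    return proc.returncode == 0
```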
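
The dictionary of CDSs closest to contig tips could be built along these lines. This is only a sketch under stated assumptions: the record layout `(contig_id, seqid, start, stop)`, the `first`/`last` keys, and the function name are hypothetical, and coordinates are assumed to satisfy `start < stop`.

```python
def closest_to_tips(cds_records):
    """Map each contig to the CDSs closest to its 5' and 3' tips.

    cds_records: iterable of (contig_id, seqid, start, stop) tuples
    (hypothetical layout). Returns {contig_id: {"first": ..., "last": ...}},
    where each value is a (seqid, start, stop) tuple.
    """
    tips = {}
    for contig_id, seqid, start, stop in cds_records:
        record = (seqid, start, stop)
        # The first CDS seen on a contig is initially closest to both tips.
        entry = tips.setdefault(contig_id, {"first": record, "last": record})
        if start < entry["first"][1]:
            entry["first"] = record   # closer to the start of the contig
        if stop > entry["last"][2]:
            entry["last"] = record    # closer to the end of the contig
    return tips
```

With such a dictionary, checking whether a matched CDS sits near a contig tip becomes a single lookup per contig rather than a scan over all predicted CDSs.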