tadrep detect error #17

Open
elozanoe opened this issue Mar 18, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@elozanoe

Hi! I've been looking at this tool and I think it could be useful for what I want to do. In the file plasmids.fna I have a set of contigs that may be plasmids, and I wanted to use tadrep detect to align the contigs against the PLSDB or RefSeq databases. I have used this command:

tadrep -v -o ./prueba2 detect --genome ./prueba/plasmids.fna --min-contig-coverage 75 --min-contig-identity 75 --min-plasmid-coverage 50 --min-plasmid-identity 75

However, I get this message:

ERROR: No data available in /mnt/disk2/data/tfm/prueba2/db.json

Here is the content of the tadrep.log file:

2024-03-18 19:00:24,627 - MainProcess - INFO - MAIN - version 0.9.1
2024-03-18 19:00:24,627 - MainProcess - INFO - MAIN - command line: /home/tfm/miniconda3/envs/tadrep/bin/tadrep -v -o ./prueba2 detect --genome ./prueba/plasmids.fna --min-contig-coverage 75 --min-contig-identity 75 --min-plasmid-coverage 50 --min-plasmid-identity 75
2024-03-18 19:00:24,627 - MainProcess - INFO - CONFIG - threads=32
2024-03-18 19:00:24,627 - MainProcess - INFO - CONFIG - verbose=True
2024-03-18 19:00:24,628 - MainProcess - INFO - CONFIG - tmp-path=/tmp/tmph8tm2930
2024-03-18 19:00:24,628 - MainProcess - INFO - CONFIG - output-path=/mnt/disk2/data/tfm/prueba2
2024-03-18 19:00:24,628 - MainProcess - INFO - CONFIG - prefix=None
2024-03-18 19:00:24,628 - MainProcess - INFO - UTILS - genome-path=/mnt/disk2/data/tfm/prueba/plasmids.fna
2024-03-18 19:00:24,628 - MainProcess - INFO - CONFIG - summary_path=/mnt/disk2/data/tfm/prueba2/summary.tsv
2024-03-18 19:00:24,628 - MainProcess - INFO - CONFIG - db_path=/mnt/disk2/data/tfm/prueba2/db.json
2024-03-18 19:00:24,628 - MainProcess - DEBUG - IO - /mnt/disk2/data/tfm/prueba2/db.json NOT existing
2024-03-18 19:00:24,628 - MainProcess - DEBUG - CONFIG - No data in /mnt/disk2/data/tfm/prueba2/db.json
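
For anyone debugging the same message: the log lines above appear to follow a `timestamp - process - LEVEL - MODULE - message` layout, so the resolved paths can be pulled out of a tadrep.log programmatically. The following is a minimal Python sketch and not part of TaDReP. The function names `parse_log_line` and `collect_config` are hypothetical, and the layout is inferred only from the excerpt in this issue.

```python
import re

# Layout inferred from the tadrep.log excerpt in this issue (an assumption,
# not a documented format):
#   "<date> <time> - <process> - <LEVEL> - <MODULE> - <message>"
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+ \S+) - (?P<process>\S+) - (?P<level>[A-Z]+) - "
    r"(?P<module>[A-Z]+) - (?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def collect_config(lines):
    """Collect key=value pairs found in INFO messages, e.g. db_path=..."""
    config = {}
    for line in lines:
        record = parse_log_line(line)
        if record and record["level"] == "INFO" and "=" in record["message"]:
            key, _, value = record["message"].partition("=")
            config[key.strip()] = value.strip()
    return config


# Two lines copied from the log in this issue.
sample = [
    "2024-03-18 19:00:24,628 - MainProcess - INFO - CONFIG - db_path=/mnt/disk2/data/tfm/prueba2/db.json",
    "2024-03-18 19:00:24,628 - MainProcess - DEBUG - IO - /mnt/disk2/data/tfm/prueba2/db.json NOT existing",
]
print(collect_config(sample)["db_path"])
```

Run against the full log, this shows that the database path is derived from the output directory given with -o, which is where the tool looked for db.json.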

I think the database path is wrong. When I try to specify it with --db /mnt/disk2/databases/tadrep_db/plsdb, I get "error: unrecognized arguments: --db /mnt/disk2/databases/tadrep_db/plsdb". I also tried copying the PLSDB database to the db_path reported in the log (/mnt/disk2/data/tfm/prueba2/db.json), but it still gives an error:

cp /path/to/tadrep_db/plsdb/plsdb.json ./prueba2/db.json

tadrep -v -o ./prueba2 detect --genome ./prueba/plasmids.fna --min-contig-coverage 75 --min-contig-identity 75 --min-plasmid-coverage 50 --min-plasmid-identity 75

This time the output is:

Detection and reconstruction started ...
ERROR: No cluster in database /mnt/disk2/data/tfm/prueba2/db.json

Here is the content of the new tadrep.log file:

2024-03-18 19:07:02,203 - MainProcess - INFO - MAIN - version 0.9.1
2024-03-18 19:07:02,203 - MainProcess - INFO - MAIN - command line: /home/tfm/miniconda3/envs/tadrep/bin/tadrep -v -o ./prueba2 detect --genome ./prueba/plasmids.fna --min-contig-coverage 75 --min-contig-identity 75 --min-plasmid-coverage 50 --min-plasmid-identity 75
2024-03-18 19:07:02,203 - MainProcess - INFO - CONFIG - threads=32
2024-03-18 19:07:02,203 - MainProcess - INFO - CONFIG - verbose=True
2024-03-18 19:07:02,204 - MainProcess - INFO - CONFIG - tmp-path=/tmp/tmp2fygzok5
2024-03-18 19:07:02,204 - MainProcess - INFO - CONFIG - output-path=/mnt/disk2/data/tfm/prueba2
2024-03-18 19:07:02,204 - MainProcess - INFO - CONFIG - prefix=None
2024-03-18 19:07:02,204 - MainProcess - INFO - UTILS - genome-path=/mnt/disk2/data/tfm/prueba/plasmids.fna
2024-03-18 19:07:02,204 - MainProcess - INFO - CONFIG - summary_path=/mnt/disk2/data/tfm/prueba2/summary.tsv
2024-03-18 19:07:02,204 - MainProcess - INFO - CONFIG - db_path=/mnt/disk2/data/tfm/prueba2/db.json
2024-03-18 19:07:02,204 - MainProcess - DEBUG - IO - /mnt/disk2/data/tfm/prueba2/db.json existing
2024-03-18 19:07:14,238 - MainProcess - INFO - IO - imported json: # sequences=1
2024-03-18 19:07:14,238 - MainProcess - INFO - CONFIG - min-contig-coverage=0.750
2024-03-18 19:07:14,238 - MainProcess - INFO - CONFIG - min-contig-identity=0.750
2024-03-18 19:07:14,238 - MainProcess - INFO - CONFIG - min-plasmid-coverage=0.500
2024-03-18 19:07:14,238 - MainProcess - INFO - CONFIG - min-plasmid-identity=0.750
2024-03-18 19:07:14,238 - MainProcess - INFO - CONFIG - gap-sequence-length=10
2024-03-18 19:07:14,239 - MainProcess - INFO - CONFIG - blast-threads=32
2024-03-18 19:07:14,239 - MainProcess - DEBUG - DETECTION - No Clusters in /mnt/disk2/data/tfm/prueba2/db.json!
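
Since the second run reports "No Clusters" after importing the copied file, it may help to look at what the JSON file actually contains at its top level. The sketch below is a generic Python inspection helper and makes no assumption about the key names TaDReP expects, which are not shown in this issue. The function name `summarize_json` and the demo file contents are hypothetical.

```python
import json
import os
import tempfile


def summarize_json(path):
    """Map each top-level key to (type name, size); size is None for scalars."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, dict):
        # A list or scalar at the root: report it under a placeholder key.
        size = len(data) if isinstance(data, (list, str)) else None
        return {"<root>": (type(data).__name__, size)}
    summary = {}
    for key, value in data.items():
        size = len(value) if isinstance(value, (dict, list, str)) else None
        summary[key] = (type(value).__name__, size)
    return summary


# Demo on a throwaway file with made-up content (not a real TaDReP database).
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump({"example": {"a": 1, "b": 2}, "count": 3}, tmp)
    demo_path = tmp.name

demo_summary = summarize_json(demo_path)
os.remove(demo_path)
print(demo_summary)
```

Pointing `summarize_json` at ./prueba2/db.json would show whether the copied file has any entry that looks like clustering results, or only the raw imported data.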

What am I doing wrong? Do I need to run any earlier steps on the plasmids.fna file first? Thanks in advance!

elozanoe added the bug label Mar 18, 2024
@biobrad

biobrad commented Apr 25, 2024

I concur with the OP. The instructions on how databases are meant to be referenced and used are very confusing.
I tried keeping the database in the local output directory and also elsewhere, and I get this error in detect as well. I thought maybe I needed to run cluster first, but cluster gives an error too:

TaDReP v0.9.1
Options and arguments:
	output: /home/harbj019/genomeresults/kp518plasmids
	prefix: None
	tmp directory: /tmp/tmp_a11drss
	# threads: 24

Clustering started...
ERROR: 
Warning:
Some seqs are too long, please rebuild the program with make parameter MAX_SEQ=new-maximum-length (e.g. make MAX_SEQ=10000000)
Not fatal, but may affect results !!
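
The warning above does not say which sequences exceed the limit, or what the compiled-in limit currently is. A minimal Python sketch for listing sequence lengths in a FASTA file, so they can be compared against a chosen MAX_SEQ value, could look like this. The function names `sequence_lengths` and `longer_than` are hypothetical and the demo records are made up; the limit is whatever value you pass in.

```python
def sequence_lengths(fasta_lines):
    """Return {sequence id: length} for an iterable of FASTA lines."""
    lengths = {}
    name = None
    for raw in fasta_lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the id.
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def longer_than(lengths, limit):
    """Return the sorted ids of sequences whose length exceeds limit."""
    return sorted(seq_id for seq_id, size in lengths.items() if size > limit)


# Demo with made-up records; for a real file, pass an open file handle instead.
demo = [">p1 example plasmid", "ACGT", "AC", ">p2", "A"]
demo_lengths = sequence_lengths(demo)
print(demo_lengths, longer_than(demo_lengths, 5))
```

For a real input, `with open("plasmids.fna") as handle: sequence_lengths(handle)` gives the lengths to check against the value suggested in the warning (e.g. 10000000).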
