Add sanity check to ensure that the query sequences are contained in the query genome proteins (search.py) #60

Open
LeeBergstrand opened this issue Sep 28, 2021 · 2 comments

Comments

@LeeBergstrand
Owner

Problem Description

If a user collects their pathway proteins and their query organism's proteins from different sources (for example, UniProt and GenBank), BackBlast gives blank results because the two files use different accession systems. The query pathway file and the query organism protein file have to use the same accessions.

Problem Solution

  1. Scan the query organism proteins for the pathway proteins by accession and display an error message if any are not found.

OR

  2. Replace the pathway query file with a file containing a list of accessions from the query organism protein file. Automatically use this accession list to extract the pathway query sequences from the query organism protein file into a temporary file.
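
For reference, the first option (the accession scan) could look roughly like the sketch below. This is not BackBlast's actual code: the function names are hypothetical, and it assumes both inputs are FASTA files whose accession is the first whitespace-delimited token of each header line. Headers in other layouts (for example UniProt's `sp|ACCESSION|NAME` form) would need their own parsing.

```python
import sys


def fasta_ids(path):
    """Collect record IDs (first token after '>') from a FASTA file."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    ids.add(fields[0])
    return ids


def missing_query_accessions(pathway_fasta, organism_fasta):
    """Return the pathway accessions that are absent from the organism file."""
    return sorted(fasta_ids(pathway_fasta) - fasta_ids(organism_fasta))


def check_queries_or_exit(pathway_fasta, organism_fasta):
    """Stop with an error message if any pathway accession is missing."""
    missing = missing_query_accessions(pathway_fasta, organism_fasta)
    if missing:
        sys.exit(
            "Error: query accessions not found in the query organism "
            "proteins: " + ", ".join(missing)
        )
```

Calling `check_queries_or_exit` before any BLAST step would turn the blank-results case into an explicit error that names the missing accessions.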

@LeeBergstrand changed the title from "Add sanity check to ensure that the query sequences are contained in the query genome proteins" to "Add sanity check to ensure that the query sequences are contained in the query genome proteins (search.py)" on Sep 28, 2021

@jmtsuji
Collaborator

jmtsuji commented Oct 8, 2021

@LeeBergstrand I personally like the idea of option #2 -- it seems simpler to me. What do you think?

@LeeBergstrand
Owner Author

@jmtsuji #2 is probably a good optimization. Let's go with that.
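
A minimal sketch of how option #2 might work, under a few assumptions: the accession list is a plain text file with one accession per line, the organism proteins are in FASTA format with the accession as the first token of each header, and the function names are hypothetical rather than anything already in search.py. Raising an error for accessions that are not found also covers the sanity check from the first option.

```python
import os
import tempfile


def read_accessions(list_path):
    """Read one accession per line, skipping blank lines and '#' comments."""
    accessions = []
    with open(list_path) as handle:
        for line in handle:
            entry = line.strip()
            if entry and not entry.startswith("#"):
                accessions.append(entry)
    return accessions


def extract_queries(accession_list, organism_fasta):
    """Write the listed records to a temporary FASTA file and return its path."""
    wanted = set(read_accessions(accession_list))
    found = set()
    out = tempfile.NamedTemporaryFile("w", suffix=".faa", delete=False)
    with out, open(organism_fasta) as handle:
        keep = False
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                record_id = fields[0] if fields else ""
                keep = record_id in wanted
                if keep:
                    found.add(record_id)
            if keep:
                # Copy the header and every sequence line of a wanted record.
                out.write(line)
    missing = sorted(wanted - found)
    if missing:
        os.unlink(out.name)
        raise ValueError(
            "Accessions not found in the query organism proteins: "
            + ", ".join(missing)
        )
    return out.name
```

The caller is responsible for deleting the returned temporary file once the search has finished.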
