New Script to Download Complete MLST DB #4

Open
wants to merge 8 commits into
base: master


@jwarnn jwarnn commented Oct 11, 2024

As part of the asmngs-hackathon-2024, it was pointed out to me that the script MLST uses to download its database is a Perl script that depends on a large XML template and many placeholders. I believe that script is used in the nf-core module and by the developers of the StaphB Docker container: https://github.com/bactopia/bactopia/blob/master/modules/nf-core/mlst/update/main.nf#L27-L40 https://github.com/tseemann/mlst/blob/master/scripts/mlst-download_pub_mlst. This new script reuses some of the code from the two other scripts in this repository and downloads the needed files from the API. Every scheme with "MLST" in its description is downloaded, and the output is laid out to mirror that of the original Perl script.
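
For readers who want a picture of the selection rule described above, here is a minimal sketch of filtering scheme records by description. It is not the code from this PR: the function name `select_mlst_schemes`, the record shape (a dict with `scheme` and `description` keys), and the example URLs are all assumptions made for illustration.

```python
def select_mlst_schemes(schemes):
    """Return the scheme records whose description contains "MLST".

    `schemes` is assumed to be a list of dicts with a "description" key;
    records without one are skipped.
    """
    return [s for s in schemes if "MLST" in s.get("description", "")]


# Hypothetical records, only to show the filter in action.
example_schemes = [
    {"scheme": "https://example.org/schemes/1", "description": "MLST"},
    {"scheme": "https://example.org/schemes/2", "description": "cgMLST v1"},
    {"scheme": "https://example.org/schemes/3", "description": "Finetyping antigens"},
    {"scheme": "https://example.org/schemes/4"},  # no description at all
]

selected = select_mlst_schemes(example_schemes)
for record in selected:
    print(record["scheme"], record["description"])
```

Note that a plain substring test like this also keeps descriptions such as "cgMLST"; whether the script in this PR narrows the match further is not stated here.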

jwarnn commented Oct 11, 2024

Will ask for a review once the DB downloads and I can run diff to compare the output to the other script.

@jwarnn jwarnn marked this pull request as ready for review October 16, 2024 16:31

jwarnn commented Oct 16, 2024

I ran a diff between the output of https://github.com/tseemann/mlst/blob/master/scripts/mlst-download_pub_mlst and the output of the script from this PR, and there are a number of differences, listed below. I am not sure what the implications are for MLST, or why these schemes were removed from or changed in the API.

Folders only in old script output:
bordetella_3
diphtheria_3
ecoli
ecoli_achtman_4
kingella
klebsiella
listeria_2
mcatarrhalis_achtman_6
mgallisepticum
mgallisepticum_2
mhominis_3
senterica_achtman_2
staphlugdunensis
streptothermophilus
ypseudotuberculosis_achtman_3

Folders only in new script output:
aparagallinarum
blastocystis
blastocystis_2
brucella_2
calbicans
cglabrata
chlamydiales_38
chlamydiales_40
chlamydiales_41
ckrusei
csinensis
ctropicalis
efaecium_2
escherichia
escherichia_2
kseptempunctata
lgarvieae
mbovis
mgenitalium_2
mhominis
mhyosynoviae
oralstrep_2
oralstrep_3
plasmid
plasmid_4
plasmid_5
proteus
salmonella_2
serratia
siniae
smitis
sparasitica
tvaginalis
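
A folder-level comparison like the one above can be reproduced with Python's standard `filecmp` module. This is a sketch rather than the exact command used for the lists above; the function name `folder_differences` and the directory names in the usage note are placeholders.

```python
import filecmp
import os


def folder_differences(old_dir, new_dir):
    """Return (only_in_old, only_in_new): sorted subdirectory names
    that exist in one top-level directory but not the other."""
    comparison = filecmp.dircmp(old_dir, new_dir)
    # left_only / right_only hold names of files and directories alike,
    # so keep only the entries that are directories.
    only_old = sorted(
        name for name in comparison.left_only
        if os.path.isdir(os.path.join(old_dir, name))
    )
    only_new = sorted(
        name for name in comparison.right_only
        if os.path.isdir(os.path.join(new_dir, name))
    )
    return only_old, only_new
```

Calling `folder_differences("old_db", "new_db")` (placeholder paths) returns the two lists of scheme folders; it only looks at the top level and does not compare file contents inside folders that both outputs share.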
