I am taking some notes on how I ran cgMLST, and I hope you can add documentation for it.
Create database: this took a very long time
# Downloaded the cgMLST scheme from enterobase FTP into Salmonella.cgMLSTv2.enterobase (undocumented)
\ls -f1 Salmonella.cgMLSTv2.enterobase/*.fasta | \
  grep -v cgMLST_v2_ref.fasta `# ignore already-established reference file` | \
  xargs seqtk seq -l 0 `# cat out all the fasta contents in two-line fasta format` | \
  perl -lane '
    # Get the id (with its leading > character) and the seq on the next line,
    # since the input is in two-line fasta format
    $id=$F[0];
    $seq=<>;
    chomp($seq);
    # I do not think this will matter, but avoid any infinite loops by
    # quitting if we see the same id twice.
    # %seen is deliberately not declared with my here, so that it persists
    # across input lines.
    if($seen{$id}++){print STDERR "Already seen $id. Done."; last;}
    # Avoid deflines that might be problematic
    if($id =~ /[^_>0-9a-zA-Z]/){
      print STDERR "Skipping ".$id;
      next;
    }
    print "$id\n$seq";
  ' > enterobase.filtered.fasta
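To make it easier to see what the Perl filter above keeps and drops, here is a rough awk sketch of the same logic, run on a tiny made-up input. The function name `filter_two_line_fasta` and the toy records are my own placeholders, not part of the scheme or of any tool. It assumes strictly two-line FASTA (one defline, then one sequence line), which is what `seqtk seq -l 0` should produce.

```shell
# Hypothetical helper: reads two-line FASTA on stdin, writes filtered FASTA to stdout.
filter_two_line_fasta() {
  awk '
    # Odd lines are deflines; the id is the first whitespace-separated field.
    NR % 2 == 1 {
      id = $1
      # Stop entirely if the same id shows up a second time.
      if (seen[id]++) { print "Already seen " id ". Done." > "/dev/stderr"; exit }
      # Flag ids containing anything other than letters, digits, _ or >.
      skip = (id ~ /[^_>0-9a-zA-Z]/)
      if (skip) print "Skipping " id > "/dev/stderr"
      next
    }
    # Even lines are sequences; print the record unless it was flagged.
    !skip { print id; print $0 }
  '
}

# Toy input (made up): one clean record, one with a "|" in the id, one repeated id.
printf '>locusA_1\nACGT\n>locusB|2\nGGCC\n>locusA_1\nTTTT\n' | filter_two_line_fasta
```

On this toy input only the first record reaches stdout: `>locusB|2` is skipped because of the `|`, and the repeated `>locusA_1` stops the run. The same function could presumably be dropped into the pipeline in place of the `perl -lane` step, but I have only tried it on small inputs like this one.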
I also need this documentation.
I downloaded the cgMLST scheme for E. coli. I let the command for creating the database run for 4 days, but the machine time it used was only 1.2 hours. The machine time almost stopped increasing once it got close to 1.2 hours, so I had to stop the command.