Training the RDP classifier -c option #23

Open
ddavis3739 opened this issue Oct 26, 2018 · 0 comments

ddavis3739 commented Oct 26, 2018

RDP staff,

I am trying to retrain the RDP classifier and have an issue with the -c option. I have already prepared my sequence and taxonomy files (excerpts at the end of this post) and trained the classifier on them.

Training produced four files (below), but none of them is the properties file.

bergeyTrainingTree.xml logWordPrior.txt
genus_wordConditionalProbList.txt wordConditionalProbIndexArr.txt

Do I need to include the -c file to get the properties file? If so, I cannot find any information on how to generate it, so I was hoping you could help. According to the README, "It should at least three columns: name, rank and mean for the lowest rank taxon to be trained". What does "mean" refer to in the context of this file? Furthermore, how should I go about generating the whole file?

SEQ FILE

AB353770|AB353770.1.1740_U Root;Eukaryota;Alveolata;Dinoflagellata;Dinophyceae;Peridiniales;Kryptoperidiniaceae;Unruhdinium
ATGCTTGTCTCAAAGATTAAGCCATGCATGTCTCAGTATAAGCTTTTACATGGCGAAACTGCGAATGGCTCATTAAAACAGTTACAGTTTATTTGAAG (cont.)
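
For reference, the header line above can be split into a sequence ID and a semicolon-separated lineage. The following is a minimal sketch, assuming the ID and lineage are separated by whitespace as in the excerpt; `parse_header` is a hypothetical helper written for this post, not part of the RDP tools.

```python
def parse_header(header):
    """Split a header line into (sequence ID, list of lineage names).

    Assumes the layout shown in the excerpt: ID, whitespace, then a
    semicolon-separated lineage. A leading '>' is stripped if present.
    """
    header = header.lstrip(">").strip()
    seq_id, lineage = header.split(None, 1)
    return seq_id, [name for name in lineage.split(";") if name]


seq_id, lineage = parse_header(
    "AB353770|AB353770.1.1740_U Root;Eukaryota;Alveolata;Dinoflagellata;"
    "Dinophyceae;Peridiniales;Kryptoperidiniaceae;Unruhdinium"
)
print(seq_id)                     # AB353770|AB353770.1.1740_U
print(len(lineage), lineage[-1])  # 8 Unruhdinium
```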

TAX FILE

0*Root*-1*0*rootrank
1*Eukaryota*0*1*domain
2*Alveolata*1*2*supergroup
3*Dinoflagellata*2*3*division
4*Dinophyceae*3*4*class
5*Peridiniales*4*5*order
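
A quick way to sanity-check a taxonomy file like the excerpt above is to parse each line and confirm that every parent exists and that depths increase by one. This is only a sketch: it assumes the five `*`-separated fields are taxon ID, name, parent ID, depth, and rank (inferred from the sample lines), and `parse_tax_lines` and `check_parents` are hypothetical helpers, not RDP functions.

```python
from collections import namedtuple

# Field meanings are an assumption inferred from the sample lines.
Taxon = namedtuple("Taxon", "taxid name parent_id depth rank")


def parse_tax_lines(lines):
    """Parse '*'-separated taxonomy lines into a dict keyed by taxon ID."""
    taxa = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        taxid, name, parent_id, depth, rank = line.split("*")
        taxa[int(taxid)] = Taxon(int(taxid), name, int(parent_id), int(depth), rank)
    return taxa


def check_parents(taxa):
    """Return IDs whose parent is missing or whose depth is not parent depth + 1."""
    bad = []
    for taxon in taxa.values():
        if taxon.parent_id == -1:  # the root has no parent
            continue
        parent = taxa.get(taxon.parent_id)
        if parent is None or taxon.depth != parent.depth + 1:
            bad.append(taxon.taxid)
    return bad


sample = """\
0*Root*-1*0*rootrank
1*Eukaryota*0*1*domain
2*Alveolata*1*2*supergroup
3*Dinoflagellata*2*3*division
4*Dinophyceae*3*4*class
5*Peridiniales*4*5*order
"""
taxa = parse_tax_lines(sample.splitlines())
print(check_parents(taxa))  # [] for the excerpt above
```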

Thanks for the help

-Andrew Davis
