Training on nlp4j-ner model #25
Comments
Sorry for the late reply. The sample files are too small to train a useful model. You should get the OntoNotes data from LDC and train on the entire dataset to get a meaningful model: https://catalog.ldc.upenn.edu/LDC2013T19 Please let me know if you have trouble extracting NER tags from the original OntoNotes data once you get it. Thanks. |
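A quick way to see how small a training file is before running the trainer is to count its sentences, tokens, and tags. The following is only a minimal sketch, not part of NLP4J: it assumes a column-formatted file with one token per line, tab-separated fields, and blank lines between sentences. The function name `summarize_tsv`, the `tag_column` index, and the sample rows are all hypothetical; set the column index to whatever your own training configuration uses for the named entity tag.

```python
from collections import Counter
import io


def summarize_tsv(lines, tag_column):
    """Count sentences, tokens, and tag frequencies in a column-formatted file.

    Assumption: one token per line, tab-separated fields, blank lines
    between sentences. tag_column is a 0-based index chosen by the caller.
    """
    sentences = 0
    tokens = 0
    tags = Counter()
    in_sentence = False
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            in_sentence = False
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
        tokens += 1
        fields = line.split("\t")
        # Rows with fewer fields than expected are counted as tokens only.
        if tag_column < len(fields):
            tags[fields[tag_column]] += 1
    return sentences, tokens, tags


# Made-up sample rows for illustration; real data would be read from a file.
sample = "John\tNNP\tU-PERSON\nsleeps\tVBZ\tO\n\nMary\tNNP\tU-PERSON\n"
sentences, tokens, tags = summarize_tsv(io.StringIO(sample), tag_column=2)
print(sentences, tokens, dict(tags))
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` sample, for example `summarize_tsv(open(path, encoding="utf-8"), tag_column=...)`. If the counts come out in the tens or hundreds of sentences, the file is demo-sized.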
OntoNotes does not come in the format that you need. I have made the conversion script available, so please take a look at this page: https://github.com/emorynlp/ddr/blob/master/md/conversion.md#merge Please let me know if you have more questions. Thanks. |
Are my configuration files correct, or should I make any changes? @jdchoi77 and team |
Any updates? |
Are you trying to decode or train? These errors are coming from the decoder. If you are trying to decode, could you send me your configuration file, input file, and command you ran? |
The sample file is there only for demo and is too small to be used for training. You should feed in your own data to train NER; you can obtain a large corpus from LDC for free: |
Thank you @jdchoi77 |
Hello, the command-line status is:
sh /home/appassembler/bin/nlptrain -mode ner -c /home/config_ner_train.xml -t /home/Output.stv -d /home/Output1.stv -m /home/NLP4JMODEL/en-sam.xz
Loading ambiguity classes |
Hi |
Hi,
I need to add more data to the pre-existing model (en-ner.xz). Since that is not currently possible in Emory NLP4J, I have trained my own model (en-sam.xz) using the files below:
sam.zip
I used this command to train the model:
java edu.emory.mathcs.nlp.bin.NLPTrain -mode ner -c home/config-train-sample.xml -t /home/sample-trn.tsv -d /home/sample-dev.tsv -m /home/en-sam.xz
A new model was created.
I need to know whether I used the correct files for training.
Please help me: how can I add this new model (en-sam.xz) alongside en-ner.xz using config-decode-en.xml?
I need to load this new model in the code (nlp4j/cli/src/main/java/edu/emory/mathcs/nlp/bin/NLPDemo.java) and test it.
@jdchoi77 and team
Thanks in advance.