
Get wrong result after training with own dataset #81

Open
dagongji10 opened this issue Jun 18, 2020 · 5 comments

Comments

@dagongji10

@zhang0jhon
I used this repo to train on a handwritten-Chinese dataset, and the final loss is about 1.0. But when I use test.py for inference, the results seem wrong.

1. My dataset samples look like this:
[sample images]
I label them with a JSON file, use dataset.py to create xxx_datasets.npy as the annotation file, and use train.py to train.

2. I have 1 GPU, so I set batch_size=5 and steps_per_epoch=1376. After 100 epochs, the model loss is:
[loss curve screenshots]

3. I use test.py to run inference on a picture from the dataset, but get wrong results; the outputs are always the same value ("UNK"):
[inference output screenshot]

Does this prove AttentionOCR is not useful for handwritten Chinese?
What can I do to make it output normal results?

@JianYang93

Hi,

This is indeed interesting. I may be wrong, but I cannot think of any reason why this model cannot be used for handwritten Chinese. You may want to test the pretrained model on your dataset to get a feel for how it performs.

From what I understand, you are simply using the recognition model. Since your input images are already well-cropped text regions, you don't need the detection model. Did you resize and pad the images to 299 * 299 before feeding them into the recognition model?
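A minimal, dependency-free sketch of that kind of resize-and-pad preprocessing (illustrative only — the repo's actual preprocessing lives in text_dataflow.py and config.py, and may differ in interpolation, padding color, and placement):

```python
import numpy as np

def resize_and_pad(img, target=299):
    """Scale the longer side to `target` (nearest-neighbor, no external
    deps), then center the result on a black target x target canvas."""
    h, w = img.shape[:2]
    scale = target / max(h, w)
    nh = max(1, int(round(h * scale)))
    nw = max(1, int(round(w * scale)))
    # nearest-neighbor resize via index mapping
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]
    canvas = np.zeros((target, target, img.shape[2]), dtype=img.dtype)
    y0, x0 = (target - nh) // 2, (target - nw) // 2
    canvas[y0:y0 + nh, x0:x0 + nw] = resized
    return canvas

# A wide text crop (120 x 640) becomes a 299 x 299 padded input:
sample = np.zeros((120, 640, 3), dtype=np.uint8)
print(resize_and_pad(sample).shape)  # (299, 299, 3)
```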

I do not know the answer to your question, but there are three things I want to share:

(1) Your training set is quite small. You have batch_size=5 and steps_per_epoch=1376, so I assume you have around 6,000+ images? I used the LSVT, ReCTS, ArT and ICDAR datasets to train and had around 65,000 images. Moreover, each image may have multiple text regions. This model is quite complex, so a small dataset may not generalize well.
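The dataset-size estimate above is just batch_size × steps_per_epoch (assuming each example is seen once per epoch):

```python
batch_size = 5
steps_per_epoch = 1376
images_per_epoch = batch_size * steps_per_epoch
print(images_per_epoch)  # 6880
```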

(2) I set up a validation set while training. I got a training loss of around 1.2 in my case, but the validation loss was very high and unstable. I tested my model and the performance was very poor on both the training set and the validation set. Therefore I think a loss of 1 may not be low enough. You may want to set aside a validation set too.

(3) 'UNK' is the 0th character in the label_dict dictionary. It may be that your model is simply predicting equal probabilities for every possible character, in which case the argmax operation will return the first index, 0. Can you check those probabilities?
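A quick sketch of why a degenerate (all-equal) output always decodes to index 0, and how inspecting the top scores would reveal it (the vocabulary size of 5435 is an assumption here — check your own label_dict):

```python
import numpy as np

vocab_size = 5435  # assumed; use len(label_dict) from your own setup
# Degenerate case: the model assigns the same score to every character.
probs = np.full(vocab_size, 1.0 / vocab_size)
print(np.argmax(probs))  # 0 -> decodes to 'UNK' if it is the 0th entry

# Inspecting the top-5 scores makes a degenerate distribution obvious:
top5 = np.argsort(probs)[-5:][::-1]
print(top5, probs[top5])  # five arbitrary indices, all with equal score
```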

Please let us know if you have any more insights into the problem.

@dagongji10
Author

@JianYang93
Thanks for your reply!
I have tested the pretrained model in Docker with my dataset, and it performs like this:
[screenshot of pretrained-model output]
So I tried to retrain the recognition model. As you say, the input images are already well-cropped text regions; resizing and padding are also done according to text_dataflow.py (the function get_batch_train_dataflow does this work). But config.py sets the input image size to 256*256, not 299*299.

1. My dataset has 6,883 images, and each image has only one (cropped) text region. I plan to merge ArT into the handwritten-Chinese dataset and check whether that helps.

2. I didn't use validation; maybe I can give it a try.

3. I agree with your view that the model is simply predicting equal probabilities for every possible character, because the probabilities are all the same value: -3.4028235e+38.
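Worth noting: -3.4028235e+38 is exactly the most negative finite float32 value (-FLT_MAX), a common sentinel used to mask out logits. Seeing it for every character suggests the mask (or the decode step) is zeroing out all classes, rather than the network genuinely learning a uniform distribution. A quick check:

```python
import numpy as np

# The reported value is exactly the float32 lower bound, a typical
# "masked logit" sentinel rather than a real learned probability.
print(np.finfo(np.float32).min)  # -3.4028235e+38
print(np.float32(-3.4028235e+38) == np.finfo(np.float32).min)  # True
```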

@JianYang93

@dagongji10
No problem!
OK, so the pretrained model does not work for your handwritten dataset.
It is weird that after training the model outputs equal probabilities for all characters. I never used the code in train.py. Could it be caused by non-random initialization?

@1456416403

Hello, I downloaded the 5345.pb model, but the pretrained model's performance is really bad on handwritten Chinese. It seems that something is wrong with the model.

@xianzhe-741


Hello, I have two questions about using this repo:

1. In test.py, when using the model text_recognition_5435.pb from the author's Docker image, the call `_ = tf.import_graph_def(graph_def, name='')` raises:
InvalidArgumentError (see above for traceback): The second input must be a scalar, but it has shape [1,33]

2. Running train.py raises:
File "/usr/local/lib/python3.5/dist-packages/tensorpack/train/config.py", line 119, in __init__
assert_type(model, ModelDescBase, 'model')
File "/usr/local/lib/python3.5/dist-packages/tensorpack/train/config.py", line 107, in assert_type
name, tp.__name__, v.__class__.__name__)
AssertionError: model has to be type 'ModelDescBase', but an object of type 'AttentionOCR' found.
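One plausible cause of that AssertionError (an assumption, not confirmed from the traceback) is tensorpack being importable under two different module paths — e.g. a pip install plus a copy vendored in the repo — so that `ModelDescBase` exists as two distinct class objects and `isinstance()` fails even though the class names match. A sketch of the effect:

```python
import types

# Simulate the same library loaded twice under different module paths.
mod_a = types.ModuleType("tensorpack_a")
mod_b = types.ModuleType("tensorpack_b")
exec("class ModelDescBase:\n    pass", mod_a.__dict__)
exec("class ModelDescBase:\n    pass", mod_b.__dict__)

# The model subclasses one copy of the base class...
class AttentionOCR(mod_a.ModelDescBase):
    pass

model = AttentionOCR()
# ...but the type check uses the other copy, so it fails.
print(isinstance(model, mod_a.ModelDescBase))  # True
print(isinstance(model, mod_b.ModelDescBase))  # False -> the assert fires
```

If that is the case here, making sure only one tensorpack installation is on `sys.path` should fix it.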
