
Get wrong result after training with own dataset #81

Open
dagongji10 opened this issue Jun 18, 2020 · 5 comments

Comments

@dagongji10

@zhang0jhon
I used this repo to train on a handwritten-Chinese dataset, and the final loss is about 1.0. But when I use test.py for inference, the results seem wrong.

1. My dataset samples look like this:
[sample images]
I label them with a JSON file, use dataset.py to create xxx_datasets.npy as the annotation file, and use train.py to train.

2. I have 1 GPU, so I set batch_size=5 and steps_per_epoch=1376. After 100 epochs, the model loss is:
[loss curve screenshots]

3. I use test.py to run inference on a picture from the dataset, but get wrong results; the outputs are always the same value ("UNK"):
[inference output screenshot]

Does this prove AttentionOCR is not useful for handwritten Chinese?
What can I do to make it output normal results?

@JianYang93

Hi,

This is indeed interesting. I may be wrong, but I cannot think of any reason why this model cannot be used for handwritten Chinese. You may want to test the pretrained model on your dataset to get a feel for how it performs.

From what I understand, you are simply using the recognition model. Since your input images are already well-cropped text regions, you don't need the detection model. Did you resize and pad the images to 299 * 299 before feeding them into the recognition model?
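A minimal, dependency-free sketch of that kind of resize-and-pad preprocessing (illustrative only — the repo's actual preprocessing lives in text_dataflow.py and config.py, and may differ in interpolation, padding color, and placement):

```python
import numpy as np

def resize_and_pad(img, target=299):
    """Scale the longer side to `target` (nearest-neighbor, no external
    deps), then center the result on a black target x target canvas."""
    h, w = img.shape[:2]
    scale = target / max(h, w)
    nh = max(1, int(round(h * scale)))
    nw = max(1, int(round(w * scale)))
    # nearest-neighbor resize via index mapping
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]
    canvas = np.zeros((target, target, img.shape[2]), dtype=img.dtype)
    y0, x0 = (target - nh) // 2, (target - nw) // 2
    canvas[y0:y0 + nh, x0:x0 + nw] = resized
    return canvas

# A wide text crop (120 x 640) becomes a 299 x 299 padded input:
sample = np.zeros((120, 640, 3), dtype=np.uint8)
print(resize_and_pad(sample).shape)  # (299, 299, 3)
```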

I do not know the answer to your question, but there are three things I want to share:

(1) Your training set is quite small. You have batch_size=5 and steps_per_epoch=1376, so I assume you have around 6,000+ images? I used the LSVT, ReCTS, ArT and ICDAR datasets to train and had around 65,000 images. Moreover, each image may have multiple text regions. This model is quite complex, so a small dataset may not generalize well.
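The dataset-size estimate above is just batch_size × steps_per_epoch (assuming each example is seen once per epoch):

```python
batch_size = 5
steps_per_epoch = 1376
images_per_epoch = batch_size * steps_per_epoch
print(images_per_epoch)  # 6880
```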

(2) I set up a validation set while training. I got a training loss of around 1.2 in my case, but the validation loss was very high and unstable. I tested my model and the performance was very poor on both the training set and the validation set. Therefore I think a loss of 1 may not be low enough. You may want to set aside a validation set too.

(3) 'UNK' is the 0th character in the label_dict dictionary. It may be that your model is simply predicting equal probabilities for every possible character, in which case the argmax operation will return the first index, 0. Can you check those probabilities?
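A quick sketch of why a degenerate (all-equal) output always decodes to index 0, and how inspecting the top scores would reveal it (the vocabulary size of 5435 is an assumption here — check your own label_dict):

```python
import numpy as np

vocab_size = 5435  # assumed; use len(label_dict) from your own setup
# Degenerate case: the model assigns the same score to every character.
probs = np.full(vocab_size, 1.0 / vocab_size)
print(np.argmax(probs))  # 0 -> decodes to 'UNK' if it is the 0th entry

# Inspecting the top-5 scores makes a degenerate distribution obvious:
top5 = np.argsort(probs)[-5:][::-1]
print(top5, probs[top5])  # five arbitrary indices, all with equal score
```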

Please let us know if you have any more insights into the problem.

@dagongji10
Author

@JianYang93
Thanks for your reply!
I have tested the pretrained model in Docker with my dataset, and it performs like this:
[screenshot of pretrained-model output]
So I tried to retrain the recognition model. As you say, the input images are already well-cropped text regions; resizing and padding are also done according to text_dataflow.py (the function get_batch_train_dataflow does this work). But config.py sets the input image size to 256*256, not 299*299.

1. My dataset has 6,883 images, and each image has only one (cropped) text region. I plan to merge ArT into the handwritten-Chinese dataset and check whether that helps.

2. I didn't use validation; maybe I can give it a try.

3. I agree with your view that the model is simply predicting equal probabilities for every possible character, because the probabilities are all the same value: -3.4028235e+38.
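Worth noting: -3.4028235e+38 is exactly the most negative finite float32 value (-FLT_MAX), a common sentinel used to mask out logits. Seeing it for every character suggests the mask (or the decode step) is zeroing out all classes, rather than the network genuinely learning a uniform distribution. A quick check:

```python
import numpy as np

# The reported value is exactly the float32 lower bound, a typical
# "masked logit" sentinel rather than a real learned probability.
print(np.finfo(np.float32).min)  # -3.4028235e+38
print(np.float32(-3.4028235e+38) == np.finfo(np.float32).min)  # True
```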

@JianYang93

@dagongji10
No problem!
OK, so the pretrained model does not work for your handwritten dataset.
It is weird that after training the model outputs equal probabilities for all characters. I never used the code in train.py. Could it be caused by non-random initialization?

@1456416403

Hello, I downloaded the 5345.pb model, but the pretrained model's performance is really bad on handwritten Chinese. It seems that something is wrong with the model.

@xianzhe-741


Hello, I have two questions about using this repo:

1. In test.py, when using the model text_recognition_5435.pb from the author's Docker image, the call `_ = tf.import_graph_def(graph_def, name='')` raises:
InvalidArgumentError (see above for traceback): The second input must be a scalar, but it has shape [1,33]

2. Running train.py raises:
File "/usr/local/lib/python3.5/dist-packages/tensorpack/train/config.py", line 119, in __init__
assert_type(model, ModelDescBase, 'model')
File "/usr/local/lib/python3.5/dist-packages/tensorpack/train/config.py", line 107, in assert_type
name, tp.__name__, v.__class__.__name__)
AssertionError: model has to be type 'ModelDescBase', but an object of type 'AttentionOCR' found.
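One plausible cause of that AssertionError (an assumption, not confirmed from the traceback) is tensorpack being importable under two different module paths — e.g. a pip install plus a copy vendored in the repo — so that `ModelDescBase` exists as two distinct class objects and `isinstance()` fails even though the class names match. A sketch of the effect:

```python
import types

# Simulate the same library loaded twice under different module paths.
mod_a = types.ModuleType("tensorpack_a")
mod_b = types.ModuleType("tensorpack_b")
exec("class ModelDescBase:\n    pass", mod_a.__dict__)
exec("class ModelDescBase:\n    pass", mod_b.__dict__)

# The model subclasses one copy of the base class...
class AttentionOCR(mod_a.ModelDescBase):
    pass

model = AttentionOCR()
# ...but the type check uses the other copy, so it fails.
print(isinstance(model, mod_a.ModelDescBase))  # True
print(isinstance(model, mod_b.ModelDescBase))  # False -> the assert fires
```

If that is the case here, making sure only one tensorpack installation is on `sys.path` should fix it.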
