meaningful result? #5

Open
liuchenxjtu opened this issue Jan 20, 2016 · 15 comments

Comments

@liuchenxjtu

Hi Nicolas,
First of all, thanks a lot for your work. When I run your code, I cannot get meaningful results; all I get is output like:

INFO:lib.nn_model.train:[why ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[who ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[yeah ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[what is it ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as as i]
INFO:lib.nn_model.train:[why not ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[really ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[huh ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[yes ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[what ' s that ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as as i]
INFO:lib.nn_model.train:[what are you doing ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as as i]
INFO:lib.nn_model.train:[what are you talking about ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as as i]
INFO:lib.nn_model.train:[what happened ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as as i]
INFO:lib.nn_model.train:[hello ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[where ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[how ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[excuse me ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[who are you ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as as i]
INFO:lib.nn_model.train:[what do you want ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as as i]
INFO:lib.nn_model.train:[what ' s wrong ?] -> [i ' . . $$$ .

or

INFO:lib.nn_model.train:[what are you talking about ?] -> [i ' . . . . . . . . , , , , , ,]
INFO:lib.nn_model.train:[what happened ?] -> [i ' . . . . . . . . , , , , , ,]
INFO:lib.nn_model.train:[hello ?] -> [i ' . . . . . . . . , , , , , ,]
INFO:lib.nn_model.train:[where ?] -> [i ' . . . . . . . . , , , , , ,]
INFO:lib.nn_model.train:[how ?] -> [i ' . . . . . . . . , , , , , ,]
INFO:lib.nn_model.train:[excuse me ?] -> [i ' . . . . . . . . , , , , , ,]
INFO:lib.nn_model.train:[who are you ?] -> [i ' . . . . . . . . , , , , , ,]

Could you share your opinion with me? I'd really appreciate it.

@nextdawn

@liuchenxjtu how many iterations have you finished when you trained the model?

@liuchenxjtu

Thanks for your reply. About 20. It is very slow on my machine. How many do you suggest? Do you have some sample results for different numbers of iterations?


@nicolas-ivanov

Guys, I got similarly lame results yesterday...
My guess is that there are some fundamental problems in this approach:

  • Since word2vec vectors are used as word representations and the model returns an approximate vector for every next word, this error accumulates from one word to the next, and thus starting from roughly the third word the model fails to predict anything meaningful...
    This problem might be overcome if we replace our approximate word2vec vector at every timestep with a "correct" vector, i.e. the one that corresponds to an actual word from the dictionary (see the sketch after the examples below). Does that make sense?
    However, you need to dig into the seq2seq code to do that. @farizrahman4u could be quite helpful here.
  • The second problem relates to word sampling: even if you manage to solve the aforementioned issue, as long as you stick to using argmax() for picking the most probable word at every timestep, the answers are going to be too simple and uninteresting, like:
are you a human?            -- no .
are you a robot or human?   -- no .
are you a robot?            -- no .
are you better than siri?       -- yes .
are you here ?              -- yes .
are you human?          -- no .
are you really better than siri?    -- yes .
are you there               -- you ' re not going to be
are you there?!?!           -- yes .

Not to mislead you: these results were achieved on a different seq2seq architecture, based on TensorFlow.
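
To illustrate the first point, here is a minimal sketch of the "snapping" idea (the names are hypothetical; vocab_vectors stands for the (V, d) matrix of word2vec vectors of all dictionary words):

import numpy as np

def snap_to_vocab(predicted_vec, vocab_vectors, vocab_words):
    # cosine similarity between the model's approximate output vector
    # and every word2vec vector in the dictionary
    sims = vocab_vectors.dot(predicted_vec)
    sims /= np.linalg.norm(vocab_vectors, axis=1) * np.linalg.norm(predicted_vec) + 1e-8
    best = int(np.argmax(sims))
    # feed vocab_vectors[best] (an exact dictionary vector) back into the
    # decoder instead of predicted_vec, so the error does not accumulate
    return vocab_words[best], vocab_vectors[best]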

Sampling with temperature could be used to diversify the outputs; however, that again should be done inside the seq2seq library.
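
For reference, temperature sampling itself is simple; a sketch, where probs is the model's softmax distribution over the dictionary at one timestep:

import numpy as np

def sample_with_temperature(probs, temperature=0.7):
    # temperature < 1.0 sharpens the distribution (closer to argmax),
    # temperature > 1.0 flattens it, giving more diverse picks
    logits = np.log(np.maximum(probs, 1e-12)) / temperature
    exp = np.exp(logits - np.max(logits))
    return int(np.random.choice(len(probs), p=exp / np.sum(exp)))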

@farizrahman4u

@nicolas-ivanov Did you try the other models? Seq2seq, Seq2seq with peek, Attention Seq2seq, etc.?

@farizrahman4u

I recently tested attention seq2seq on the babi dataset and it worked (100% val acc).

@nicolas-ivanov

@farizrahman4u not yet, I'll set up the experiment with Attention Seq2seq now.
Meanwhile, could you please post a link to your dataset here? And some example results.

@farizrahman4u

The standard babi dataset from Facebook (used by Keras in its examples). I did it using a slightly different layer, but the idea is almost the same as attention seq2seq. I will post the code in a few days, as I have not tested it on all the babi tasks yet.

@liveabstract

Hello @farizrahman4u, I tried using the attention seq2seq model but got a ShapeMismatch error.
This error doesn't occur when using the SimpleSeq2seq model. Is there anything that I'm missing?

@farizrahman4u

Please post your code.

@liveabstract

@farizrahman4u: The following code is from the model.py file; I haven't changed much apart from the model name:

import os.path

from keras.models import Sequential
from seq2seq.models import AttentionSeq2seq
from seq2seq.models import SimpleSeq2seq
from seq2seq.models import Seq2seq

from configs.config import TOKEN_REPRESENTATION_SIZE, HIDDEN_LAYER_DIMENSION, SAMPLES_BATCH_SIZE, \
    INPUT_SEQUENCE_LENGTH, ANSWER_MAX_TOKEN_LENGTH, NN_MODEL_PATH
from utils.utils import get_logger

_logger = get_logger(__name__)


def get_nn_model(token_dict_size):
    _logger.info('Initializing NN model with the following params:')
    _logger.info('Input dimension: %s (token vector size)' % TOKEN_REPRESENTATION_SIZE)
    _logger.info('Hidden dimension: %s' % HIDDEN_LAYER_DIMENSION)
    _logger.info('Output dimension: %s (token dict size)' % token_dict_size)
    _logger.info('Input seq length: %s ' % INPUT_SEQUENCE_LENGTH)
    _logger.info('Output seq length: %s ' % ANSWER_MAX_TOKEN_LENGTH)
    _logger.info('Batch size: %s' % SAMPLES_BATCH_SIZE)

    model = Sequential()
    seq2seq = SimpleSeq2seq(
        input_dim=TOKEN_REPRESENTATION_SIZE,
        input_length=INPUT_SEQUENCE_LENGTH,
        hidden_dim=HIDDEN_LAYER_DIMENSION,
        output_dim=token_dict_size,
        output_length=ANSWER_MAX_TOKEN_LENGTH,
        depth=3
    )

    model.add(seq2seq)
    model.compile(loss='mse', optimizer='rmsprop')

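    # NOTE: this writes the freshly initialized weights to NN_MODEL_PATH
    # *before* the existence check below, so any previously trained weights
    # are overwritten and then immediately loaded back.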
    model.save_weights(NN_MODEL_PATH)

    # use previously saved model if it exists
    _logger.info('Looking for a model %s' % NN_MODEL_PATH)

    if os.path.isfile(NN_MODEL_PATH):
        _logger.info('Loading previously calculated weights...')
        model.load_weights(NN_MODEL_PATH)

    _logger.info('Model is built')
    return model

@tilneyyang

Hi @nicolas-ivanov, you mentioned that the bad results were obtained with TensorFlow. What dataset and other settings did you use? What were the initial perplexity and the converged perplexity on both the training and validation sets? I am trying to adapt the translation model example from TensorFlow to train a chatbot; is it possible for you to give some details on these? Thanks.

@changukshin

It was maybe due to too few training iterations or too little data.

@KevinYuk

KevinYuk commented Oct 10, 2016

Hi,
Is there anyone who could successfully run this project?

Firstly, when I run this project, I get the log below:

Epoch 1/1
32/32 [==============================] - 0s - loss: nan
Epoch 1/1
32/32 [==============================] - 0s - loss: nan
Epoch 1/1
32/32 [==============================] - 0s - loss: nan
Epoch 1/1
32/32 [==============================] - 0s - loss: nan
Epoch 1/1
32/32 [==============================] - 0s - loss: nan
INFO:lib.nn_model.train:[Hi!] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[Hi] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[what ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[why ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[who ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[yeah ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[what is it ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[why not ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[really ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[huh ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[yes ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[what ' s that ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[what are you doing ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[what are you talking about ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[what happened ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[hello ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[where ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[how ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[excuse me ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[who are you ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[what do you want ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[what ' s wrong ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[so ?] -> [raining raining raining raining raining raining]

Secondly, I changed the model code from SimpleSeq2seq to AttentionSeq2seq (see the sketch after the log below).
I found a little difference: it now reports 3s per batch instead of 0s, but the output is still wrong.
Epoch 1/1
32/32 [==============================] - 3s - loss: nan
Epoch 1/1
32/32 [==============================] - 3s - loss: nan
Epoch 1/1
32/32 [==============================] - 3s - loss: nan
Epoch 1/1
32/32 [==============================] - 3s - loss: nan
Epoch 1/1
32/32 [==============================] - 3s - loss: nan
INFO:lib.nn_model.train:[Hi!] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[Hi] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[what ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[why ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[who ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[yeah ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[what is it ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[why not ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[really ?] -> [raining raining raining raining raining raining]
INFO:lib.nn_model.train:[huh ?] -> [raining raining raining raining raining raining]
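
For reference, the change amounts to swapping the model class in model.py; a sketch, assuming the constructor arguments stay the same as in the SimpleSeq2seq version posted above (the AttentionSeq2seq import is already present there):

    seq2seq = AttentionSeq2seq(
        input_dim=TOKEN_REPRESENTATION_SIZE,
        input_length=INPUT_SEQUENCE_LENGTH,
        hidden_dim=HIDDEN_LAYER_DIMENSION,
        output_dim=token_dict_size,
        output_length=ANSWER_MAX_TOKEN_LENGTH,
        depth=3
    )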

Thanks a lot.

@lijuncheng16

@KevinYuk I got the same "raining" result! Do you have any insight?

@changukshin

Just my opinion: repeating the same words means the model is not yet fitted.
And it says 'loss: nan', which means something is wrong (a very high loss, an exploding gradient, or similar).
Please reconsider your hyperparameters (could you share them?).
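
For example, a smaller learning rate plus gradient clipping is a common first thing to try against nan loss; a sketch in Keras (the lr and clipnorm values are illustrative, not tuned):

from keras.optimizers import RMSprop

# a lower learning rate and a gradient-norm cap often stop the loss
# from diverging to nan
optimizer = RMSprop(lr=1e-4, clipnorm=1.0)
model.compile(loss='mse', optimizer=optimizer)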
