Unable to reproduce results even after doing the pre-training step for each dataset. #8
Comments
Thanks for your interest in our paper.
Thanks very much, @hadifar, and sorry for the late reply. Your advice about pre-training the autoencoder is correct; the pre-training step is important. For StackOverflow, both your pre-trained model and my own pre-trained autoencoder give good results that match your paper. My problem with reproducing the results concerns the other two datasets, Search Snippets and Biomedical. Your repo does not include pre-trained autoencoders for these two datasets, so I used your model and the data from xu2017 (https://github.com/jacoxu/STC2/tree/master/dataset), as in your paper, to pre-train an autoencoder myself. With that pre-trained model I still get the worse results I described. I wonder whether the hyperparameters are unsuitable or whether there is some other reason. Could you give me some advice on the results for these two datasets? If I have been unclear, or if the experiments use other settings, is there an email address where I can contact you? My email address is zhangkai2020c@iscas.ac.cn. I am trying some ideas for short text clustering based on your work and look forward to communicating with you.
I used the same code and did the pre-training for each dataset properly. I also ran the code for 5 steps and took the mean and standard deviation, as mentioned in the paper.
I am still unable to reproduce the results for the Search Snippets and Biomedical datasets.
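For anyone comparing numbers, the aggregation mentioned above (mean and standard deviation over repeated runs) can be sketched roughly as follows. This is only an illustration: the helper name `summarize_runs` is hypothetical, the accuracy values are placeholders rather than results from the paper or from these runs, and whether the paper reports the sample or the population standard deviation is an assumption left as a flag.

```python
import statistics


def summarize_runs(scores, sample=True):
    """Return (mean, std) of a metric collected over repeated runs.

    `sample=True` uses the sample standard deviation (n - 1 denominator);
    `sample=False` uses the population standard deviation (n denominator).
    Which one the paper uses is not stated here, so this is an assumption.
    """
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    mean = statistics.mean(scores)
    std = statistics.stdev(scores) if sample else statistics.pstdev(scores)
    return mean, std


# Placeholder accuracies for five runs (illustrative only, not real results).
placeholder_acc = [0.70, 0.72, 0.71, 0.69, 0.73]
mean_acc, std_acc = summarize_runs(placeholder_acc)
print(f"ACC: {mean_acc:.4f} +/- {std_acc:.4f}")
```

Reporting which of the two standard deviations was used, along with the random seeds, makes it easier to tell a real gap from run-to-run variance.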