Benchmark on encoder representation as comparison #1
I don't fully understand your question; there is just one encoder, trained via CPC, that encodes the patches. What do you mean by "the unsupervised encoder" and "the full CPC model representation"? Let me summarize what I did just to clarify:
I hope this clarifies your question a bit; please reply if you meant something else. Thanks for dropping by!
Sorry, I misremembered the paper for some reason. I thought the network_encoder, or g_enc in the paper, was pre-trained as a VAE, not that the whole network was trained end-to-end. I'm interested in comparing the encoder network's features against the features learned by a VAE of similar architecture.
I see, no problem. In the original paper, they compare CPC with other methods, though not with a VAE. I have some code for a VAE from another project, so I might run the experiment you mention if I get some free time. I'll keep you posted.
I want to ask two questions.
Let's focus on equation (3) in section 2.2. It describes how to measure prediction error, and this is what happens:
At this point, we can measure semantic similarity between our predictions and the actual data. Our data contains two kinds of sequences, i.e., two labels. Positive labels correspond to sorted sequences and negative labels correspond to non-sorted sequences. For the positive labels, we want CPC to produce high similarity scores, in our case a 1. For the negative labels, we want CPC to produce low similarity scores, in our case a 0. As they propose in section 2.3 of the paper, all we need to do to train CPC is apply binary cross-entropy loss between the similarity scores and the labels, done here. I hope this helps you understand my implementation. Beware that this is my own interpretation of this paper, which may or may not be completely correct.
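The explanation above can be sketched numerically. This is a minimal NumPy sketch, not the repository's actual code: it assumes a dot-product similarity squashed by a sigmoid, and the function names (`cpc_similarity_scores`, `binary_crossentropy`) are illustrative.

```python
import numpy as np

def cpc_similarity_scores(predictions, encodings):
    # Dot product between each predicted embedding and the actual
    # encoded patch; a sigmoid maps the result into [0, 1] so it can
    # be read as a similarity score.
    logits = np.sum(predictions * encodings, axis=-1)
    return 1.0 / (1.0 + np.exp(-logits))

def binary_crossentropy(labels, scores, eps=1e-7):
    # Standard BCE between the 0/1 labels and the similarity scores;
    # clipping avoids log(0).
    scores = np.clip(scores, eps, 1.0 - eps)
    return -np.mean(labels * np.log(scores)
                    + (1.0 - labels) * np.log(1.0 - scores))
```

For a positive pair (prediction aligned with the encoding) the score approaches 1 and, with label 1, the loss approaches 0; a misaligned pair with label 0 behaves symmetrically.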
Oh, you explained it so clearly that I fully understand. Thank you very much for your help.
By the way, is equation (4) in section 2.3 the `binary_crossentropy` in your code?
Why did you use binary_crossentropy? |
It would be nice to run an MLP on the encoder representation, to compare the representation learned by the unsupervised encoder against the full CPC model's representation.
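The probing experiment suggested above could look roughly like this. This is a hedged NumPy sketch on synthetic data: the random projection standing in for g_enc, the toy labels, and all variable names are assumptions for illustration; in practice you would load the CPC-trained encoder weights and keep them frozen while training the probe.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen encoder standing in for g_enc: a fixed random
# projection followed by tanh. Not the real trained encoder.
W_enc = rng.normal(size=(10, 8)) / np.sqrt(10)

def frozen_encoder(x):
    return np.tanh(x @ W_enc)

# Toy labelled data: the class depends on a linear direction in input space.
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

Z = frozen_encoder(X)  # features from the frozen encoder

# Simple probe (logistic regression) trained on the frozen features.
w = np.zeros(Z.shape[1])
b = 0.0
lr = 0.5
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))  # predicted probabilities
    grad = p - y                            # gradient of BCE w.r.t. logits
    w -= lr * (Z.T @ grad) / len(y)
    b -= lr * grad.mean()

# Probe accuracy on the training set: a rough measure of how much
# label-relevant information the frozen features retain.
p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
acc = float(np.mean((p > 0.5) == y))
```

Comparing this accuracy for features taken from g_enc alone versus from the full CPC context network would answer the question posed in the issue.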