Hi, I’m struggling to reproduce the work. When I start training following the process in this repository, the loss decreases rapidly and seems to be approaching convergence, yet the model fails to reconstruct images. Does this make sense?
Based on the information and the screenshot you provided:
Are you using images of size 128? The VQGAN provided is not robust for images below 256.
Even though the loss drops quickly, it seems you are showing results after only 7,500 iterations. The model needs many more updates to generate good-quality images. The loss drops quickly at the beginning mainly because the model first learns to copy-paste the unmasked tokens. Of course, I don't know the other hyperparameters you are using, but factors such as batch size, learning rate, or model size can drastically influence training.
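To make the "copy-paste" point concrete, here is a minimal sketch (not the code from this repo) of a MaskGIT-style masked-token training step, under the assumption that the cross-entropy is averaged over all token positions; the function and variable names (`mask_token_id`, `min_mask_ratio`, etc.) are illustrative. Unmasked positions are trivial to reproduce from the input, so the averaged loss falls fast long before the masked-token predictions are any good; monitoring the loss on masked positions only gives a more honest picture.

```python
import torch
import torch.nn.functional as F

def training_step(model, tokens, mask_token_id, min_mask_ratio=0.1):
    """Sketch of one masked-token modeling step.

    tokens: (B, N) long tensor of discrete VQGAN indices.
    model:  transformer returning logits of shape (B, N, vocab_size).
    """
    B, N = tokens.shape

    # Sample a masking ratio per sample and build a random boolean mask.
    ratio = torch.empty(B, 1, device=tokens.device).uniform_(min_mask_ratio, 1.0)
    mask = torch.rand(B, N, device=tokens.device) < ratio  # True = masked

    # Replace masked positions with the special mask token.
    inputs = tokens.clone()
    inputs[mask] = mask_token_id

    logits = model(inputs)  # (B, N, vocab_size)

    # Averaged over ALL positions: the unmasked ones are trivial copies of the
    # input, so this loss drops quickly early in training.
    loss_all = F.cross_entropy(logits.transpose(1, 2), tokens)

    # Restricted to masked positions: reflects how well the model actually
    # infers missing tokens, which is what matters for generation quality.
    loss_masked = F.cross_entropy(logits[mask], tokens[mask])

    return loss_all, loss_masked
```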
Thanks for your reply. Indeed, the image size is set to 128 for faster training. I will follow the technical report you released and make another attempt. Thanks again!