The downsampling ratio of VAE in maisi #1841
DopamineLcy asked this question in Q&A (unanswered)

The downsampling ratio of the VAE in maisi is 4, i.e. a [128, 128, 128] patch results in a [32, 32, 32] latent. However, 32^3 = 32768 is very large for a transformer-based model. Is there a pre-trained VAE with a larger downsampling ratio, such as 8 ([128, 128, 128] -> [16, 16, 16])? Thank you!
Replies: 1 comment

We currently do not have an 8x8x8 downsampling VAE. Thank you for letting us know that this is desired!
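
To make the size gap in the question concrete, here is a back-of-the-envelope sketch in plain Python (not MAISI or MONAI code; the helper name and defaults are purely illustrative) comparing the latent token counts implied by 4x and 8x isotropic downsampling of a [128, 128, 128] patch:

```python
# Illustrative only: compares latent grid sizes and token counts for
# different isotropic VAE downsampling factors on a [128, 128, 128] patch.

def latent_tokens(patch_size=(128, 128, 128), factor=4):
    """Latent spatial shape and total token count for an isotropic
    downsampling factor (hypothetical helper, not a MAISI API)."""
    latent_shape = tuple(s // factor for s in patch_size)
    num_tokens = 1
    for s in latent_shape:
        num_tokens *= s
    return latent_shape, num_tokens

for factor in (4, 8):
    shape, tokens = latent_tokens(factor=factor)
    print(f"downsample x{factor}: latent {shape} -> {tokens} tokens")

# downsample x4: latent (32, 32, 32) -> 32768 tokens
# downsample x8: latent (16, 16, 16) -> 4096 tokens
```

An 8x factor would cut the sequence length a transformer operates on from 32768 to 4096 tokens, which is why the larger ratio is attractive here.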