Hi,
I am trying to implement a 2D ControlNet diffusion model in the latent space. I follow these steps:
1> First, train an autoencoder (this works great: the MS-SSIM between the original image and the autoencoder reconstruction is 0.97; a simplified evaluation sketch is below).
Result (left: original image, right: autoencoder reconstruction). Autoencoder code file: autoencoder_config.py
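For reference, this is roughly how I compute the reconstruction score. The names (`autoencoder.encode` / `autoencoder.decode`) are illustrative stand-ins; the real code is in autoencoder_config.py:

```python
# Illustrative sketch only -- real training code lives in autoencoder_config.py.
import torch
from torchmetrics.image import MultiScaleStructuralSimilarityIndexMeasure

ms_ssim = MultiScaleStructuralSimilarityIndexMeasure(data_range=1.0)

@torch.no_grad()
def reconstruction_score(autoencoder, images):
    """images: (B, C, H, W) tensor scaled to [0, 1]."""
    latents = autoencoder.encode(images)        # image -> latent space
    recons = autoencoder.decode(latents)        # latent -> image space
    return ms_ssim(recons.clamp(0, 1), images)  # ~0.97 on my validation set
```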
2> Train an unconditional diffusion model in the latent space (works OK: the quality of the generated images is not great, but it works; a simplified training-step sketch is below). Diffusion model code file: diffusion_config.py
Some generated images:
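For reference, a single training step looks roughly like this. I am assuming a DDPM-style scheduler with an `add_noise` method; all names here are illustrative, and the real code is in diffusion_config.py:

```python
# Illustrative sketch only -- real training code lives in diffusion_config.py.
import torch
import torch.nn.functional as F

def diffusion_train_step(unet, autoencoder, scheduler, images, optimizer):
    with torch.no_grad():
        latents = autoencoder.encode(images)    # diffusion runs in latent space

    noise = torch.randn_like(latents)
    timesteps = torch.randint(
        0, scheduler.num_train_timesteps, (latents.shape[0],), device=latents.device
    )
    noisy_latents = scheduler.add_noise(latents, noise, timesteps)  # forward process

    noise_pred = unet(noisy_latents, timesteps)  # UNet predicts the added noise
    loss = F.mse_loss(noise_pred, noise)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```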
3> Now, train a ControlNet on top of the autoencoder and unconditional diffusion model from the previous steps (simplified training-step sketch at the end of this post).
This is where the problem is. Right now I am not concerned about image quality, only about the implementation of the conditioning.
Shown below are two pairs of input conditioning masks and the corresponding generated images. The generated images completely disregard the geometry and features of the input masks.
Do you know what is happening here? ControlNet code file: controlnet_config.py
Everything was working fine when I was training in image space rather than latent space.
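For context, one latent-space ControlNet training step looks roughly like this. All names are illustrative (the residual-injection keywords mirror a diffusers-style API), and the real code is in controlnet_config.py. The one part that is genuinely different from the image-space setup is marked in the comments: the mask lives at image resolution while the UNet runs at latent resolution, so the conditioning has to be brought down to the latents' spatial size, either by an explicit resize as sketched here or by a conditioning embedding that downsamples:

```python
# Illustrative sketch only -- real training code lives in controlnet_config.py.
import torch
import torch.nn.functional as F

def controlnet_train_step(controlnet, unet, autoencoder, scheduler,
                          images, masks, optimizer):
    with torch.no_grad():
        latents = autoencoder.encode(images)

    noise = torch.randn_like(latents)
    timesteps = torch.randint(
        0, scheduler.num_train_timesteps, (latents.shape[0],), device=latents.device
    )
    noisy_latents = scheduler.add_noise(latents, noise, timesteps)

    # NOTE: in image space the mask and the UNet input had the same spatial size.
    # In latent space they do not: the mask must be resized to the latents'
    # resolution (or downsampled by a learned conditioning embedding), otherwise
    # the control features cannot line up with the UNet features.
    cond = F.interpolate(masks, size=latents.shape[-2:], mode="nearest")

    down_res, mid_res = controlnet(noisy_latents, timesteps, controlnet_cond=cond)
    noise_pred = unet(
        noisy_latents, timesteps,
        down_block_additional_residuals=down_res,
        mid_block_additional_residual=mid_res,
    )
    loss = F.mse_loss(noise_pred, noise)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```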