Requesting Comment from dev (AUTO) regarding Forge #15691
altoiddealer started this conversation in Optimization
-
@AUTOMATIC1111 Any thoughts on porting Forge's memory management to the 'main' WebUI? Especially since Forge seems to have been abandoned for the past two months.
-
stable-diffusion-webui-forge said 3 months ago:
If you use common GPU like 8GB vram, you can expect to get about 30~45% speed up in inference speed (it/s), the GPU memory peak (in task manager) will drop about 700MB to 1.3GB, the maximum diffusion resolution (that will not OOM) will increase about 2x to 3x, and the maximum diffusion batch size (that will not OOM) will increase about 4x to 6x.
If you use less powerful GPU like 6GB vram, you can expect to get about 60~75% speed up in inference speed (it/s), the GPU memory peak (in task manager) will drop about 800MB to 1.5GB, the maximum diffusion resolution (that will not OOM) will increase about 3x, and the maximum diffusion batch size (that will not OOM) will increase about 4x.
If you use powerful GPU like 4090 with 24GB vram, you can expect to get about 3~6% speed up in inference speed (it/s), the GPU memory peak (in task manager) will drop about 1GB to 1.4GB, the maximum diffusion resolution (that will not OOM) will increase about 1.6x, and the maximum diffusion batch size (that will not OOM) will increase about 2x.
If you use ControlNet for SDXL, the maximum ControlNet count (that will not OOM) will increase about 2x, the speed with SDXL+ControlNet will speed up about 30~45%.
Another very important change that Forge brings is Unet Patcher. Using Unet Patcher, methods like Self-Attention Guidance, Kohya High Res Fix, FreeU, StyleAlign, Hypertile can all be implemented in about 100 lines of codes.
Thanks to Unet Patcher, many new things are possible now and supported in Forge, including SVD, Z123, masked Ip-adapter, masked controlnet, photomaker, etc.
No need to monkeypatch UNet and conflict other extensions anymore!
Forge also adds a few samplers, including but not limited to DDPM, DDPM Karras, DPM++ 2M Turbo, DPM++ 2M SDE Turbo, LCM Karras, Euler A Turbo, etc. (LCM is already in original webui since 1.7.0).
Finally, Forge promise that we will only do our jobs. Forge will never add unnecessary opinioned changes to the user interface. You are still using 100% Automatic1111 WebUI.
Given that these claims are all legitimate, I expected some sort of response from Automatic1111 to address or acknowledge stable-diffusion-webui's shortcomings in memory efficiency and performance compared to this implementation.
Alas, here we are months down the road, and I find myself wondering: what the heck happened? It seemed like Forge was dropped like a bomb that would send shockwaves and propagate change. If not that, then at least some response.
ANYWAY, I know personally that Forge makes dramatic backend changes on a fundamental level, so it's not something that could feasibly be ported to A1111 while maintaining your consistent and substantial update schedule, resolving Issues, and merging PRs - just generally maintaining great development practices.
I am confident that I speak for many users when I say that I succumbed to the temptations Forge had to offer and tasted its sweet, sweet fruit, and it is indeed juicy. On a whim, I decided to try a quick image on SD after 3 months, and right away the first image went OOM just trying to HR Fix at 1.5 scale. It's devastating to me.
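For a rough sense of why even a 1.5x HR Fix pass can tip a card into OOM, here is a back-of-the-envelope sketch. It assumes a 512x512 base image (the post does not state the actual resolution) and the 8x spatial downsampling of the Stable Diffusion VAE. The function names are made up for illustration and are not part of either WebUI.

```python
# Back-of-the-envelope scaling for a hires-fix pass (illustrative only).
# Assumptions (not from the post): a 512x512 base image, and the 8x
# per-side downsampling of the Stable Diffusion VAE.

VAE_DOWNSCALE = 8  # SD latents are 1/8 of the pixel resolution per side


def latent_tokens(width: int, height: int) -> int:
    """Number of latent positions the UNet processes for one image."""
    return (width // VAE_DOWNSCALE) * (height // VAE_DOWNSCALE)


def hires_cost(base_w: int, base_h: int, scale: float) -> dict:
    """Compare latent sizes before and after upscaling by `scale`."""
    up_w, up_h = int(base_w * scale), int(base_h * scale)
    base = latent_tokens(base_w, base_h)
    up = latent_tokens(up_w, up_h)
    ratio = up / base
    return {
        "upscaled_size": (up_w, up_h),
        # Work that grows linearly with the number of latent positions.
        "token_ratio": ratio,
        # Size of a fully materialized self-attention matrix (tokens x tokens).
        "naive_attention_ratio": ratio ** 2,
    }


if __name__ == "__main__":
    print(hires_cost(512, 512, 1.5))
```

Under these assumptions, the 1.5x pass processes 2.25x as many latent positions, and a naively materialized attention matrix would be about 5x larger. Memory-efficient attention backends avoid storing that full matrix, so the real increase depends on the backend and on how the UI manages model weights, which is exactly the part Forge changed.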
SO! I just wanted to reach out and see if the man, the myth, the legend could write a few words about their thoughts on Forge and what it means (or "meant" at this rate...) for stable-diffusion-webui. Thanks!