
Huge VRAM usage with start/stop #154

Open · slashedstar opened this issue Mar 8, 2024 · 3 comments
@slashedstar

[screenshot]
Is this expected when using start/stop? I was getting OOM errors and had to change a setting in the NVIDIA Control Panel to allow fallback to system RAM, but with that enabled the it/s drops sharply: I go from 6 it/s to 1.5 it/s after the LoRA is stopped/started by the extension. (I'm on Forge, b9705c58f66c6fd2c4a0168b26c5cf1fa6c0dde3.)
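
For context on where the extra VRAM goes: when the extension stops/starts the LoRA it re-applies the weight patches to the UNet mid-sampling (that is the patch_model call in the traceback below), and building each patched weight needs temporary full-size copies, including fp32 upcasts, alongside the copy already resident on the GPU. Below is a minimal sketch of measuring that spike with plain PyTorch; the layer shape and the pre-merged delta are made-up stand-ins, not Forge's actual patching code.

import torch

def patched_copy(weight: torch.Tensor, delta: torch.Tensor) -> torch.Tensor:
    # The patched weight is built from fp32 temporaries while the original
    # stays resident, so peak VRAM for this layer spikes well above its size.
    return (weight.float() + delta.float()).to(weight.dtype)

if torch.cuda.is_available():
    dev = torch.device("cuda")
    # Arbitrary stand-in shape; the delta is not a real LoRA, just noise.
    w = torch.randn(10240, 1280, dtype=torch.float16, device=dev)
    d = torch.randn_like(w) * 0.01
    torch.cuda.reset_peak_memory_stats(dev)
    base = torch.cuda.memory_allocated(dev)
    patched = patched_copy(w, d)
    peak = torch.cuda.max_memory_allocated(dev)
    print(f"extra VRAM during the patch: {(peak - base) / 2**20:.1f} MiB")

On an 8 GiB card that is already close to full, even a few tens of MiB of transient allocations per layer is enough to trip the OOM shown in the logs further down.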

@hako-mikan (Owner)

Does this issue also occur with the latest version of Forge?
I tested it and did not encounter any problems.

@slashedstar (Author)

[screenshot]
Brand-new installation: I just git cloned Forge, started it, and installed the extension. The OOM happens with SDXL but not with 1.5, though it's still able to complete the image.

With start=10

Moving model(s) has taken 1.59 seconds
 55%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                                                         | 11/20 [00:01<00:01,  6.03it/s]ERROR diffusion_model.output_blocks.0.1.transformer_blocks.2.ff.net.0.proj.weight CUDA out of memory. Tried to allocate 50.00 MiB. GPU 0 has a total capacty of 8.00 GiB of which 0 bytes is free. Of the allocated memory 7.06 GiB is allocated by PyTorch, and 218.22 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
ERROR diffusion_model.output_blocks.0.1.transformer_blocks.3.ff.net.0.proj.weight CUDA out of memory. Tried to allocate 50.00 MiB. GPU 0 has a total capacty of 8.00 GiB of which 0 bytes is free. Of the allocated memory 7.13 GiB is allocated by PyTorch, and 177.43 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
*** Error executing callback cfg_denoiser_callback for E:\blankforge\stable-diffusion-webui-forge\extensions\sd-webui-lora-block-weight\scripts\lora_block_weight.py
    Traceback (most recent call last):
      File "E:\blankforge\stable-diffusion-webui-forge\modules\script_callbacks.py", line 233, in cfg_denoiser_callback
        c.callback(params)
      File "E:\blankforge\stable-diffusion-webui-forge\extensions\sd-webui-lora-block-weight\scripts\lora_block_weight.py", line 455, in denoiser_callback
        shared.sd_model.forge_objects.unet.patch_model()
      File "E:\blankforge\stable-diffusion-webui-forge\ldm_patched\modules\model_patcher.py", line 216, in patch_model
        out_weight = self.calculate_weight(self.patches[key], temp_weight, key).to(weight.dtype)
    torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 26.00 MiB. GPU 0 has a total capacty of 8.00 GiB of which 0 bytes is free. Of the allocated memory 7.13 GiB is allocated by PyTorch, and 177.61 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

---
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:04<00:00,  4.48it/s]
To load target model AutoencoderKL██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:04<00:00,  5.41it/s]
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  1908.81689453125
[Memory Management] Model Memory (MB) =  159.55708122253418
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  725.2598133087158
Moving model(s) has taken 0.11 seconds
Total progress: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:04<00:00,  4.11it/s]
Total progress: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:04<00:00,  5.41it/s]

(This was to generate a single 512x512 image.)
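
The OOM messages above suggest tuning the caching allocator via PYTORCH_CUDA_ALLOC_CONF / max_split_size_mb. As a sketch only (the 128 value is an untested guess, not a confirmed fix for this issue), the variable has to be set before PyTorch initializes CUDA:

# Sketch: apply the allocator hint from the OOM message before torch
# touches CUDA. 128 MiB is an arbitrary starting point, not a verified fix.
import os
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")

import torch  # imported after the env var so the caching allocator picks it up

In a Forge install the equivalent would be a set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 line in webui-user.bat before launching.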

@PANyZHAL commented Apr 8, 2024

Same problem on version f0.0.17v1.8.0rc-latest-276-g29be1da7.
