Windows + AMD GPUs (DirectML) #7870
Replies: 86 comments 237 replies
-
But training doesn't work because |
-
I've had 4 people test this on RX 6000 series cards, and 2 of them had to copy these: https://github.com/crowsonkb/k-diffusion & … Based on the feedback I received, there appears to be a memory leak with the above method for the people who had to do it that way. For all 4 people, using a batch size above 1 seems to max out GPU memory or cause other issues. The bigger the batch size, the worse the apparent leak gets with each subsequent image after the first.
-
set COMMANDLINE_ARGS=--opt-split-attention --medvram --disable-nan-check --autolaunch
-
Hi, so I've got it working on a 7900 XTX, but I don't know if I did something wrong. Using those 2 repos: Windows 11, 23.2.1 drivers. I can't get anything higher than 640x512 (if I'm lucky; more often 512x512) to run fast on the first run. Trying a higher resolution gives me either a runtime error or a random error that I think is VRAM-related (I couldn't reproduce it). Sometimes it works, but at 10 s/it or slower. Runtime error here (trying to run 1024x1024):
-
I eventually got it to run Stable Diffusion 2.1 on an RX 580 8 GB. I opened this issue about the problems I encountered and how I got it to work:
-
I had this error:
And I fixed it by adding |
-
Images are generated successfully, but live previews are not displayed. My environment:
-
I have found that if the batch size is set above 1 for the first run, generation stays faster even after the batch size is changed back to 1.
-
I got it set up successfully following what @Shad0wyDr3amz listed above. I just came from a Linux install, and this seems to be substantially slower. My environment:
Not sure why it is slower (I'm a bit of a dumbass with this stuff), but it is working.
-
There is a notice: "Memory optimization for DirectML is disabled. Because this is not Windows platform." I don't know why.
-
Has anyone encountered a similar error? Please help a newbie.
-
I've got it working on a 5700 XT. Using --no-half --medvram I get around 2.2 s/it. If I use --opt-sub-quad-attention --medvram I can get as low as 1.5 s/it, but I regularly hit "modules.devices.NansException: A tensor with all NaNs was produced in VAE. This could be because there's no enough precision to represent this picture. Try adding --no-half-vae commandline to fix this." Also, Dreambooth still seems to use only the CPU. I got it to run on the GPU once, until an error popped up about running out of VRAM. Now it just uses the CPU.
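To put the two reported rates in perspective, sampling time is roughly the step count times the seconds per iteration. This is only a sketch: the 20-step count is an assumption for illustration (it is not stated in the post), and model loading and VAE decoding are ignored.

```python
def sampling_seconds(steps: int, seconds_per_it: float) -> float:
    """Rough sampling time: number of steps times seconds per iteration."""
    return steps * seconds_per_it


STEPS = 20  # assumed step count, not taken from the post

# Rates as reported above for the 5700 XT.
for label, rate in [("--no-half --medvram", 2.2),
                    ("--opt-sub-quad-attention --medvram", 1.5)]:
    print(f"{label}: about {sampling_seconds(STEPS, rate):.0f} s for {STEPS} steps")
```

At these rates the faster flag set saves about a third of the sampling time, which has to be weighed against the NaN errors it triggers.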
-
This works! I'd been having a ton of problems trying to use the ROCm webui version on Linux, and I thought there was just no good way to get webui running on an AMD GPU with Windows, but this works just fine. Thanks for posting this. GPU: 5700 XT.
-
Pulled and installed on 3/3/2023. Everything works fine with default settings. LoRA worked well. ControlNet does not seem to work when trying to get pose info from an image; I guess I am missing CUDA. One strange issue: Will play around with optimization flags, "Hardware GPU scheduling" and "browser hardware acceleration" to see if there is any performance impact. I also played with Nod.AI. It seems good and fast, but it requires additional compilation (disk space and CPU time) and, most importantly, is missing features like extensions, LoRA, ControlNet, etc.
-
I'm not really used to GitHub; can someone explain how to update? Do I just run git clone https://github.com/lshqqytiger/stable-diffusion-webui-directml && cd stable-diffusion-webui-directml && git submodule init && git submodule update in the stable-diffusion root directory? I've seen someone mention using git pull, but when I run it I get "Please commit your changes or stash them before you merge." I had to do a reinstall for an unrelated Windows issue, and now I can't even do 512x768 without running out of memory. This is when running with --opt-sub-quad-attention --no-half --precision full --medvram --disable-nan-check. Also, several of my models won't run with ControlNet. For example, with IlluminatiDiffusion I get "RuntimeError: mat1 and mat2 shapes cannot be multiplied (77x1024 and 768x320)" while using the canny model and canny preprocessor.
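On the update question: the "Please commit your changes or stash them" message means git pull found locally modified files in the existing clone, so re-cloning should not be necessary. One common way around it is to stash the local edits, pull, refresh the submodules, and re-apply the stash. The sketch below only prints the commands by default; the directory name is a hypothetical example, and whether stashing suits your local changes is an assumption you should check first.

```python
import subprocess

# Git commands for updating a checkout that has local modifications.
UPDATE_STEPS = [
    ["git", "stash"],                                         # set local edits aside
    ["git", "pull"],                                          # fetch and merge upstream changes
    ["git", "submodule", "update", "--init", "--recursive"],  # sync submodules
    ["git", "stash", "pop"],                                  # re-apply local edits
]


def update_checkout(repo_dir: str, dry_run: bool = True) -> list[str]:
    """Print each update step and, unless dry_run is set, run it inside repo_dir."""
    shown = []
    for cmd in UPDATE_STEPS:
        line = " ".join(cmd)
        print(line)
        shown.append(line)
        if not dry_run:
            # check=True stops at the first failing command, e.g. a merge
            # conflict, or "git stash pop" when nothing was stashed.
            subprocess.run(cmd, cwd=repo_dir, check=True)
    return shown


if __name__ == "__main__":
    # Hypothetical path; point this at your existing clone.
    update_checkout("stable-diffusion-webui-directml")
```

The same four commands can simply be typed into a terminal opened in the clone's directory.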
-
I'm sorry if the formatting is bad; the site's arrangement isn't intuitive. Is it possible to train LoRAs, or do any sort of training for that matter (embeddings or models)? I have an RX 5700 XT which, if I understood correctly, is somewhere at the limit of what is possible with its VRAM. (I might be misunderstanding something here.)
-
COMMANDLINE_ARGS=--no-half --always-batch-cond-uncond --opt-sub-quad-attention --lowvram --disable-nan-check --autolaunch --theme dark
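For readers unsure what each of these flags does, here is a small sketch that annotates the list and rebuilds the line in the `set COMMANDLINE_ARGS=...` form used in webui-user.bat. The notes are paraphrased from memory of the web UI's command-line help, so treat them as a rough summary rather than a specification; `FLAG_NOTES` and `build_args_line` are hypothetical helper names, not part of the web UI.

```python
# Short notes on the flags in the line above (paraphrased, not authoritative).
FLAG_NOTES = {
    "--no-half": "do not switch the model to 16-bit floats",
    "--always-batch-cond-uncond": "always batch cond and uncond together, even with --medvram/--lowvram",
    "--opt-sub-quad-attention": "memory-efficient sub-quadratic cross-attention",
    "--lowvram": "aggressive VRAM savings, slower than --medvram",
    "--disable-nan-check": "skip the check for NaNs in generated tensors",
    "--autolaunch": "open the UI in the default browser on startup",
    "--theme": "UI theme (takes a value, for example dark)",
}


def build_args_line(flags):
    """Return a 'set COMMANDLINE_ARGS=...' line in webui-user.bat form."""
    return "set COMMANDLINE_ARGS=" + " ".join(flags)


flags = ["--no-half", "--always-batch-cond-uncond", "--opt-sub-quad-attention",
         "--lowvram", "--disable-nan-check", "--autolaunch", "--theme dark"]

for flag in flags:
    name = flag.split()[0]  # drop a trailing value such as "dark"
    print(f"{name}: {FLAG_NOTES.get(name, 'no note')}")
print(build_args_line(flags))
```

Swapping `--lowvram` for `--medvram` in the list is the usual first experiment on cards with more memory, since `--lowvram` trades speed for the lowest memory use.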
-
Hello everyone. Happy to say SD is working with my PC build. Currently generating at 512x768.
-
I'm getting an error I haven't seen anyone else get, and Googling it isn't helping me :( . Any ideas? AMD 6700 XT.
-
I don't know what I'm doing wrong. I've been looking for 3 consecutive days now. When I press the Generate button in SD, it immediately fails. I check the log and it says (RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'). Easy fix, I think, and add
Without --no-half:
With --no-half: (It takes 5 minutes for one image! That's enough time to cook an entire cup of instant noodles each time!)
-
Windows 10 - Ryzen CPU and AMD Radeon Pro WX 9100 (24 GB VRAM). Thanks in advance, because I'm at a loss as to why it would bluescreen. This CPU has an integrated Vega in it; could this be the issue? OPT: --opt-sub-quad-attention --disable-nan-check --autolaunch
-
How do I check which GPU is being used?
Unfortunately the WX9100 is also Vega based :(, so this might be the issue.
On 4 Jun 2023, at 06:47, Seunghoon Lee wrote:
Could you check which GPU works for stable diffusion? Vega or Radeon Pro?
DirectML has BSOD issue with Vega.
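On checking which GPU is in use: one option is to list the adapters DirectML exposes from Python. This is a sketch that assumes the torch-directml package is installed in the web UI's Python environment (run it from that environment); it returns an empty list when the package is missing. It shows which adapters exist and their order, not which one the web UI actually picked.

```python
def list_directml_devices():
    """Return [(index, name), ...] for DirectML adapters, or [] if torch-directml is missing."""
    try:
        import torch_directml  # provided by the torch-directml package
    except ImportError:
        return []
    return [(i, torch_directml.device_name(i))
            for i in range(torch_directml.device_count())]


if __name__ == "__main__":
    devices = list_directml_devices()
    if not devices:
        print("torch-directml not installed or no adapters found")
    for index, name in devices:
        print(index, name)
```

Watching the per-GPU graphs in the Performance tab of Windows Task Manager during a generation is a simpler cross-check of which adapter is doing the work.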
-
Thanks, though it may be worth trying to identify which GPU is used (can I just disable the Ryzen iGPU in Device Manager to test?). This GPU is quite fussy with Intel boards... on my 12th-gen rig it simply won't POST :(.
Because my WX 9100 has 16 GB of VRAM (not 24; that was a mistake on my part), it shouldn't be a low-VRAM issue like the one mentioned in that DirectML thread.
If it still BSODs, I should probably give up on that GPU and see if I can resell it, but that's going to be tricky given its apparent inability to play nice with Intel mobos.
Thx.
On 4 Jun 2023, at 11:45, Seunghoon Lee wrote:
Oh, I see. I think you should wait until this issue is fixed. microsoft/DirectML#379
-
Has anyone already got Dreambooth or LoRA training working (on a 12 GB VRAM GPU)? When I start training I get the error: Exception training model: ''H:\Dreambooth\01.jpg' is not in list'. And yes, nothing is wrong with my images. The alternatives to A1111 that I could find all need CUDA to work. I would really like to train my own model on my PC.
-
I have to say, lshqqytiger, Sir, you are a legend. I also have a 5700 XT and, all things considered, it works a treat, as much as can be expected with AMD anyway. I had tried Linux, but it was an epic fail with my card; ROCm just didn't work. So thank you, you have now fuelled my new addiction. Anyone having memory issues on older cards: try --lowvram instead of --medvram and see if that helps. Cheers.
-
Post a comment if you got @lshqqytiger's fork working with your GPU.
It's good to see whether it works on a variety of GPUs.
I tested LoRAs and the ControlNet extension, and they work.
Commit used: lshqqytiger@dce51c5
Modified modules used:
https://github.com/lshqqytiger/k-diffusion-directml
https://github.com/lshqqytiger/stablediffusion-directml