[Tester Needed] Improve SD performance by disabling Hardware GPU scheduling #3889
-
I tested this previously. IIRC it moved me from 22 it/s to 28 it/s on a 4090.
-
Moved my 4090 from 14.69 it/s to 18.98 it/s. Thanks for the suggestion! It's particularly noticeable at larger resolutions.
-
Generation is faster for me with it on (RTX 3060 12 GB), but the difference is small.
-
Almost doubled my speed at 768x768, using NAI animefinal pruned on an RTX 3080 Ti.
-
A bit of technical explanation: in theory, the stronger your CPU (especially its single-threaded performance), the bigger the benefit you should see in SD speed from turning hardware scheduling OFF, because with it OFF your CPU has to be fast enough to keep up with your GPU in allocating it tasks.
-
Can confirm. Windows 10 21H2, i7-6700, 32 GB DDR4, 3060 Ti FE. Batch count: 4.
About a 10% improvement. Edit: attached the wrong screenshot.
-
I see about a 5% speed loss turning it off, but that is probably because my CPU is pretty weak/old (Intel i7-920 2.66 GHz). My graphics card, on the other hand, is an RTX 3060 12GB, so it's probably better for me to keep hardware GPU scheduling turned on so my GPU can do all the work.
-
I got about a 30-35% improvement with a Ryzen 7 5800X and an RTX 4090. Many thanks!
-
Brought me from ~17 it/s to ~22 it/s. Thanks!
-
Hi! I'm new to this stuff. I have a 1060 (3GB variant) and was wondering if Stable Diffusion can run on it. I saw that at minimum you can run it on a 4GB GPU like a 1050 Ti, so could I run it too after some optimizations?
-
I'm on a 1660 Super so it's slow, but it did give me a 19-second improvement over 3 minutes, so ~ +10%, pretty nice. However, using --no-half gave me a much better improvement (at the cost of no longer being able to render images bigger than 512x512). Still, for tweaking/experimenting it's great!
-
I used the systeminfo extension to benchmark, and there seemed to be a slight decrease with this setting turned off. I am running an RTX 3080 10GB and an i7-10700K, with xformers on and no --opt-channelslast (I also found that channelslast seemed to decrease it/s somewhat).
-
It is an improvement for me, from 10 s/it (yes, 10 seconds per iteration) to 4-9 s/it. I'm running an NVIDIA MX330 2GB and generating 512x1024 images.
-
Windows 11, i9-13900K + RTX 4090. Disabling this option yielded a gain of ~4 it/s (from ~31 to ~35 it/s).
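Since reports in this thread mix absolute it/s gains with percentages, here is a minimal sketch for putting them on the same footing. The function name `speedup_percent` is just an illustrative choice; the before/after pairs are the it/s figures quoted in the comments above.

```python
def speedup_percent(before_its: float, after_its: float) -> float:
    """Relative throughput change in percent; positive means faster."""
    if before_its <= 0:
        raise ValueError("before_its must be positive")
    return (after_its - before_its) / before_its * 100.0


# Before/after it/s pairs quoted in this thread (labels are for display only).
reports = {
    "4090 (22 -> 28 it/s)": (22.0, 28.0),
    "4090 (14.69 -> 18.98 it/s)": (14.69, 18.98),
    "13900K + 4090 (31 -> 35 it/s)": (31.0, 35.0),
}

for label, (before, after) in reports.items():
    print(f"{label}: {speedup_percent(before, after):+.1f}%")
```

Note that a fixed gain in it/s is a smaller percentage the faster the baseline already is, so the two kinds of report are not directly comparable without this step.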
-
I'm on Windows 11 with an RTX 2070 Max-Q. I did 2 gens (x768 on a 2.1 custom model), one with Hardware GPU Scheduling and one without. No noticeable performance increase: both gens took 19 s at about 2 it/s.
-
I have an AMD 3900X CPU, 64 GB RAM, and an ASUS RTX 3070, running Windows 11 Version 10.0.22631 Build 22631.
**GPU Scheduler Disabled**
**GPU Scheduler Enabled**
So disabling it definitely led to a drop in performance.
-
Hello, is this still a thing? I was googling for things I could tweak to speed up generation and came across this, but the post is quite old. I tested it twice, with 512x512 and 1080x1080 images, and saw a... 1%? increase in speed in both cases. That's so small it could be anything: I had to end a session that was several days old when I restarted the computer, so it could be just that. My hardware is a 12400F and an RTX 3060 12GB, and I'm on Win11. Also, I installed A1111 just a couple of days ago, using the automated-pls-do-everything-for-me installer for Windows idiots who don't know anything about computers (even after using them for damn near 3 decades, what).
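For a difference this small, one rough way to tell it apart from run-to-run noise is to repeat each setting several times and compare the gap between the averages with the spread of the runs. The sketch below is only a heuristic, not a proper statistical test; the helper names and the it/s readings are hypothetical, not measurements from this thread.

```python
from statistics import mean, stdev


def summarize(runs):
    """Return (mean, sample standard deviation) of repeated it/s measurements."""
    if len(runs) < 2:
        raise ValueError("need at least two runs per setting")
    return mean(runs), stdev(runs)


def looks_like_noise(runs_on, runs_off):
    """Heuristic: call the gap noise if it is no larger than the bigger run-to-run spread."""
    mean_on, sd_on = summarize(runs_on)
    mean_off, sd_off = summarize(runs_off)
    return abs(mean_off - mean_on) <= max(sd_on, sd_off)


# Hypothetical it/s readings, NOT taken from this thread.
hags_on = [6.10, 6.05, 6.18, 6.12]
hags_off = [6.15, 6.20, 6.09, 6.17]

print("within noise:", looks_like_noise(hags_on, hags_off))
```

Running each setting from a freshly restarted machine, as described above, also helps keep a stale session from skewing one side of the comparison.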
-
Hi, I have the same problem. I have a 12GB RTX 4070 OC, but I can't generate images because it tells me I have no memory. I'm not an expert in programming, so could you explain in simple terms how to fix this? Photos are also welcome.
-
I did a little testing. SD generation details: Results: Conclusion: Laptop:
-
Hello,
I found this a few days ago, and it seems no one (including the wiki) has written about it.
Disabling Hardware-accelerated GPU scheduling in Windows Settings improves SD performance by about 10-50%,
affecting both training and image generation (with xformers enabled in my setup).
I tested on an NVIDIA RTX 3090: image generation went from 11 it/s to 17-18 it/s,
and 512x512 hypernetwork training went from 3.1 it/s to 3.9-4.1 it/s.
Another topic confirms this also improves performance on a 3080 Ti:
#2977 (reply in thread)
*PS: Disabling this option requires restarting the PC. It may reduce gaming performance a bit, but I did not notice any difference when playing games.
Also, a technical explanation is needed for why the Hardware-accelerated GPU scheduling setting
affects SD performance, for further research. Thank you.
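For anyone benchmarking this, it helps to confirm which state the setting is actually in before and after the restart. Below is a minimal sketch, assuming the commonly reported registry value `HwSchMode` under `GraphicsDrivers` (2 = on, 1 = off). That key name and its meaning are an assumption, not something confirmed in this thread, and the function names are illustrative. The script only reads the value and changes nothing.

```python
import sys

# Assumption: commonly reported registry location and values, not confirmed by this thread.
KEY_PATH = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
VALUE_NAME = "HwSchMode"
MODES = {1: "disabled", 2: "enabled"}


def describe_mode(value):
    """Map a raw HwSchMode value to a label; anything else is 'unknown'."""
    return MODES.get(value, "unknown")


def read_hwsch_mode():
    """Return the raw HwSchMode value, or None if unavailable (non-Windows or missing key)."""
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard-library module

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
            return value
    except OSError:
        return None


if __name__ == "__main__":
    print("Hardware-accelerated GPU scheduling:", describe_mode(read_hwsch_mode()))
```

An "unknown" result just means the value could not be read, in which case the toggle in Windows Settings remains the thing to check.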