📛 Don't be so excited about SDXL, your 8-11 GB VRAM GPU will have a hard time! #11713
27 comments · 37 replies
-
What is the sweet spot? 12 GB? 16? 24? 32? 48?
-
I tried it with ComfyUI; it takes about 30-60 seconds to generate an image, whereas SD 2.1 takes around 20s (on an RTX 2070 Max-Q, 8 GB VRAM). Not a dealbreaker.
-
I tried it, and it's fast: about ten seconds with my RTX 4090.
-
While it's not super fast, on SD Next it takes around 2 minutes on my GTX 1080 Ti for a 1024x1024 image, which is about the same time it takes to render an image with hires fix on SD 1.5.
-
Memory won't be a huge issue with tiled VAE.
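For anyone wondering what "tiled VAE" means in practice, here is a minimal sketch using the diffusers library (rather than the webui extension most people use); the model ID, prompt, and step count are placeholders, not details from this thread:

```python
# Minimal sketch of tiled VAE decoding with diffusers; model ID, prompt,
# and step count are illustrative.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

# Decode the 1024x1024 latent in tiles instead of one big pass, which is
# where the VRAM spike at the end of generation usually comes from.
pipe.enable_vae_tiling()

image = pipe("a lighthouse at dusk", num_inference_steps=30).images[0]
image.save("out.png")
```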
-
My 10 GB VRAM GPU runs it just fine, no issues, and it goes pretty fast too.
-
I tried SDXL on a 3060 12GB with no luck; might have to wait till the model is optimised.
-
Overall it should be within the same ballpark as SD 1.5, as long as you have enough VRAM.
-
With SDXL 0.9 in ComfyUI (I would prefer to use A1111), I'm running an RTX 2060 6 GB VRAM laptop, and it takes about 6-8 minutes for a 1080x1080 image with 20 base steps & 15 refiner steps. Edit: after the first run I get a 1080x1080 image (including the refining) in "Prompt executed in 240.34 seconds" (4 minutes).
-
SDXL and the 4060 Ti 16GB released on, like, the same day. Really makes you think.
-
I'm running it on Colab, T4/High RAM, using this extension https://github.com/lifeisboringsoprogramming/sd-webui-xldemo-txt2img and it takes about 1 second/iteration. (High RAM is necessary because the extension has massive RAM leaks, but it's more than fast enough for my needs.) Colab informs me I have 15GB VRAM; SDXL doesn't go above 9GB, same as 1.5 models. Don't see it running on my M2 8GB Mac Mini though… Can't wait to use ControlNet with it.
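If you want to verify a number like "doesn't go above 9GB" yourself, PyTorch's allocator stats give a quick peak-VRAM readout. This sketch assumes a diffusers pipeline named `pipe` is already loaded, as in the earlier example:

```python
# Rough way to measure peak VRAM for one generation (assumes `pipe` exists
# and the prompt is just a placeholder).
import torch

torch.cuda.reset_peak_memory_stats()
_ = pipe("a watercolor fox", num_inference_steps=30).images[0]
print(f"peak VRAM allocated: {torch.cuda.max_memory_allocated() / 1024**3:.1f} GB")
```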
-
I think what delays the process is the switch from the base model to the refiner. I don't know why the refiner model is so heavy.
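The switch is heavy because the refiner is a second multi-gigabyte model that has to sit alongside (or be swapped in for) the base. For reference, a hedged sketch of the diffusers-style handoff, which passes latents from base to refiner instead of decoding in between; the model IDs are the official SDXL 1.0 repos, while the prompt and the 80/20 step split are assumptions:

```python
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16",
).to("cuda")

# Share the second text encoder and the VAE so the refiner doesn't duplicate them.
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    torch_dtype=torch.float16, variant="fp16",
    text_encoder_2=base.text_encoder_2,
    vae=base.vae,
).to("cuda")

prompt = "portrait of an astronaut, studio lighting"  # placeholder prompt

# Base handles the first ~80% of denoising and hands raw latents to the refiner.
latents = base(prompt, num_inference_steps=30, denoising_end=0.8,
               output_type="latent").images
image = refiner(prompt, image=latents, num_inference_steps=30,
                denoising_start=0.8).images[0]
image.save("refined.png")
```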
-
Wow, I thought you'd need a super cluster to run it.
-
Well, I have a 3070 Ti with 8 GB VRAM. It has worked well so far with all the other models. When I start SDXL it begins pretty well and even fast, but right before the image is finished... CUDA out of memory :-( I think 8 GB should work; not sure if there is a good trick to free the VRAM before starting A1111. Maybe I should try ComfyUI.
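A hedged sketch of the usual diffusers-side tricks for ~8 GB cards: the OOM right before the image finishes is often the VAE decode, so offloading plus tiled decoding (mentioned earlier in the thread) is what this targets. The model ID and prompt are placeholders, and `enable_model_cpu_offload()` needs the accelerate package installed:

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16",
)

# Keep only the currently active submodule (text encoders / UNet / VAE) on the GPU;
# note: don't call .to("cuda") when using offload.
pipe.enable_model_cpu_offload()
# Decode the final latent in tiles so the last step doesn't blow past 8 GB.
pipe.enable_vae_tiling()

image = pipe("a castle on a cliff at golden hour", num_inference_steps=25).images[0]
image.save("castle.png")
```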
-
Idk, but it took just 10-15 seconds to generate a 1024x1024 image for me. RTX 3060 Ti 8GB.
-
I have an RTX 4070 Laptop GPU in a top-of-the-line $4,000 gaming laptop, and SDXL is failing because it's running out of VRAM (I only have 8 GB of VRAM, apparently). I don't mind waiting a while for images to generate, but the memory requirements make SDXL unusable, for me at least. If anyone has suggestions, I'd appreciate them.
-
On my 3080 I have found that --medvram takes the SDXL times down from 8 minutes to 4 minutes. So with the flag SDXL is twice as fast, but it still takes about 10x longer than SD 1.5.
-
I use a Quadro P4000 8GB and I don't have any issues generating images with SDXL. When I upscale the images I'll go into 18 GB of VRAM territory, but since webui 1.5.0 and the latest gaming drivers from NVIDIA I can use all of my shared memory without crashing, so it's not that big of a deal; the upscale only takes 2-3 minutes even when bleeding over into system memory. In the initial txt2img generation I average about 6.7 GB for a 1024x1024 image, 50 Euler a steps, with --xformers --no-half-vae --medvram. ~30 seconds to 1m 30s per image.
-
I use a GTX 1060 6GB with 64 GB DDR4 RAM. Besides installing Manager and some custom nodes, like the ones from this video, I ran ComfyUI, loaded the SDXL 1.0 base and refiner models with max 40 steps each, and it worked: about 3-4 minutes/image, with the GPU reaching 85 °C.
-
It takes between 2 and 15 minutes on my RTX 2060 6GB. Not sure why the time is so variable; the first run is usually the fastest. My 16 GB of RAM is almost all used as well, and my SSD gets hit hard periodically. At times I basically can't do anything else while it's running.
-
2060 Super 8GB, [email protected], 32GB@2866, 2TB NVMe, 20TB rotating storage: it takes me 5-30 times longer per step than 1.5 models. While models are loading and being transferred, it'll spike my CPU and RAM usage and start hitting the swap file. That only lasts about 30-60 seconds for the large models. It then runs at about 10-15% CPU usage, roughly 50-70% RAM usage (allocated by Python), and all of the GPU is used. And that's with the Automatic1111 GUI running, previews on, Reddit, a Google search, an article, and a YouTube video across 2 Firefox windows on 2 screens; plus Discord, Messenger, Stardock DeskScapes, and just about everything else running with full hardware acceleration. Occasionally my 1080p YouTube video stutters... MSI Afterburner reports 2100 MHz on the GPU, 7175 MHz on VRAM, 1043 mV, 50 degrees C, 58% fan speed. Contrary to the title of the thread, my 8GB card works fine for SDXL. I'm probably not going to be training any models without some hardware upgrades, though.
-
I am researching a laptop to buy that I intend to use Stable Diffusion on, and it brought me to this forum. You guys are clearly knowledgeable about SD, and most of the things you are saying I don't even understand. I had been looking at a laptop with an RTX 4070 with 8 GB VRAM; it seems that isn't quite ideal, but people are getting it to work. Reading this forum brought up another question for me: for what purpose did you learn all this about SD? Do you do it just for fun, to create cool images, or do you use these skills for business/money-earning endeavors? Would you have suggestions on ways to use SD to make income? Thanks in advance.
-
I use an RTX 4080 and it doesn't even generate anything for me.
-
I'm using an RTX 3060 12 GB on Automatic1111 1.6.1, and I find SDXL surprisingly fast, even though it's a little slower than 1.5. Still fast enough to be usable. The only reason I'm sticking with 1.5 is that I like the results better 😊
-
This didn't age well... lol. A 3050 Ti 4GB runs SDXL in Comfy @ 896x1152 in about 30 seconds, or 5 min if I upscale 2x with CR or Iterative Upscalers.
-
Meh, I've got a 6 GB 1660 SUPER and can render out an image in one minute, with multiple LoRAs as well :) Who needs to be spitting out images in seconds? Life's too long to be impatient.
-
You will need almost double or even triple the time to generate an image that takes a few seconds in 1.5; SDXL is designed to run well on big, beefy GPUs.
If you have an 8-12 GB VRAM GPU, or even a Pascal one like the 1080 Ti, you will be waiting forever for the image to finish generating, and that's not even counting the refiner, which is necessary to avoid getting shitty faces.
So I would advise you to stick with 1.5 until optimizations come out (if that is possible) and generation times get close to what you get with 1.5.