VRAM differences when selecting GPUs #15871
Closed
noisefloordev started this conversation in General
Replies: 1 comment
-
Oops. CUDA_VISIBLE_DEVICES wasn't set correctly (if it had been, I'd have noticed, since using CUDA_VISIBLE_DEVICES=2 together with --device-id=2 doesn't make sense and throws an error). It works after fixing that and removing --device-id=2. Watching memory use during processing with the original configuration, I see that a bit of VRAM was still being used on GPU 0, about 260 MB. I'm guessing some allocation wasn't being affected by --device-id, and having one chunk of data on a different GPU may have been causing problems later on.
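For anyone who hits the same error, here's a minimal sketch of why the two settings conflict, assuming the usual PyTorch backend: CUDA_VISIBLE_DEVICES renumbers whatever it exposes starting from 0, so with CUDA_VISIBLE_DEVICES=2 the only visible device is cuda:0, and asking for cuda:2 on top of that points at a device that no longer exists.

```python
import os
import torch

# Assumes CUDA_VISIBLE_DEVICES=2 was exported before this process started;
# it has no effect once CUDA has already been initialized.
print(os.environ.get("CUDA_VISIBLE_DEVICES"))  # "2"
print(torch.cuda.device_count())               # 1 -- only the third physical GPU is exposed
print(torch.cuda.get_device_name(0))           # reports that GPU (a 3090 here)

x = torch.zeros(1, device="cuda:0")            # lands on the physical third GPU
# torch.zeros(1, device="cuda:2")              # RuntimeError: invalid device ordinal
```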
-
I'm trying to figure out an oddity with selecting a GPU. I'm running on a server with three GPUs, and I want to run SD on the third GPU, since that'll make distributing an LLM across the first two and a half GPUs simpler.
Running with --device-id=2 (and CUDA_VISIBLE_DEVICES=2, though that didn't seem to matter), it does seem to load as expected: nvidia-smi shows all of the VRAM usage on the third GPU. But when I run a fairly VRAM-heavy job (SDXL, 768x1216, 2x hires fix), it runs out of VRAM, which it doesn't normally do.
I'm confused: it works fine on device 0. The GPUs are identical (24 GB 3090s) and nothing else is running on them. nvidia-smi shows 1 MB / 24 GB on each when nothing is loaded, so it shouldn't matter which GPU it runs on, and it doesn't seem to be competing with anything. After the initial load, with nothing generating, it shows 7.4 GB / 24 GB in use on cuda:2, exactly the same usage as on cuda:0 by default. But something behaves differently when I put it on the third GPU (or the second, unsurprisingly). On GPU 0 I can run hires fix from 768x1280 up to around 2.5x, higher than I'd ever normally use; I was just pushing it up to see whether I'm skirting the edge of available VRAM, and apparently I'm not.
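If it would help to see where allocations are actually landing from inside the process, something like this sketch might work (report_vram is just a hypothetical helper, and it only counts PyTorch's own allocations, not the few hundred MB of CUDA context per device that nvidia-smi includes):

```python
import torch

def report_vram(tag: str) -> None:
    """Hypothetical helper: print PyTorch-side memory use for every visible GPU.

    Only sees allocations made through PyTorch's caching allocator in this
    process; the CUDA context that nvidia-smi charges to each device won't show up.
    """
    for i in range(torch.cuda.device_count()):
        alloc = torch.cuda.memory_allocated(i) / 2**20
        reserved = torch.cuda.memory_reserved(i) / 2**20
        print(f"{tag}  cuda:{i}: {alloc:.0f} MB allocated, {reserved:.0f} MB reserved")

# e.g. call report_vram("after model load") and report_vram("after hires fix")
```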
Just thought I'd see if this rings any bells before I start digging in. Has anyone with a multi-GPU server run into this before? And if not, I'm open to troubleshooting suggestions. I'm just watching nvidia-smi for broad memory usage, but what matters is peak memory usage and what the major allocations are, and I'm not sure how to get that information...
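One thing I might try from the PyTorch side (a minimal sketch, assuming the usual torch backend): reset the peak counters, run one generation, then read them back.

```python
import torch

dev = torch.device("cuda:0")  # or whichever index the webui is actually using

torch.cuda.reset_peak_memory_stats(dev)

# ... run one generation here ...

peak_gb = torch.cuda.max_memory_allocated(dev) / 2**30
print(f"peak allocated during the run: {peak_gb:.2f} GB")

# Per-pool breakdown of what the caching allocator is holding:
print(torch.cuda.memory_summary(device=dev, abbreviated=True))
```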