Optimum SDXL Usage
ClashSAN edited this page Sep 20, 2023 · 6 revisions
Here's a quick list of launch flags to tune for your setup:
- Nvidia (12 GB+): `--xformers`
- Nvidia (8 GB): `--medvram-sdxl --xformers`
- Nvidia (4 GB): `--lowvram --xformers`
- AMD (4 GB): `--lowvram --opt-sub-quad-attention`, plus enable TAESD in settings
Both ROCm and DirectML will generate at least 1024x1024 pictures at fp16. At full precision, however, the model fails to load into 4 GB of VRAM. If your card needs `--no-half`, try enabling `--upcast-sampling` instead.
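The flags above go into the web UI's launch script rather than on the command line each time. A minimal sketch, assuming the standard `webui-user.sh` used by the Linux install (on Windows the equivalent is `set COMMANDLINE_ARGS=...` in `webui-user.bat`); the exact flag combination shown is the 8 GB Nvidia example from the list:

```shell
#!/bin/bash
# webui-user.sh -- example launch options for an 8 GB Nvidia card.
# webui.sh reads this file and passes COMMANDLINE_ARGS to launch.py.
export COMMANDLINE_ARGS="--medvram-sdxl --xformers"
```

Swap in the flag set matching your card from the list above.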
- (Windows) Downgrade Nvidia drivers to 531 or lower. Newer drivers cause extreme slowdowns on Windows when generating large images that approach your card's maximum VRAM. This important issue is discussed here and in #11063. Symptoms:
  - Shared GPU memory usage fills up in Task Manager
  - Generations that usually take 1-2 minutes take 7+ minutes
  - Low-VRAM cards generate very slowly
- Add a pagefile to prevent failure loading weights due to low RAM.
- (Linux) Install `tcmalloc`, greatly reducing RAM usage: `sudo apt install --no-install-recommends google-perftools` (#10117).
- Use an SSD for faster load times, especially if a pagefile is required.
- Use sdxl-vae-fp16-fix, a VAE that does not need to run in fp32, for increased speed and lower VRAM usage.
- Use TAESD, a VAE that uses drastically less VRAM at the cost of some quality.
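The Linux `tcmalloc` tip above can be sketched as follows. This is a sketch for Debian/Ubuntu systems; the `ldconfig` check is an assumption about how to confirm the library is visible, and the note about `webui.sh` preloading it reflects that script's startup behavior:

```shell
# Install tcmalloc (shipped in the google-perftools package);
# the webui.sh launch script preloads it at startup when present.
sudo apt install --no-install-recommends google-perftools

# Confirm the dynamic linker can see the library:
ldconfig -p | grep -i tcmalloc
```

If the `grep` prints nothing, the library is not installed where the linker looks, and the web UI will fall back to the default allocator.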
This is the Stable Diffusion web UI wiki.