Is there anyone who can't generate images correctly? #122
Comments
@leejet It may just be my impression, but the CUDA backend seems to have synchronization issues as early as the CLIP model; it only happens sometimes.
build\bin\Release\sd -m models/kotosmix_v10-f16.gguf -p "beautiful anime girl, white hair, blue eyes, realistic, masterpiece, azur lane, 4k, high quality" -n "bad quality, ugly, face malformed, bad anatomy" --sampling-method dpm++2m --steps 20 -s 424354
With the CPU backend (and sometimes with the CUDA backend): the image is incorrect, apparently because an incorrect (incomplete) embedding is generated, though I'm not sure. The negative embedding is invalid. Investigating this synchronization issue is very challenging: it occurs sporadically and is hard to reproduce. I printed the CLIP output tensor and, after 10 repetitions, saw the embedding values change. |
On Google Colab (T4, CUDA), img2img mode without --vae-tiling always produces a solid-color image from the VAE.
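One way to pin down a sporadic change like this is to dump the CLIP output for the same prompt several times and compare every dump against the first one. Below is a minimal Python sketch, assuming the embeddings have already been exported as flat lists of floats. The function name and the tolerance parameter are hypothetical and not part of sd.cpp.

```python
def first_divergent_run(runs, tol=0.0):
    """Return the index of the first run that differs from run 0, or None.

    `runs` is a list of embeddings, each a flat list of floats (assumed
    export format). A run counts as divergent if its length differs or any
    value is more than `tol` away from the reference value.
    """
    reference = runs[0]
    for index, values in enumerate(runs[1:], start=1):
        if len(values) != len(reference):
            return index
        # "not (diff <= tol)" is also True for NaN, so NaN values are flagged.
        if any(not (abs(a - b) <= tol) for a, b in zip(values, reference)):
            return index
    return None


if __name__ == "__main__":
    # Toy data standing in for three dumps of the same embedding.
    dumps = [[0.1, 0.2, 0.3], [0.1, 0.2, 0.3], [0.1, 0.25, 0.3]]
    print(first_divergent_run(dumps))
```

With tol=0.0 the check is exact, which suits hunting for a race: any run that is reported as divergent for an identical prompt and seed is worth inspecting.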
|
@FSSRepo Please try colab
========= COMPUTE-SANITIZER
ggml_init_cublas: GGML_CUDA_FORCE_MMQ: no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
Device 0: Tesla T4, compute capability 7.5
[INFO] stable-diffusion.cpp:5386 - loading model from 'v1-5-pruned-emaonly.safetensors'
[INFO] model.cpp:638 - load v1-5-pruned-emaonly.safetensors using safetensors format
[INFO] stable-diffusion.cpp:5412 - Stable Diffusion 1.x
[INFO] stable-diffusion.cpp:5418 - Stable Diffusion weight type: f32
[INFO] stable-diffusion.cpp:5573 - total memory buffer size = 2731.37MB (clip 470.66MB, unet 2165.24MB, vae 95.47MB)
[INFO] stable-diffusion.cpp:5579 - loading model from 'v1-5-pruned-emaonly.safetensors' completed, taking 2.45s
[INFO] stable-diffusion.cpp:5593 - running in eps-prediction mode
[INFO] stable-diffusion.cpp:6486 - apply_loras completed, taking 0.00s
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [80384 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [79488 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [77952 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [75264 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [81408 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [79360 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [80768 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [80384 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [78976 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [78080 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [79104 hazards]
=========
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
========= and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [78080 hazards] |
@Cyberhan123 Could you send me the CLI commands to perform this test? Your link is not allowing me to access Colab. |
I modified the link and the command is as follows
|
@leejet To fix the softmax race condition in CUDA, comment out line 6499; this may also resolve the artifacts seen when using VAE tiling: while (nth < ncols_x && nth < CUDA_SOFT_MAX_BLOCK_SIZE) nth *= 2; // comment out this line |
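To see what that loop computes, here is a small Python sketch of the thread-count selection. It assumes nth starts at the warp size (32) and that CUDA_SOFT_MAX_BLOCK_SIZE is 1024; neither value is stated in this thread, so treat both as assumptions.

```python
WARP_SIZE = 32                   # assumed initial value of nth
CUDA_SOFT_MAX_BLOCK_SIZE = 1024  # assumed upper bound on threads per block


def soft_max_threads(ncols_x, grow=True):
    """Mimic the nth selection: double nth until it covers ncols_x or hits the cap.

    grow=False models the workaround of commenting the loop out, which leaves
    nth at its initial value.
    """
    nth = WARP_SIZE
    if grow:
        while nth < ncols_x and nth < CUDA_SOFT_MAX_BLOCK_SIZE:
            nth *= 2
    return nth


if __name__ == "__main__":
    for cols in (10, 77, 4096):
        print(cols, soft_max_threads(cols), soft_max_threads(cols, grow=False))
```

Under these assumptions, commenting out the loop means each row is always processed by a single warp, which would plausibly avoid the cross-warp hazards reported above, at the cost of less parallelism per row.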
@leejet I've been testing the SDXL rendering. I did find some issues:
However, when I use the same metadata in SD.app, I get this instead... SDXL does have two text encoders; I'm not sure whether this is handled in SD.cpp. (NOTE: as a test of deterministic image generation, I also ran SD.cpp with SD1.5.) Here is the SD1.5 example: And this was reproduced in SD.cpp using the same metadata... |
I'm getting the same horrible results while using SD-Turbo and SDXL-Turbo. |
|
@leejet I think the parameters @YAY-3M-TA3 Y set are wrong. He may have set CFG Scale to 7.0 |
For the SDXL base model, setting the CFG scale to 7 should be fine. In my example above, the CFG scale is also 7 (the default value). |
I'm also seeing this; created #187 |
Hey, I'm getting artifacts in my images, like green and blue dots everywhere. I think it's a VAE problem: SD1.5 works fine, but when I set the resolution to 512x1024 in SDXL I get no artifacts. It's weird, but I need a "no VAE" option for models that already have a VAE baked in. I think it's using both the baked-in VAE and the downloaded VAE, which is causing the artifacts. BTW, this works great on my Android phone; I can finally use SDXL on my phone. All the other web UIs I've tried crash on SDXL load or after the first generation. LoRAs work too, but not the 2 GB LoRA: I run out of RAM trying to use Midjourney mimic 1.2 |
I can't figure out the right settings; my images are not generated correctly. I'm trying the hypersdxl LoRA, but only the LCM LoRA works. I'm trying to use as few steps as possible. |
I am using the Vulkan backend on Arch Linux. The nvidia driver version is
Model: https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/blob/main/sd-v1-4.ckpt
Model: https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5/blob/main/v1-5-pruned-emaonly.safetensors
Model: https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/768-v-ema.safetensors
Model: https://huggingface.co/stabilityai/stable-diffusion-2-1/blob/main/v2-1_768-ema-pruned.safetensors |
@arenekosreal Keep in mind that each model has an image dimension it is optimized for and might not produce good images at lower or higher resolutions (e.g. you are using 512 for the 768 variant). |
@Green-Sky Oh, I had not noticed that |
I found that my model size is 4.9GB while it is 5.21GB on huggingface. I will try to re-download the model and check if things change. |
HF will give you a hash you can also compute locally and compare. |
Its sha256 matches what the pointer file shows. Huggingface reports size in GB instead of GiB, so my |
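The hash comparison and the GB/GiB mismatch discussed above can be checked locally with a few lines of Python. This is a minimal sketch using only the standard library; the helper names are hypothetical, and the model path in the usage example is a placeholder.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so multi-gigabyte models need not fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bytes_to_gb(num_bytes):
    """Decimal gigabytes (10**9 bytes)."""
    return num_bytes / 10**9


def bytes_to_gib(num_bytes):
    """Binary gibibytes (2**30 bytes), which many local tools display."""
    return num_bytes / 2**30


if __name__ == "__main__":
    import os
    import sys

    model_path = sys.argv[1]  # placeholder: path to the downloaded model file
    size = os.path.getsize(model_path)
    print(sha256_of_file(model_path))
    print(f"{bytes_to_gb(size):.2f} GB = {bytes_to_gib(size):.2f} GiB")
```

For example, a file listed as 5.21 GB is about 4.85 GiB, which is consistent with a local tool showing roughly 4.9 for an intact download.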
I think I found out what the problem is. If I force is_using_v_parameterization_for_sd2() to return |
It looks like the threshold should be
I'm confused because @arenekosreal was able to get "good" results with 768-v-ema.safetensors, but with the current implementation, I can't... |
This is an existing problem I have seen. Some have been solved and some are weird. If you encounter it, please be patient and click in and check the comments. If you encounter something similar, please leave a message below.
Related issues: