support Inpaint models #511

Merged
merged 19 commits into from
Dec 28, 2024

stduhpf
Contributor

@stduhpf stduhpf commented Dec 4, 2024

Adds support for sd1.x inpaint, sd2.x inpaint, sdxl inpaint, and Flux Fill models, as well as "inpainting" (masked img2img) with normal diffusion models.

How to use

Examples:

Using Inpaint model:

sd.exe -M img2img --model ..\models\sd-v1-5-inpainting.safetensors -p "a lovely dog" --cfg-scale 16 --sampling-method euler -t 24 --color --steps 30 --vae-tiling --strength 1 -i '.\inpaint-base.png' --mask '.\inpaint-mask.png'

Input image | Mask | Result
[image] | [image] | [image: output]

Using normal model (not as consistent):

sd.exe -M img2img --model ..\models\unet\sd3.5_medium-q6_k.gguf --clip_g ..\models\clip\clip_g.q8_0.gguf --clip_l ..\models\clip\clip_l.q8_0.gguf --t5xxl ..\models\clip\t5xxl_q4_k.gguf -p 'a lovely dog sitting patiently on a park bench' --cfg-scale 4.5 --sampling-method euler --steps 30 -t 24 -W 512 -H 512 --seed 0 --color --vae-tiling --strength 1 -i '.\New folder\inpaint-base.png' --mask '.\New folder\inpaint-mask.png'

Input image | Mask | Result
[image] | [image] | [image: output]

Caveats

  • Implementing a DDIM scheduler would probably give better results, but results are already pretty good with the default one.
  • The mask must be the same size as the image; resizing doesn't work.
  • Since this is based on img2img, the original image is always added to the random noise distribution, even at strength 1. This can be useful, for example to guide the generation by sketching what you want on the image, but it can also lead the model to regenerate what you want to remove from the image. To avoid that, pre-fill the masked parts with 50% gray in the input image.
  • Parts of the image outside the inpainting area can be slightly modified, especially with heavily quantized models, which tend to add artefacts all over the image.
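
The pre-fill step from the caveats can be scripted before calling sd. Below is a minimal sketch, not code from this PR: the `Image` struct and `prefill_masked_gray` are hypothetical helpers, and it assumes that white (255) in the mask marks the area to regenerate. It also enforces the same-size requirement, since the mask is not resized.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper type, not part of stable-diffusion.cpp.
struct Image {
    int width;
    int height;
    int channels;               // 3 for RGB, 1 for a grayscale mask
    std::vector<uint8_t> data;  // row-major, interleaved channels
};

// Overwrite every pixel whose mask value is above `threshold` with 50% gray.
// Assumption: white (255) in the mask marks the area to regenerate.
void prefill_masked_gray(Image& img, const Image& mask, uint8_t threshold = 127) {
    // The mask must match the image size exactly; nothing is resized here.
    if (mask.channels != 1 || img.width != mask.width || img.height != mask.height) {
        throw std::invalid_argument("mask must be single-channel and match the image size");
    }
    const std::size_t pixels = static_cast<std::size_t>(img.width) * img.height;
    for (std::size_t p = 0; p < pixels; ++p) {
        if (mask.data[p] > threshold) {
            for (int c = 0; c < img.channels; ++c) {
                img.data[p * img.channels + c] = 128;  // roughly 50% gray
            }
        }
    }
}
```

Loading and saving the PNG files is left out; any image library that exposes raw 8-bit buffers should work with this layout.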

Fixes #105

@stduhpf stduhpf changed the title from "SD1.x Inpaint support" to "support Unet Inpaint models" Dec 6, 2024
@Amin456789

this is great, thank u so much, please add outpainting while u r at it, i think they are the same
with this and ltx video sd cpp will be the best sd out there

@stduhpf
Contributor Author

stduhpf commented Dec 6, 2024

@Amin456789 You can already set it up to outpaint; you just need to prepare the image and mask manually (I recommend pre-filling the parts to inpaint with 50% gray, since it's based on img2img):

sd.exe -M img2img --model ..\models\sd-v1-5-inpainting.safetensors -p "." --cfg-scale 0 --sampling-method euler -t 24 --color --steps 30 --vae-tiling --strength 1 -i '.\outpaint-base.png' --mask '.\outpaint-mask.png' -W 768

Input image | Mask | Result
[image: outpaint-base] | [image: outpaint-mask] | [image: output]
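
Preparing the outpainting inputs by hand could look roughly like the sketch below. It is an illustration under assumptions, not code from this PR: `OutpaintInputs` and `extend_right` are hypothetical names, the source image is kept at the left edge of a wider canvas, the new area is pre-filled with 50% gray, and white (255) in the mask is assumed to mark the area to generate.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical container for the two files passed to -i and --mask.
struct OutpaintInputs {
    int width;
    int height;
    std::vector<uint8_t> image;  // RGB, interleaved, row-major
    std::vector<uint8_t> mask;   // one byte per pixel, row-major
};

// Place an RGB image at the left edge of a canvas `new_width` pixels wide.
// New pixels are 50% gray in the image and white (regenerate) in the mask.
OutpaintInputs extend_right(const std::vector<uint8_t>& rgb, int width, int height, int new_width) {
    if (width <= 0 || height <= 0 || new_width < width ||
        rgb.size() != static_cast<std::size_t>(width) * height * 3) {
        throw std::invalid_argument("inconsistent image dimensions");
    }
    const std::size_t pixels = static_cast<std::size_t>(new_width) * height;
    OutpaintInputs out{new_width, height,
                       std::vector<uint8_t>(pixels * 3, 128),  // gray canvas
                       std::vector<uint8_t>(pixels, 255)};     // everything masked
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t src = static_cast<std::size_t>(y) * width + x;
            const std::size_t dst = static_cast<std::size_t>(y) * new_width + x;
            for (int c = 0; c < 3; ++c) {
                out.image[dst * 3 + c] = rgb[src * 3 + c];  // keep original pixel
            }
            out.mask[dst] = 0;  // black: leave this pixel alone
        }
    }
    return out;
}
```

The resulting image and mask have the same size, as the mask requirement demands, and `new_width` would match the value passed to `-W`.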

@Amin456789

greaat. thank u
@fszontagh please add this outpainting feature to ur gui too to make it easier than manually doing it if possible

@fszontagh
Contributor

> greaat. thank u @fszontagh please add this outpainting feature to ur gui too to make it easier than manually doing it if possible

If the PR is merged, I will add it. This is already on my to-do list 🥳

@stduhpf stduhpf changed the title from "support Unet Inpaint models" to "support Inpaint models" Dec 7, 2024
@stduhpf
Contributor Author

stduhpf commented Dec 7, 2024

Input image | Mask | Result
[image: flux] | [image: flux-mask] | [image: flux-fill]

Generated with a Q3_K diffusion model and --guidance set to 35 (very high guidance seems to be required to get good results, for some reason).

bool is_xl = false;
bool is_flux = false;

#define found_family (is_xl || is_flux)

found_family seems to affect the detection of sd3.

Contributor Author

Could you elaborate? In my testing, sd3 seems to be detected reliably...


nvm, with the recommended safetensors loading, it works with sd3. However, if text_encoders.clip_g.* is converted to cond_stage_model.1.* (gguf mode), the model will be judged as sdxl by the following processing, and sd3 will not be recognized.

Contributor Author

@thxCode what about now?


great

Comment on lines +1507 to +1512
if (is_xl) {
if (is_inpaint) {
return VERSION_SDXL_INPAINT;
}
return VERSION_SDXL;
}

sdxl takes priority over flux.

Contributor Author

Is this a problem? In theory it should be impossible for a model to be detected as SDXL and Flux at the same time, no?


ditto, this should only be encountered after merging all FLUX components into one gguf file following cond_stage_model.* / first_stage_model.* / model.diffusion_model.* format.
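
The ordering concern in this thread can be illustrated with a small sketch. Only the `is_xl` branch and the names `VERSION_SDXL` and `VERSION_SDXL_INPAINT` come from the quoted diff; the enum itself, the `_SKETCH` values, the Flux branch, and `pick_version` are hypothetical stand-ins for whatever the real detection code does.

```cpp
// Hypothetical enum for illustration; only the two SDXL names mirror the quoted diff.
enum VersionSketch {
    VERSION_SDXL,
    VERSION_SDXL_INPAINT,
    VERSION_FLUX_SKETCH,
    VERSION_FLUX_FILL_SKETCH,
    VERSION_UNKNOWN_SKETCH,
};

// Resolve the family flags in a fixed order. Because is_xl is checked first,
// a file that sets both flags is reported as SDXL.
VersionSketch pick_version(bool is_xl, bool is_flux, bool is_inpaint) {
    if (is_xl) {
        if (is_inpaint) {
            return VERSION_SDXL_INPAINT;
        }
        return VERSION_SDXL;
    }
    if (is_flux) {  // assumed branch, not shown in the quoted diff
        return is_inpaint ? VERSION_FLUX_FILL_SKETCH : VERSION_FLUX_SKETCH;
    }
    return VERSION_UNKNOWN_SKETCH;
}
```

With this ordering, a merged file whose tensor names trip both checks would resolve to the SDXL variant, which is the case the reviewer describes.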

Comment on lines 607 to 610

struct ggml_tensor* concat = is_inpaint ? ggml_new_tensor_4d(work_ctx, GGML_TYPE_F32, 8, 8, 5, 1) : NULL;
ggml_set_f32(timesteps, 0);


here seems to impact sd2

Contributor Author

What kind of impact?

@thxCode thxCode Dec 26, 2024

sorry, this might be a false positive from my testing (I remember that testing on a 4090 resulted in sd2 using v-prediction mode instead, but I can't reproduce it on my Mac M1 Max).

btw, it looks strange here. timesteps changed from 999 to 0 without any judgment? or do you mean setting concat to 0?

Contributor Author

> btw, it looks strange here. timesteps changed from 999 to 0 without any judgment? or do you mean setting concat to 0?

Oh right, that's exactly what I meant to do.

@leejet leejet merged commit 8f4ab9a into leejet:master Dec 28, 2024
9 checks passed
@leejet
Owner

leejet commented Dec 28, 2024

Thank you for your contribution.

stduhpf added a commit to stduhpf/stable-diffusion.cpp that referenced this pull request Dec 28, 2024
Successfully merging this pull request may close these issues.

Support Inpainting
5 participants