Weird artifacts with SD controlnet inpaint pipeline #9922
Unanswered
SantiagoJN
asked this question in Q&A
Replies: 1 comment 4 replies
-
Hi, using the inpainting pipeline with a non-inpaint model is not optimal, so it's always better to use an inpainting model, especially when you're further restricting the generation with a ControlNet. The inpainting pipeline is just a way of using a mask with a source image; inpainting models are fine-tuned for this specific task, so they complement each other.
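For example, something along these lines (a minimal sketch; the checkpoint and ControlNet identifiers and the placeholder file names are only examples, not taken from your setup):

import torch
from diffusers import ControlNetModel, StableDiffusionControlNetInpaintPipeline
from diffusers.utils import load_image

# Example depth ControlNet (use the normal-map one if that's what you need)
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16
)

# A base model fine-tuned for inpainting, instead of a plain text-to-image checkpoint
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",  # example inpainting checkpoint
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# Placeholder file names: your own source image, mask and conditioning image
source = load_image("source.png")
mask = load_image("mask.png")
depth = load_image("depth.png")

result = pipe(
    prompt="a pile of stones",          # example prompt
    image=source,                       # original (unmasked) source image
    mask_image=mask,                    # white = region to repaint
    control_image=depth,                # ControlNet conditioning image
    num_inference_steps=30,
    controlnet_conditioning_scale=0.9,
).images[0]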
-
Hi,
I'm using an IP-Adapter to perform image conditioning with Stable Diffusion 1.5 (using the RealisticVision 5.1 checkpoint) and the ControlNet weights from lllyasviel, more specifically the ones for depth and normal maps.
The thing is that, when using these ControlNets with the StableDiffusionControlNetPipeline, I manage to get decent results, but as soon as I switch to the StableDiffusionControlNetInpaintPipeline, all I get are weird artifacts. Below I include how I'm calling the pipeline and the results I'm talking about.
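(For context, my setup looks roughly like this. It's a simplified sketch that uses diffusers' built-in IP-Adapter loading rather than the IP-Adapter repo's wrapper, and the model identifiers are approximate:)

import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Depth ControlNet from lllyasviel (identifier approximate)
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16
)

# RealisticVision 5.1 as the SD 1.5 base model (identifier approximate)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V5.1_noVAE",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# SD 1.5 IP-Adapter weights; the IP-Adapter repo's wrapper (the ip_model used
# below) plays the same role
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models",
                     weight_name="ip-adapter_sd15.bin")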
ControlNet with depth
Using the conditioning images:
I get the following results:
By calling the pipeline with
images = ip_model.generate(pil_image=stones, image=depth_map, num_samples=4, num_inference_steps=30, guidance_scale=5, seed=42, controlnet_conditioning_scale=0.9)
where the generate function is the IP-Adapter's, pil_image=stones passes the stones image used for image conditioning, and depth_map is the strawberries' depth map.
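(With diffusers' native IP-Adapter support, I believe the equivalent call would look roughly like this, based on the simplified setup above; a sketch, not my exact code:)

generator = torch.Generator("cuda").manual_seed(42)
images = pipe(
    prompt="",                          # the image prompt carries the conditioning
    ip_adapter_image=stones,            # image prompt (the stones)
    image=depth_map,                    # ControlNet conditioning (strawberries' depth map)
    num_images_per_prompt=4,
    num_inference_steps=30,
    guidance_scale=5,
    controlnet_conditioning_scale=0.9,
    generator=generator,
).images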
ControlNet with normals
Using a similar setup, but conditioning with a normal map instead,
I get:
ControlNet inpaint with normals
Now we have more ingredients:
From left to right, the images show: stones (the image used for conditioning), the source image, masked_image (the source after applying the mask), normals, and mask. With this, I get the following results:
These pretty much ignore the normal map, even with a conditioning scale of 0.9. The code I used to obtain these results is the following:
images = ip_model.generate(pil_image=stones, num_samples=4, num_inference_steps=50, control_image=normals, guidance_scale=5.0, clip_skip=1, seed=42, image=masked_image, mask_image=mask, controlnet_conditioning_scale=0.9)
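(For reference, the rough diffusers-native equivalent, assuming an inpaint_pipe built as a StableDiffusionControlNetInpaintPipeline with the same base model, ControlNet and IP-Adapter; again a sketch, not my exact code:)

generator = torch.Generator("cuda").manual_seed(42)
images = inpaint_pipe(
    prompt="",
    ip_adapter_image=stones,            # image prompt
    image=masked_image,                 # source image passed to the inpaint pipeline
    mask_image=mask,                    # region to repaint
    control_image=normals,              # ControlNet conditioning (normal map)
    num_images_per_prompt=4,
    num_inference_steps=50,
    guidance_scale=5.0,
    clip_skip=1,
    controlnet_conditioning_scale=0.9,
    generator=generator,
).images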
ControlNet inpaint with depth
Similarly, I use the following setup to condition the diffusion for depth, using an inpainting pipeline:
Getting the results:
Using the code
images = ip_model.generate(pil_image=stones, num_samples=4, num_inference_steps=50, control_image=depth, seed=42, image=masked_image, mask_image=mask, controlnet_conditioning_scale=0.9)
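(In case it matters, a masked image like the one shown above can be produced from the source and the mask roughly like this; the file names are placeholders:)

import numpy as np
from PIL import Image

source = Image.open("source.png").convert("RGB")   # placeholder paths
mask = Image.open("mask.png").convert("L")          # white = region to repaint

# Blank out the masked region of the source to get the "masked image"
source_np = np.array(source)
mask_np = np.array(mask) > 127
source_np[mask_np] = 0
masked_image = Image.fromarray(source_np)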
As far as I've seen in the documentation and other discussions, using a checkpoint that isn't specialized in inpainting as the base model shouldn't be a problem, since the pipeline I'm using already handles that part.
I'm pretty new to this, so I wouldn't be surprised if I'm making a very basic mistake; any suggestion/correction would be greatly appreciated.
Thanks!