SDXL Fooocus Inpaint #9870
Replies: 55 comments
-
They also seem to use fooocus_inpaint_head.pth. I'm not quite sure what it does; I read the code and it may be an additional patch for the UNet. The inpaint_v26.fooocus.patch is more similar to a lora: the first 50% of the steps execute base_model + lora, and the last 50% execute base_model alone.
-
Actually it seems more like a controlnet, something like this one: https://huggingface.co/destitech/controlnet-inpaint-dreamer-sdxl. They also use a custom sampler for the inpainting, but I agree, it would be nice to be able to use those in diffusers. You can read about it here: lllyasviel/Fooocus#414
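For reference, something in that spirit can already be tried in diffusers by loading that checkpoint as a regular ControlNet. A minimal sketch, assuming the checkpoint follows the standard SDXL ControlNet format; the exact conditioning convention (masked region blanked out in the control image) is an assumption, so check the model card:

```python
import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
from diffusers.utils import load_image

# Assumption: the checkpoint loads like any SDXL ControlNet.
controlnet = ControlNetModel.from_pretrained(
    "destitech/controlnet-inpaint-dreamer-sdxl", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# Control image: the original picture with the region to repaint painted over
# (the masking convention is an assumption, not confirmed here).
control_image = load_image("masked_input.png").resize((1024, 1024))

result = pipe(
    prompt="a cozy armchair in a sunlit room",
    image=control_image,
    controlnet_conditioning_scale=0.9,
    num_inference_steps=30,
).images[0]
result.save("inpainted.png")
```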
-
I was reading the code and they download the model here: https://github.com/lllyasviel/Fooocus/blob/dc5b5238c83c63b4d7814ba210da074ddc341213/modules/config.py#L398-L399 That function is called here: https://github.com/lllyasviel/Fooocus/blob/dc5b5238c83c63b4d7814ba210da074ddc341213/modules/async_worker.py#L301 After the model is loaded, you can see in the following tabs that they apply the head on top of the result of applying the lora.
-
I have read the comparison between Fooocus and ComfyUI lora loading. I think they are basically the same; this can also be confirmed from the code provided by @WaterKnight1998. Fooocus just defines different key names to ensure that only Fooocus can load the patch correctly.
-
Yup, that's the problem I saw. I had a difficult time trying to load it in diffusers; I didn't manage to map the layer keys into the format diffusers expects :(
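For anyone who wants to poke at the same problem, a quick sketch to inspect the key names, assuming the patch file is stored in safetensors format (file name taken from the HF repo):

```python
from safetensors.torch import load_file

# Dump a few keys from the Fooocus patch to compare against the
# "lora_unet_..." style names that diffusers' LoRA loader knows how to remap.
state_dict = load_file("inpaint_v26.fooocus.patch")
for key, tensor in list(state_dict.items())[:10]:
    print(key, tuple(tensor.shape), tensor.dtype)
```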
-
https://github.com/lllyasviel/Fooocus/blob/main/modules/inpaint_worker.py#L187 Another thing worth considering is how to implement this patch for the inpaint head model.
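From my reading of that file, the head itself is tiny. A sketch of what it seems to be; the (320, 5, 3, 3) weight shape, the replicate padding, and the state-dict layout are my reading of the Fooocus code, so treat them as assumptions:

```python
import torch
import torch.nn.functional as F

class InpaintHead(torch.nn.Module):
    """A single learned conv: 5 input channels (4 latent + 1 mask signal)
    mapped to 320 channels, added to the UNet's first feature map."""
    def __init__(self):
        super().__init__()
        self.head = torch.nn.Parameter(torch.empty(320, 5, 3, 3))

    def forward(self, x):
        x = F.pad(x, (1, 1, 1, 1), "replicate")
        return F.conv2d(x, weight=self.head)

head = InpaintHead()
head.load_state_dict(torch.load("fooocus_inpaint_head.pth", map_location="cpu"))
```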
-
OK, both code paths are the same. Is it possible to load ComfyUI weights in diffusers?
-
But the code is just updating the first conv, no?
-
You are right, but we also need to use it in diffusers as the input to start with.
-
Maybe consider loading it in ComfyUI, saving the result as a full checkpoint, and then using that in diffusers?
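If the merged checkpoint is saved from ComfyUI as a single safetensors file, diffusers should be able to pick it up directly. A sketch; the file name is a placeholder:

```python
import torch
from diffusers import StableDiffusionXLInpaintPipeline

# "merged_inpaint_sdxl.safetensors" = checkpoint exported from ComfyUI
# with the Fooocus patch already baked in (placeholder name).
pipe = StableDiffusionXLInpaintPipeline.from_single_file(
    "merged_inpaint_sdxl.safetensors", torch_dtype=torch.float16
).to("cuda")
```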
-
But as I saw in Fooocus, the base model is still used on its own in the second stage, so the most elegant way is to be able to load and unload the patch freely.
-
What do you mean by this?
-
For example, in Fooocus inpainting, assuming 30 sampling steps are performed, xl_base_model + inpainting_model is used for the first 15 steps, then it switches to xl_base_model alone for the last 15 steps.
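A rough analogue of that schedule in diffusers would be the base/refiner handoff via denoising_end / denoising_start. A minimal sketch, assuming a hypothetical patched_pipe whose UNet carries the Fooocus patch; diffusers has no official loader for it, so that part is only a placeholder here:

```python
import torch
from diffusers import StableDiffusionXLInpaintPipeline
from diffusers.utils import load_image

base_pipe = StableDiffusionXLInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
patched_pipe = base_pipe  # placeholder: imagine the inpaint patch applied to this UNet

init_image = load_image("source.png").resize((1024, 1024))
mask = load_image("mask.png").resize((1024, 1024))
prompt = "a wooden bench in a park"

# First 15 of 30 steps with the patched model, handing the latents off...
latents = patched_pipe(
    prompt=prompt, image=init_image, mask_image=mask,
    num_inference_steps=30, denoising_end=0.5, output_type="latent",
).images

# ...then the plain base model finishes the remaining 15 steps.
result = base_pipe(
    prompt=prompt, image=latents, mask_image=mask,
    num_inference_steps=30, denoising_start=0.5,
).images[0]
```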
-
Yeah, I saw it afterwards; they switched to a custom model for inpainting. How good is the inpainting? Can any of you post an example? If it's really good maybe I can try, or even better, someone from the diffusers team, but they'll probably need solid proof to work on it.
-
before: [image]
-
Nice, I like the challenge; let me get back to you soon, since I still haven't done any outpainting with diffusers and I don't think there's a pipeline or workflow for that yet. I plan to do a guide/example/tutorial for inpainting and outpainting soon. I'll work on an outpainting solution so I can tackle this first, but IMO it's the same problem: you just need to solve the math for expanding the "canvas" and probably need to fill it with something first, not just noise.
For the woman, this is the speed I get with a 3090. In this comment, exx8/differential-diffusion#17 (comment), the author says it is just a 0.25% penalty. But I'll also do it with normal inpainting because those results are also good; I like to use diff-diff, but normal inpainting is not bad. The trick to it is the image area we use as context and how we merge back the inpainted part.
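The canvas math is simple enough; a minimal sketch with PIL of what I mean by expanding it and building the matching mask:

```python
from PIL import Image

def expand_canvas(image: Image.Image, pad: int = 256, fill=(127, 127, 127)):
    """Paste the original on a larger canvas; white in the mask = generate."""
    w, h = image.size
    canvas = Image.new("RGB", (w + 2 * pad, h + 2 * pad), fill)
    canvas.paste(image, (pad, pad))
    mask = Image.new("L", canvas.size, 255)            # 255 = outpaint here
    mask.paste(Image.new("L", (w, h), 0), (pad, pad))  # 0 = keep original
    return canvas, mask
```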
-
I agree with you, the original image is crucial to the generation process; diffusion models are trained that way. So for the outpainting here, I would use lama first to pre-fill the outpaint area.
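LaMa needs its own checkpoint and runner, but as a lightweight stand-in (not what Fooocus uses), OpenCV's Telea inpainting shows the idea of seeding the new area with plausible content instead of flat noise:

```python
import cv2

canvas = cv2.imread("expanded_canvas.png")                    # enlarged image
mask = cv2.imread("outpaint_mask.png", cv2.IMREAD_GRAYSCALE)  # 255 = fill
prefilled = cv2.inpaint(canvas, mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)
cv2.imwrite("prefilled_canvas.png", prefilled)
```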
-
I tested it in ComfyUI with 2 methods, 24 steps for both.
-
I always thought that the model patching Fooocus does is just converting a regular model into an inpainting one; you can do the same by merging the difference between the inpaint model trained by the diffusers team and the base model, which is what most people did with SD 1.5. The first "inpainting" in Fooocus was a controlnet too, so I don't know if there's something else in the patch, or whether he trained it from scratch or used the diffusers model, so I'd rather treat it as an "unknown" patch. Thank you for doing more tests, now I have more to compare my results with. This time I'm putting in more effort to do the best I can, instead of just replicating fooocus or comfyui.
Edit: the VRAM and RAM can be managed. I remember that fooocus has to unload and load the model, so it probably clones the base model (taking more RAM); also I think comfyui manages memory better than fooocus, since comfyui can run on a potato PC, so it should unload the model it's not using. In diffusers you can practically do whatever you want if you have the knowledge.
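For reference, the add-difference merge mentioned above is just per-key arithmetic. A sketch with placeholder file names; shape-mismatched keys (like an inpainting model's 9-channel conv_in) need special handling that is skipped here:

```python
from safetensors.torch import load_file, save_file

base = load_file("custom_base.safetensors")              # model to convert
sd_base = load_file("official_base.safetensors")         # e.g. SD 1.5 base
sd_inpaint = load_file("official_inpaint.safetensors")   # official inpaint model

merged = {}
for key, w in base.items():
    if key in sd_inpaint and key in sd_base and sd_inpaint[key].shape == w.shape:
        # custom_inpaint = custom_base + (official_inpaint - official_base)
        merged[key] = w + (sd_inpaint[key] - sd_base[key])
    else:
        merged[key] = w  # mismatched shapes (conv_in etc.) not handled here

save_file(merged, "custom_inpaint.safetensors")
```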
-
Just to have a baseline, I tested the wolf one with just the controlnet inpaint. I used my app for this, but it can be done with just code. I don't think it's that much worse, but if I want to make it better, I can use the prompt enhancer and fix the composition with a t2i adapter (I fixed a bit of the tail by painting in the canny preprocessed image). Not bad for a quick inference; I'm going to do this also with just diffusers code.
-
Hi @Laidawang, I just posted a guide in the discussions on outpainting; I did a middle step without changing the prompt so you can compare it to the Fooocus result. I'll use your other images with the other methods I know because they are better suited for them. Let me know if you still think Fooocus is better, but IMO the results are of the same quality or better.
-
Thanks for sharing. I found that the Fooocus inpaint lora weights contain (uint8, fp16, fp16) data; can anyone explain the uint8 weights here?
-
@viperyl if I remember correctly, they quantized the main matrices to uint8 to take less space, and then use the min/max stored in fp16 to scale them back. IMO a very good idea with negligible loss of information.
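If that's right, loading it back is a one-liner per tensor. A sketch of the min/max scheme as I understand it; the exact storage layout in the patch is an assumption:

```python
import torch

def dequantize(w_uint8: torch.Tensor, w_min: torch.Tensor, w_max: torch.Tensor) -> torch.Tensor:
    # Map [0, 255] back onto [w_min, w_max]; w_min/w_max are the fp16
    # values assumed to be stored alongside each uint8 weight.
    return w_uint8.to(torch.float16) / 255.0 * (w_max - w_min) + w_min
```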
-
Yes, I debugged it and found the uint8 quant, which is what confused me. The uint8 checkpoint needs 1.4 GB of disk space, while the fp16 version would only need 2.5 GB. Considering that quantization potentially damages the result, quantizing here just to save 1.1 GB of disk space doesn't look like a good idea.
-
Has anyone gotten it to work with the Hugging Face pipeline?
-
@CristianCuadrado I did, but then switched to ControlNet Union; it produces better results compared to the Fooocus patch, so there's no reason to use it.
-
Wow, it looks amazing! Thank you!
-
I did a guide that shows a way to use it, and two Spaces that showcase the power it has: Diffusers Image Fill and Diffusers Fast Inpaint. There's also another Space based on it that was even more popular, Diffusers Image Outpaint, which also has a repo. And in the official repo, the author posted the code on how to use it.
-
Is your feature request related to a problem? Please describe.
I have seen that the diffusers StableDiffusionXLInpaintPipeline generates worse results than the SD 1.5 pipeline.
Describe the solution you'd like.
Include the Fooocus inpaint patch; it could be exposed through a new loader.
The weights are available right now on the Hub:
https://huggingface.co/lllyasviel/fooocus_inpaint