feature: some hints #2

kalle07 · 2024-09-03T08:38:24Z

first thx... it works ;)

If I only check the "Return foreground" box (or don't check anything), the mask image is always saved. It would be useful to save only the foreground image.

If I have a file named "image(123).png", the file saved after background removal is "image123.png". I know brackets can be troublesome, but all local image generators (a1111, webui, fooocus) save files with brackets.

Is it possible to process multiple images at once (like multicore on a CPU), or is that not in your code?

Do you have any idea what the differences between the checkpoints are? OK, "portrait" I know ...

dimitribarbot · 2024-09-03T13:22:26Z

Hi @kalle07,

> If I only check the "Return foreground" box (or don't check anything), the mask image is always saved. It would be useful to save only the foreground image.

I've pushed a modification to handle this use case. After updating the extension (go to your "Extensions" tab, click "Check for updates" and then "Apply and restart UI"), you should now see a "Return mask" option.

Keep "Return mask" unchecked and the mask should no longer be returned.

> If I have a file named "image(123).png", the file saved after background removal is "image123.png". I know brackets can be troublesome, but all local image generators (a1111, webui, fooocus) save files with brackets.

Unfortunately, the output file name is not handled by this extension but by the "Extras" tab of Automatic1111's SD WebUI. If you try another extra, for instance "Upscale", you will get the same result. You should open an issue in their repository instead.
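
In the meantime, the original names can be restored on the user side after processing. Here is a minimal sketch, assuming the only change is that parentheses are dropped (as in the "image(123).png" to "image123.png" example above) and that the file extension is unchanged. `strip_brackets` and `restore_names` are hypothetical helpers written for this illustration, not part of SD WebUI or this extension:

```python
from pathlib import Path


def strip_brackets(name: str) -> str:
    """Mimic the observed renaming: drop '(' and ')' from a file name (assumption)."""
    return name.replace("(", "").replace(")", "")


def restore_names(input_dir: Path, output_dir: Path) -> list[tuple[str, str]]:
    """Rename files in output_dir back to the bracketed names found in input_dir.

    Returns the (old_name, new_name) pairs that were actually renamed.
    """
    # Map each stripped name to its original name; ignore names without brackets.
    originals = {}
    for p in input_dir.iterdir():
        if p.is_file() and strip_brackets(p.name) != p.name:
            originals[strip_brackets(p.name)] = p.name

    renamed = []
    for out in sorted(output_dir.iterdir()):
        target = originals.get(out.name)
        # Never overwrite a file that already exists under the target name.
        if target and not (output_dir / target).exists():
            out.rename(output_dir / target)
            renamed.append((out.name, target))
    return renamed
```

Calling `restore_names(Path("inputs"), Path("outputs"))` (both directory names are placeholders) would rename `outputs/image123.png` to `outputs/image(123).png` if `inputs/image(123).png` exists.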

> Is it possible to process multiple images at once (like multicore on a CPU), or is that not in your code?

The original author of BiRefNet handles multiple images sequentially. You can ask for a new feature in the BiRefNet repository. If they implement it then it will probably be available in SD WebUI as I often keep this repository in sync with theirs.
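
If someone wants to experiment outside the extension, independent images can be dispatched to a worker pool. This is a rough sketch using only the Python standard library. `remove_background` is a hypothetical placeholder for whatever per-image call you use, not a BiRefNet or SD WebUI function, and whether this gives any speed-up depends on the model and hardware (on a single GPU, threads may not help at all):

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def process_all(
    paths: Iterable[str],
    remove_background: Callable[[str], str],
    max_workers: int = 4,
) -> list[str]:
    """Apply remove_background to every path, several at a time.

    Results come back in the same order as the input paths.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(remove_background, paths))


if __name__ == "__main__":
    # Placeholder worker: only derives an output name, does no image processing.
    def fake_remove_background(path: str) -> str:
        return path.replace(".png", "_foreground.png")

    print(process_all(["a.png", "b.png"], fake_remove_background))
```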

> Do you have any idea what the differences between the checkpoints are? OK, "portrait" I know ...

For these specific details, I think you would get more information by asking the original author of BiRefNet directly.

UPDATE: For this last question, I've updated the README with basic information for each model:

The available models are:

- General: A pre-trained model for general use cases.
- General-Lite: A lightweight pre-trained model for general use cases.
- Portrait: A pre-trained model for human portraits.
- DIS: A pre-trained model for dichotomous image segmentation (DIS).
- HRSOD: A pre-trained model for high-resolution salient object detection (HRSOD).
- COD: A pre-trained model for concealed object detection (COD).
- DIS-TR_TEs: A model pre-trained on a massive dataset.

kalle07 · 2024-10-11T18:38:19Z

Is it possible to support the models with a prompt?

dimitribarbot · 2024-10-13T08:22:21Z

Yes, I guess this should be possible using the procedure used in the Segment Anything extension:

1. Use GroundingDINO to draw boxes on items described by a text input prompt.
2. Use these boxes and BiRefNet to remove the background from these bounding boxes only, as described in this notebook created by the BiRefNet original author.

However, many changes would need to be made to this extension.
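
To make the two steps above more concrete, here is a simplified sketch of the second one: combining per-box segmentation results into one full-size mask. It is not the code from the notebook or from the Segment Anything extension. Boxes are assumed to be already available as pixel coordinates, masks are plain nested lists rather than tensors, and `segment_crop` is a hypothetical callback standing in for "crop the image to this box and run BiRefNet on it":

```python
from typing import Callable, Sequence

Mask = list[list[float]]         # rows of values in [0, 1]
Box = tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def masks_from_boxes(
    width: int,
    height: int,
    boxes: Sequence[Box],
    segment_crop: Callable[[Box], Mask],
) -> Mask:
    """Build a full-size mask that is non-zero only inside the given boxes."""
    full = [[0.0] * width for _ in range(height)]
    for box in boxes:
        # Clamp the box to the image bounds and skip boxes that end up empty.
        left, top = max(box[0], 0), max(box[1], 0)
        right, bottom = min(box[2], width), min(box[3], height)
        if right <= left or bottom <= top:
            continue
        # segment_crop must return a mask with the size of the clamped box.
        crop_mask = segment_crop((left, top, right, bottom))
        for y in range(top, bottom):
            for x in range(left, right):
                # Where boxes overlap, keep the strongest foreground value.
                value = crop_mask[y - top][x - left]
                full[y][x] = max(full[y][x], value)
    return full
```

Everything outside the boxes stays at 0, so only the prompted items would be kept as foreground when the mask is applied.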
