
Do not check trace for diffusers, saving memory and time for FLUX #1064

Open · wants to merge 3 commits into main
Conversation

@mvafin (Contributor) commented Dec 11, 2024

What does this PR do?

This is a further optimization of memory consumption during diffusers conversion, a continuation of PR #1033.
When check_trace=True, the TorchScript graph is generated a second time and compared against the graph generated the first time. This can occasionally catch an incorrectly traced graph, but in optimum we control which models are supported, so such issues shouldn't happen.
Currently this is introduced only for diffusers, but it could be done for all models.
The largest memory impact is demonstrated for FLUX; for other diffusers models it mainly reduces conversion time significantly.

| Model | Before (memory / time) | After (memory / time) |
| --- | --- | --- |
| FLUX | ~58 GB / ~240 s | ~33 GB / ~130 s |
| SD-3.5-medium | ~18 GB / ~250 s | ~18 GB / ~100 s |
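For context, the flag in question is torch.jit.trace's check_trace argument. A minimal toy sketch (not the PR's code) of what it controls:

```python
import torch

# Toy model standing in for a diffusers submodel.
model = torch.nn.Linear(4, 4).eval()
example = torch.randn(1, 4)

# Default behavior: the traced graph is run a second time and its
# outputs are compared against the original module's outputs.
traced_checked = torch.jit.trace(model, example, check_trace=True)

# Skipping the check avoids that second pass, which is what saves
# memory and time for large models such as FLUX.
traced_unchecked = torch.jit.trace(model, example, check_trace=False)
```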


Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@AlexKoff88 (Collaborator)

Awesome! Thanks @mvafin.

Please check and fix the code quality:

```
black --check .
ruff check .
```

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@mvafin (Contributor, Author) commented Dec 12, 2024

> ruff check .

Done

@AlexKoff88 (Collaborator)

  • @alexsu52, shall we have an option that disables this behavior during model export? Such an option could be enabled by default.

@IlyasMoutawwakil (Member) left a comment

Not a fan of patches; the long-term solution would be to decouple the export process by doing torch.jit.trace with the arguments we want, then ov.convert_model with the resulting module.
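A rough sketch of that decoupled flow, shown with a toy model rather than the actual diffusers export code, and assuming ov.convert_model accepts an already-traced module:

```python
import torch
import openvino as ov

# Toy model standing in for a diffusers submodel.
model = torch.nn.Linear(4, 4).eval()
example = torch.randn(1, 4)

# Trace explicitly, with full control over the tracing arguments...
traced = torch.jit.trace(model, example, check_trace=False)

# ...then hand the already-traced module to OpenVINO for conversion.
ov_model = ov.convert_model(traced, example_input=example)
```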

@AlexKoff88 (Collaborator)

> Not a fan of patches; the long-term solution would be to decouple the export process by doing torch.jit.trace with the arguments we want, then ov.convert_model with the resulting module.

I agree. I see torch.jit.trace called in one place in the Optimum-Intel code. @mvafin, why didn't you just pass the argument there?

@mvafin (Contributor, Author) commented Dec 13, 2024

> Not a fan of patches; the long-term solution would be to decouple the export process by doing torch.jit.trace with the arguments we want, then ov.convert_model with the resulting module.

> I agree. I see torch.jit.trace called in one place in the Optimum-Intel code. @mvafin, why didn't you just pass the argument there?

torch.jit.trace is called inside ov.convert_model, so there is no control over it from the outside. We could call torch.jit.trace ourselves before convert_model, but that would also require reproducing the preparations convert_model performs internally, which it skips for a TorchScript module. That approach is not good.
The ideal solution would be to let optimum-intel set the flag when calling convert_model, and that is what we want to do for 2025.0; for 2024.6, unfortunately, this patch is our only option.

@IlyasMoutawwakil (Member)

Okay, we can go with a patch, but let's at least make it a clean one. The naming in the PR is a bit confusing, and I don't think the patching should be in __main__.py but rather in convert, where the lower-level logic is defined. What about this (see the sketch below):

  • Add an argument check_trace=None to the export function(s) so that users can control this behavior.
  • If None, set this value to False in the case of diffusers.
  • Do the patching around the convert_model function with patch_torch_jit_trace(check_trace=check_trace).
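A hypothetical sketch of what such a patch_torch_jit_trace helper could look like; the name comes from this suggestion, and this is not code from the PR:

```python
import functools
from contextlib import contextmanager

import torch


@contextmanager
def patch_torch_jit_trace(check_trace: bool):
    """Temporarily force the check_trace flag on every torch.jit.trace call."""
    original_trace = torch.jit.trace

    @functools.wraps(original_trace)
    def trace_with_flag(*args, **kwargs):
        kwargs["check_trace"] = check_trace  # override the caller's keyword value
        return original_trace(*args, **kwargs)

    torch.jit.trace = trace_with_flag
    try:
        yield
    finally:
        torch.jit.trace = original_trace  # always restore the original
```

Usage would then be `with patch_torch_jit_trace(check_trace=check_trace): ov_model = ov.convert_model(...)`, keeping the monkeypatch scoped to the conversion call.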
