Apply applicable `quantization_config` to model components when loading a model #10327
With the new improvements to `quantization_config`, the memory requirements of models such as SD35 and FLUX.1 are much lower. However, the user must manually load each model component they want quantized and then assemble the pipeline.
For example:
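A minimal sketch of this manual workflow for FLUX.1, assuming `diffusers`, `transformers`, and `bitsandbytes` are installed, a CUDA GPU is available, and the `black-forest-labs/FLUX.1-dev` repository is accessible. The helper name `load_flux_quantized` and the NF4 settings are illustrative choices, not taken from the issue:

```python
def load_flux_quantized(model_id="black-forest-labs/FLUX.1-dev"):
    """Manually quantize two components, then assemble the pipeline around them."""
    # Imports are kept inside the function so the sketch can be defined
    # even on a machine without the heavy dependencies installed.
    import torch
    from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel
    from transformers import BitsAndBytesConfig as TransformersBitsAndBytesConfig
    from transformers import T5EncoderModel

    # Each library ships its own config class, so the settings are declared twice.
    nf4 = dict(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    # Component 1: the diffusion transformer (a diffusers model).
    transformer = FluxTransformer2DModel.from_pretrained(
        model_id,
        subfolder="transformer",
        quantization_config=BitsAndBytesConfig(**nf4),
        torch_dtype=torch.bfloat16,
    )

    # Component 2: the T5 text encoder (a transformers model).
    text_encoder_2 = T5EncoderModel.from_pretrained(
        model_id,
        subfolder="text_encoder_2",
        quantization_config=TransformersBitsAndBytesConfig(**nf4),
        torch_dtype=torch.bfloat16,
    )

    # Assemble the pipeline, overriding the two pre-quantized components.
    return FluxPipeline.from_pretrained(
        model_id,
        transformer=transformer,
        text_encoder_2=text_encoder_2,
        torch_dtype=torch.bfloat16,
    )
```

The user has to know that FLUX.1 keeps its heavy weights in `transformer` and `text_encoder_2`, and which library each one comes from. SD35 uses different classes and subfolder names, so the same code does not carry over.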
The ask is to allow the pipeline loader itself to process `quantization_config` and automatically apply it to applicable modules if it is present. That would allow much simpler use, without the user needing to know the exact internal components of each model.
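The requested behavior can be sketched with toy stand-ins. Everything below is hypothetical: `ToyComponent`, `load_toy_pipeline`, and the `supports_quantization` flag are illustrative names, not diffusers APIs. It only shows the dispatch idea: the caller passes one config to the loader, and the loader forwards it to the components that can use it.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ToyComponent:
    """Stand-in for a pipeline component; not a diffusers class."""
    name: str
    supports_quantization: bool
    quantization_config: Optional[Dict[str, Any]] = None


def load_toy_pipeline(
    component_specs: Dict[str, bool],
    quantization_config: Optional[Dict[str, Any]] = None,
) -> Dict[str, ToyComponent]:
    """Hypothetical loader: forward the config only to applicable components."""
    components = {}
    for name, supports in component_specs.items():
        applicable = supports and quantization_config is not None
        components[name] = ToyComponent(
            name=name,
            supports_quantization=supports,
            quantization_config=quantization_config if applicable else None,
        )
    return components


# Assumed layout: weight-carrying models accept a config; the scheduler and
# tokenizer carry no model weights, so the config is skipped for them.
specs = {"transformer": True, "text_encoder": True, "scheduler": False, "tokenizer": False}
pipe = load_toy_pipeline(specs, quantization_config={"load_in_4bit": True})
```

With this shape the caller never names individual components. How a real loader would decide applicability is left open here.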
This is a generic ask that should work for pretty much all models, although the primary use case is the most popular models such as SD35 and FLUX.1.
@yiyixuxu @sayakpaul @DN6 @asomoza