Memory-efficient Diffusion Transformers with Quanto and Diffusers #9011
sayakpaul started this conversation in Show and tell
As diffusion transformers keep growing in size, good quantization tools for them are becoming increasingly important.
We present our findings from a series of experiments on quantizing several diffusion pipelines built on diffusion transformers.
We demonstrate excellent memory savings at the cost of somewhat higher inference latency, which is expected to improve in the coming days.
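To give a rough sense of where the memory savings come from, here is a minimal, standard-library-only sketch of symmetric per-tensor int8 weight quantization. It is an illustration only and not Quanto's implementation: the function names, the toy weight tensor, and its size are all assumptions made for this example.

```python
import random
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> int8 values plus one scale."""
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = array(
        "b", (max(-127, min(127, round(w / scale))) for w in weights)
    )
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in quantized]


# Hypothetical toy "layer": 4096 float32 weights drawn from a narrow Gaussian.
random.seed(0)
weights = array("f", (random.gauss(0.0, 0.02) for _ in range(4096)))

q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

fp32_bytes = len(weights) * weights.itemsize  # 4 bytes per weight
int8_bytes = len(q) * q.itemsize              # 1 byte per weight
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"float32 storage: {fp32_bytes} bytes")
print(f"int8 storage:    {int8_bytes} bytes")
print(f"max abs rounding error: {max_error:.6f} (scale = {scale:.6f})")
```

The stored weights shrink by the ratio of the element sizes, while each weight picks up a rounding error of at most half the scale. The extra dequantization work at inference time is one plausible source of the added latency mentioned above.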
Diffusers 🤝 Quanto ❤️
Check out the blog post to learn more:
https://huggingface.co/blog/quanto-diffusers