Memory-efficient Diffusion Transformers with Quanto and Diffusers #9011

sayakpaul started this conversation in Show and tell

sayakpaul
Jul 30, 2024
Maintainer

As diffusion transformers keep getting larger, good quantization tools for them are becoming increasingly important.

We present our findings from a series of experiments on quantizing different diffusion pipelines based on diffusion transformers.

We demonstrate excellent memory savings at the cost of somewhat higher inference latency, which is expected to improve in the coming days.
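
To give a rough sense of where the savings come from, here is a back-of-the-envelope sketch (not a measurement from our experiments). It only counts the bytes needed to store the weights at a given precision. The parameter count `NUM_PARAMS` and the helper `weight_memory_gib` are hypothetical, made up purely for illustration:

```python
def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Approximate memory needed to store the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


# Hypothetical parameter count, chosen only for illustration.
NUM_PARAMS = 8_000_000_000

# Compare the weight storage cost at a few common precisions.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>6}: {weight_memory_gib(NUM_PARAMS, bits):.2f} GiB")
```

This ignores activations, the text encoders, the VAE, and any per-tensor quantization metadata, so actual peak memory will differ from these numbers.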

Diffusers 🤝 Quanto ❤️

Check out the blog post to learn more:
https://huggingface.co/blog/quanto-diffusers
