Currently, the LTX Video VAE does not implement framewise encoding/decoding. Implementing it is an opportunity to reduce memory usage, which would benefit both inference and training.
LoRA finetuning LTX Video on 49x512x768 videos can be done in under 6 GB if prompts and latents are pre-computed, but the pre-computation itself requires about 12 GB because of the VAE encode/decode. Framewise encoding could cut this considerably and lower the bar to entry for video model finetuning. Our friends with potatoes need you!
Relevant code: `diffusers/src/diffusers/models/autoencoders/autoencoder_kl_ltx.py`, line 949 at commit `d413881`.
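A minimal sketch of the idea (not the eventual diffusers implementation): a hypothetical `encode_framewise` helper that encodes the video in temporal chunks, assuming a diffusers-style `vae.encode(...).latent_dist` interface. A real implementation would need to account for the temporal receptive field of the causal 3D convolutions (e.g., by caching conv state or overlapping chunks and discarding edge latents), which this sketch ignores, so it will not be numerically identical to a full-video encode.

```python
import torch

def encode_framewise(vae, video, chunk_size=8):
    """Encode a (B, C, F, H, W) video in temporal chunks to bound peak memory.

    Hypothetical helper, not part of diffusers. Assumes a
    diffusers-style `vae.encode(...).latent_dist` interface.
    """
    latents = []
    num_frames = video.shape[2]
    for start in range(0, num_frames, chunk_size):
        chunk = video[:, :, start : start + chunk_size]
        with torch.no_grad():
            # Encode only this temporal slice; peak memory now scales
            # with chunk_size instead of the full frame count.
            z = vae.encode(chunk).latent_dist.sample()
        latents.append(z)
    # Concatenate the chunk latents back along the frame dimension.
    return torch.cat(latents, dim=2)
```

Decoding could be chunked the same way. The main design question is how to handle the causal 3D conv context at chunk boundaries; the spatial `enable_tiling` support in other diffusers VAEs handles the analogous problem with overlapping tiles that are blended together.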
As always, contributions are welcome 🤗 Happy new year!