GPTQModel v1.5.1

@Qubitium released this 01 Jan 08:39
4f18747

What's Changed

🎉 2025!

⚡ Added QuantizeConfig.device to explicitly define which device is used for quantization (default: auto). Non-quantized models are always loaded on CPU by default, and each layer is moved to QuantizeConfig.device during quantization to minimize VRAM usage.
💫 Improved QuantLinear selection from optimum.
🐛 Fixed attn_implementation_autoset compatibility with the latest transformers.
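
The layer-by-layer device movement described for QuantizeConfig.device can be illustrated with a small, dependency-free sketch. This is not GPTQModel's implementation: `Layer`, `Config`, `resolve_device`, and `quantize_layerwise` are hypothetical names, the policy for resolving `auto` is an assumption, and moving each layer back to CPU after quantization is also an assumption made to show why peak accelerator memory stays at roughly one layer.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """Hypothetical stand-in for one model layer; it only tracks where it lives."""
    name: str
    device: str = "cpu"      # non-quantized weights start on CPU
    quantized: bool = False

    def to(self, device: str) -> "Layer":
        self.device = device
        return self


@dataclass
class Config:
    """Hypothetical stand-in for a quantization config with a device field."""
    device: str = "auto"


def resolve_device(requested: str, accelerator_available: bool) -> str:
    """Resolve "auto" to a concrete device (assumed policy: prefer the accelerator)."""
    if requested != "auto":
        return requested
    return "cuda:0" if accelerator_available else "cpu"


def quantize_layerwise(layers, config, accelerator_available=False):
    """Move one layer at a time to the target device, quantize it, move it back.

    Returns the resolved device and the peak number of layers that were
    resident on that device at the same time.
    """
    target = resolve_device(config.device, accelerator_available)
    resident = 0
    peak = 0
    for layer in layers:
        layer.to(target)
        resident += 1
        peak = max(peak, resident)
        layer.quantized = True   # placeholder for the real quantization work
        layer.to("cpu")          # assumption: release accelerator memory afterwards
        resident -= 1
    return target, peak


if __name__ == "__main__":
    blocks = [Layer(f"block.{i}") for i in range(4)]
    device, peak = quantize_layerwise(blocks, Config(), accelerator_available=True)
    print(device, peak)
```

Because only one layer is resident on the target device at any time, peak usage in this sketch scales with the largest layer rather than with the whole model.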

Full Changelog: v1.5.0...v1.5.1