
v0.7.0: Orthogonal Fine-Tuning, Megatron support, better initialization, safetensors, and more

@BenjaminBossan released this 06 Dec 16:13 · 2665f80

Highlights

  • Orthogonal Fine-Tuning (OFT): a new adapter that is similar to LoRA and shows a lot of promise for Stable Diffusion, especially with regard to controllability and compositionality. Give it a try! A minimal usage sketch follows this list. By @okotaku in #1160
  • Support for parallel linear LoRA layers using Megatron. This should lead to a speed-up when using LoRA with Megatron. By @zhangsheng377 in #1092
  • LoftQ provides a new method to initialize the LoRA layers of quantized models. The big advantage is that the LoRA layer weights are chosen so as to minimize the quantization error, as described in https://arxiv.org/abs/2310.08659. A second sketch after this list shows the basic usage. By @yxli2123 in #1150.
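
For a quick impression of OFT, here is a minimal sketch. The base checkpoint (facebook/opt-125m), the target module names, and the rank are illustrative assumptions, not recommendations:

```python
from transformers import AutoModelForCausalLM
from peft import OFTConfig, get_peft_model

# Any base model works; this small causal LM is just a placeholder.
base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# OFT is configured much like LoRA: choose the modules to adapt and a rank
# (for OFT, r controls the number of orthogonal blocks per adapted layer).
oft_config = OFTConfig(
    r=8,
    target_modules=["q_proj", "v_proj"],
    module_dropout=0.0,
)

peft_model = get_peft_model(base_model, oft_config)
peft_model.print_trainable_parameters()
```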
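
LoftQ initialization is requested through LoraConfig together with a LoftQConfig. A minimal sketch, again with placeholder model and hyperparameters; note that the base model is loaded unquantized so that LoftQ can compute the initialization from the original weights:

```python
from transformers import AutoModelForCausalLM
from peft import LoftQConfig, LoraConfig, get_peft_model

# Load the base model in full precision; LoftQ needs the original weights
# to pick LoRA weights that minimize the quantization error.
base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

loftq_config = LoftQConfig(loftq_bits=4)  # target 4-bit quantization

lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    init_lora_weights="loftq",
    loftq_config=loftq_config,
)

peft_model = get_peft_model(base_model, lora_config)
```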

Other notable additions

  • It is now possible to choose which adapters are merged when calling merge (#1132)
  • IA³ now supports adapter deletion, by @alexrs (#1153)
  • A new initialization method for LoRA, "gaussian", has been added (#1189); see the sketch after this list
  • When training PEFT models where new tokens have been added to the embedding layer, the embedding layer is now saved by default (#1147)
  • It is now possible to mix certain adapters like LoRA and LoKr in the same model, see the docs (#1163)
  • We started an initiative to improve the documentation, some of which should already be reflected in the current docs. Still, help from the community is always welcome. Check out this issue to get going.
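
The "gaussian" initialization mentioned above is selected via init_lora_weights on LoraConfig. A minimal sketch (target modules and hyperparameters are placeholders):

```python
from peft import LoraConfig

# The default initializes LoRA weight A with a Kaiming-uniform distribution and
# B with zeros; "gaussian" draws A from a normal distribution scaled by the rank
# instead (B still starts at zero, so training begins from the base model's behavior).
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    init_lora_weights="gaussian",
)
```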

Migration to v0.7.0

  • Safetensors are now the default format for PEFT adapters. In practice, users should not have to change anything in their code; PEFT takes care of everything -- just be aware that instead of creating a file adapter_model.bin, calling save_pretrained now creates adapter_model.safetensors. Safetensors have numerous advantages over pickle files (PyTorch's default serialization format) and are well supported on the Hugging Face Hub. A short sketch follows this list.
  • When merging multiple LoRA adapter weights together using add_weighted_adapter with the option combination_type="linear", the scaling of the adapter weights is now performed differently, leading to improved results.
  • There was a big refactor of the inner workings of some PEFT adapters. For the vast majority of users, this should not make any difference (except making some code run faster). However, if your code relies on PEFT internals, be aware that the inheritance structure of certain adapter layers has changed (e.g. peft.lora.Linear is no longer a subclass of nn.Linear, so isinstance checks may need updating). Also, to retrieve the original weight of an adapted layer, now use self.get_base_layer().weight, not self.weight (same for bias); a sketch of the new access pattern also follows this list.
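
Two small sketches for the migration notes above. First, saving: the call does not change, only the file written to disk does. The safe_serialization keyword shown for opting back into the pickle format is an assumption borrowed from the usual Hugging Face convention; check the save_pretrained signature of your installed version before relying on it.

```python
# Writes adapter_model.safetensors (previously adapter_model.bin).
peft_model.save_pretrained("my-adapter")

# If downstream tooling still expects the pickle-based file (assumed keyword):
peft_model.save_pretrained("my-adapter-bin", safe_serialization=False)
```

Second, code that previously read self.weight from an adapter layer should now go through get_base_layer(). A hedged sketch that walks the model and uses BaseTunerLayer to find adapted modules:

```python
from peft.tuners.tuners_utils import BaseTunerLayer

for name, module in peft_model.named_modules():
    if isinstance(module, BaseTunerLayer):
        # Before v0.7.0: module.weight; the base module is now wrapped.
        base_weight = module.get_base_layer().weight
```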

What's Changed

As always, this release also includes many small improvements, bug fixes, and documentation updates. We thank all the external contributors, both new and recurring. Below is the list of all changes since the last release.

Full Changelog: v0.6.2...v0.7.0

Significant community contributions

The following contributors have made significant changes to the library over the last release:

@alexrs

@callanwu

@elyxlz

  • Fix issue where you cannot call PeftModel.from_pretrained with a private adapter by @elyxlz in #1076

@lukaskuhn-lku

@okotaku

@yxli2123

  • LoftQ: Add LoftQ method integrated into LoRA. Add example code for LoftQ usage. by @yxli2123 in #1150

@zhangsheng377