v0.7.0: Orthogonal Fine-Tuning, Megatron support, better initialization, safetensors, and more
Highlights
- Orthogonal Fine-Tuning (OFT): A new adapter that is similar to LoRA and shows a lot of promise for Stable Diffusion, especially with regard to controllability and compositionality. Give it a try! By @okotaku in #1160 (see the sketch after this list)
- Support for parallel linear LoRA layers using Megatron. This should lead to a speedup when using LoRA with Megatron. By @zhangsheng377 in #1092
- LoftQ provides a new method to initialize LoRA layers of quantized models. The big advantage is that the LoRA layer weights are chosen so as to minimize the quantization error, as described here: https://arxiv.org/abs/2310.08659. By @yxli2123 in #1150 (see the sketch after this list)
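Below is a minimal, hedged sketch of the new OFT adapter. The base model, target module names, and hyperparameters are placeholders chosen for illustration, not recommendations, and the snippet targets a small causal LM rather than Stable Diffusion to keep it self-contained.

```python
# Illustrative OFT setup; model, target modules, and hyperparameters are placeholders.
from transformers import AutoModelForCausalLM
from peft import OFTConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
oft_config = OFTConfig(
    r=8,                                  # OFT rank, i.e. the number of orthogonal blocks per layer
    target_modules=["q_proj", "v_proj"],  # attention projections of OPT; adjust for your model
    module_dropout=0.0,
)
model = get_peft_model(base_model, oft_config)
model.print_trainable_parameters()
```

LoftQ plugs into LoRA through the initialization options. The sketch below shows the intended flow, assuming `LoftQConfig` combined with `init_lora_weights="loftq"`; the model name and hyperparameters are again placeholders. Note that the base model is loaded without quantization here, since LoftQ computes the quantization-aware initialization itself.

```python
# Illustrative LoftQ-initialized LoRA setup; names and hyperparameters are placeholders.
from transformers import AutoModelForCausalLM
from peft import LoftQConfig, LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # not pre-quantized
loftq_config = LoftQConfig(loftq_bits=4)  # target 4-bit quantization of the base weights
lora_config = LoraConfig(
    init_lora_weights="loftq",
    loftq_config=loftq_config,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(base_model, lora_config)
```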
Other notable additions
- It is now possible to choose which adapters are merged when calling `merge` (#1132); a short sketch after this list demonstrates this together with the new `"gaussian"` initialization
- IA³ now supports adapter deletion, by @alexrs (#1153)
- A new initialization method for LoRA has been added, `"gaussian"` (#1189)
- When training PEFT models with new tokens being added to the embedding layers, the embedding layer is now saved by default (#1147)
- It is now possible to mix certain adapter types, such as LoRA and LoKr, in the same model; see the docs (#1163) and the sketch after this list
- We started an initiative to improve the documentation, some of which should already be reflected in the current docs. Still, help from the community is always welcome. Check out this issue to get going.
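The following sketch combines two of the additions above: the `"gaussian"` LoRA initialization and merging only selected adapters. Adapter names, the base model, and all hyperparameters are illustrative, and it assumes the `adapter_names` argument is exposed on `merge_and_unload` as well as the lower-level `merge`.

```python
# Illustrative sketch: "gaussian" LoRA init (#1189) and selective merging (#1132).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
config_a = LoraConfig(
    r=8,
    target_modules=["q_proj", "v_proj"],
    init_lora_weights="gaussian",  # new initialization option; the default init is unchanged
)
config_b = LoraConfig(r=8, target_modules=["q_proj", "v_proj"])

model = get_peft_model(base_model, config_a, adapter_name="adapter_a")
model.add_adapter("adapter_b", config_b)

# Merge only "adapter_a" into the base weights; "adapter_b" is not merged.
merged = model.merge_and_unload(adapter_names=["adapter_a"])
```

Mixing adapter types goes through a dedicated mixed-model path. The sketch below assumes the `mixed=True` flag to `get_peft_model` described in the docs for this feature; treat it as an outline and consult the documentation for the exact API.

```python
# Illustrative sketch of mixing LoRA and LoKr adapters in one model (#1163).
from transformers import AutoModelForCausalLM
from peft import LoKrConfig, LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
lora_config = LoraConfig(r=8, target_modules=["q_proj", "v_proj"])
lokr_config = LoKrConfig(r=8, target_modules=["q_proj", "v_proj"])

# mixed=True returns a PeftMixedModel, which can hold different adapter types at once
model = get_peft_model(base_model, lora_config, adapter_name="lora", mixed=True)
model.add_adapter("lokr", lokr_config)
model.set_adapter(["lora", "lokr"])  # activate both adapters together
```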
Migration to v0.7.0
- Safetensors are now the default format for PEFT adapters. In practice, users should not have to change anything in their code; PEFT takes care of everything. Just be aware that instead of creating a file `adapter_model.bin`, calling `save_pretrained` now creates `adapter_model.safetensors`. Safetensors have numerous advantages over pickle files (the PyTorch default format) and are well supported on the Hugging Face Hub.
- When merging multiple LoRA adapter weights together using `add_weighted_adapter` with the option `combination_type="linear"`, the scaling of the adapter weights is now performed differently, leading to improved results (see the sketch after this list).
- There was a big refactor of the inner workings of some PEFT adapters. For the vast majority of users, this should not make any difference (except making some code run faster). However, if your code relies on PEFT internals, be aware that the inheritance structure of certain adapter layers has changed (e.g. `peft.lora.Linear` is no longer a subclass of `nn.Linear`, so `isinstance` checks may need updating). Also, to retrieve the original weight of an adapted layer, now use `self.get_base_layer().weight`, not `self.weight` (same for `bias`). The sketch after this list illustrates both points.
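The migration notes about `add_weighted_adapter` and the base-layer refactor are illustrated below. This is a hedged sketch: the model, adapter names, and weights are placeholders, and the internals-related part only matters if your code inspects PEFT layers directly.

```python
# Illustrative migration sketch; model, adapter names, and weights are placeholders.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
from peft.tuners.lora import LoraLayer

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
model = get_peft_model(
    base_model,
    LoraConfig(r=8, target_modules=["q_proj", "v_proj"]),
    adapter_name="adapter_a",
)
model.add_adapter("adapter_b", LoraConfig(r=8, target_modules=["q_proj", "v_proj"]))

# Weighted combination of two adapters; with combination_type="linear",
# v0.7.0 scales the combined weights differently than earlier releases.
model.add_weighted_adapter(
    adapters=["adapter_a", "adapter_b"],
    weights=[0.7, 0.3],
    adapter_name="combined",
    combination_type="linear",
)

# After the base-layer refactor, adapter layers wrap the original module instead of
# subclassing it, so code that relied on PEFT internals needs small updates:
for module in model.modules():
    if isinstance(module, LoraLayer):                  # rather than isinstance(module, nn.Linear)
        base_weight = module.get_base_layer().weight   # rather than module.weight
```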
What's Changed
As always, this release includes a number of small improvements, bug fixes, and documentation updates. We thank all the external contributors, both new and recurring. Below is the list of all changes since the last release.
- After release: Bump version to 0.7.0.dev0 by @BenjaminBossan in #1074
- FIX: Skip adaption prompt tests with new transformers versions by @BenjaminBossan in #1077
- FIX: fix adaptation prompt CI and compatibility with latest transformers (4.35.0) by @younesbelkada in #1084
- Improve documentation for IA³ by @SumanthRH in #984
- [`Docker`] Update Dockerfile to force-use transformers main by @younesbelkada in #1085
- Update the release checklist by @BenjaminBossan in #1075
- fix-gptq-training by @SunMarc in #1086
- fix the failing CI tests by @pacman100 in #1094
- Fix f-string in import_utils by @KCFindstr in #1091
- Fix IA3 config for Falcon models by @SumanthRH in #1007
- FIX: Failing nightly CI tests due to IA3 config by @BenjaminBossan in #1100
- [`core`] Fix safetensors serialization for shared tensors by @younesbelkada in #1101
- Change to 0.6.1.dev0 by @younesbelkada in #1102
- Release: 0.6.1 by @younesbelkada in #1103
- set dev version by @younesbelkada in #1104
- avoid unnecessary import by @winglian in #1109
- Refactor adapter deletion by @BenjaminBossan in #1105
- Added num_dataloader_workers arg to fix Windows issue by @lukaskuhn-lku in #1107
- Fix import issue transformers with `id_tensor_storage` by @younesbelkada in #1116
- Correctly deal with `ModulesToSaveWrapper` when using Low-level API by @younesbelkada in #1112
- fix doc typo by @coding-famer in #1121
- Release: v0.6.2 by @pacman100 in #1125
- Release: v0.6.3.dev0 by @pacman100 in #1128
- FIX: Adding 2 adapters when target_modules is a str fails by @BenjaminBossan in #1111
- Prompt tuning: Allow to pass additional args to AutoTokenizer.from_pretrained by @BenjaminBossan in #1053
- Fix: TorchTracemalloc ruins Windows performance by @lukaskuhn-lku in #1126
- TST: Improve requires grad testing: by @BenjaminBossan in #1131
- FEAT: Make safe serialization the default one by @younesbelkada in #1088
- FEAT: Merging only specified `adapter_names` when calling `merge` by @younesbelkada in #1132
- Refactor base layer pattern by @BenjaminBossan in #1106
- [`Tests`] Fix daily CI by @younesbelkada in #1136
- [`core` / `LoRA`] Add `adapter_names` in bnb layers by @younesbelkada in #1139
- [`Tests`] Do not stop tests if a job failed by @younesbelkada in #1141
- CI Add Python 3.11 to test matrix by @BenjaminBossan in #1143
- FIX: A few issues with AdaLora, extending GPU tests by @BenjaminBossan in #1146
- Use `huggingface_hub.file_exists` instead of custom helper by @Wauplin in #1145
- Delete IA3 adapter by @alexrs in #1153
- [Docs fix] Relative path issue by @mishig25 in #1157
- Dataset was loaded twice in 4-bit finetuning script by @lukaskuhn-lku in #1164
- fix `add_weighted_adapter` method by @pacman100 in #1169
- (minor) correct type annotation by @vwxyzjn in #1166
- Update release checklist about release notes by @BenjaminBossan in #1170
- [docs] Migrate doc files to Markdown by @stevhliu in #1171
- Fix dockerfile build by @younesbelkada in #1177
- FIX: Wrong use of base layer by @BenjaminBossan in #1183
- [`Tests`] Migrate to AWS runners by @younesbelkada in #1185
- Fix code example in quicktour.md by @merveenoyan in #1181
- DOC Update a few places in the README by @BenjaminBossan in #1152
- Fix issue where you cannot call PeftModel.from_pretrained with a private adapter by @elyxlz in #1076
- Added lora support for phi by @umarbutler in #1186
- add options to save or push model by @callanwu in #1159
- ENH: Different initialization methods for LoRA by @BenjaminBossan in #1189
- Training PEFT models with new tokens being added to the embedding layers and tokenizer by @pacman100 in #1147
- LoftQ: Add LoftQ method integrated into LoRA. Add example code for LoftQ usage. by @yxli2123 in #1150
- Parallel linear Lora by @zhangsheng377 in #1092
- [Feature] Support OFT by @okotaku in #1160
- Mixed adapter models by @BenjaminBossan in #1163
- [DOCS] README.md by @Akash190104 in #1054
- Fix parallel linear lora by @zhangsheng377 in #1202
- ENH: Enable OFT adapter for mixed adapter models by @BenjaminBossan in #1204
- DOC: Update & improve docstrings and type annotations for common methods and classes by @BenjaminBossan in #1201
- remove HF tokens by @yxli2123 in #1207
- [docs] Update index and quicktour by @stevhliu in #1191
- [docs] API docs by @stevhliu in #1196
- MNT: Delete the delete doc workflows by @BenjaminBossan in #1213
- DOC: Initialization options for LoRA by @BenjaminBossan in #1218
- Fix an issue with layer merging for LoHa and OFT by @lukaskuhn-lku in #1210
- DOC: How to configure new transformers models by @BenjaminBossan in #1195
- Raise error when `modules_to_save` is specified and multiple adapters are being unloaded by @pacman100 in #1137
- TST: Add regression tests 2 by @BenjaminBossan in #1115
- Release: 0.7.0 by @BenjaminBossan in #1214
New Contributors
- @KCFindstr made their first contribution in #1091
- @winglian made their first contribution in #1109
- @lukaskuhn-lku made their first contribution in #1107
- @coding-famer made their first contribution in #1121
- @Wauplin made their first contribution in #1145
- @alexrs made their first contribution in #1153
- @merveenoyan made their first contribution in #1181
- @elyxlz made their first contribution in #1076
- @umarbutler made their first contribution in #1186
- @callanwu made their first contribution in #1159
- @yxli2123 made their first contribution in #1150
- @zhangsheng377 made their first contribution in #1092
- @okotaku made their first contribution in #1160
- @Akash190104 made their first contribution in #1054
Full Changelog: v0.6.2...v0.7.0
Significant community contributions
The following contributors have made significant changes to the library over the last release:
- Fix issue where you cannot call PeftModel.from_pretrained with a private adapter by @elyxlz in #1076
- Fix: TorchTracemalloc ruins Windows performance by @lukaskuhn-lku in #1126
- Dataset was loaded twice in 4-bit finetuning script by @lukaskuhn-lku in #1164
- LoftQ: Add LoftQ method integrated into LoRA. Add example code for LoftQ usage. by @yxli2123 in #1150
- Parallel linear Lora by @zhangsheng377 in #1092
- Fix parallel linear lora by @zhangsheng377 in #1202