
Guidance Needed on Two-Stage Fine-Tuning with LoRA (SFT and DPO) for Model Adaptation #2264

Open
none0663 opened this issue Dec 6, 2024 · 3 comments

none0663 commented Dec 6, 2024

I am planning to perform a two-stage fine-tuning process and need some guidance on how to proceed.

First Stage

  1. Load Base Model: I start by loading the base model, qwen1.5 32B.
  2. Apply LoRA Fine-Tuning: I then apply LoRA fine-tuning to this base model and obtain a new model state.
  3. Save Adapter Model: This fine-tuned model state is saved as adapter_model.safetensors, named qwen1.5_lora_sft (see the sketch after this list).
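
Roughly, stage one looks like this; a minimal sketch, where the model path, target modules, and LoRA hyperparameters are illustrative assumptions rather than the actual values used:

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# load the base model M (path is a placeholder)
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-32B")
# illustrative LoRA config; rank, alpha, and target modules are assumptions
lora_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(base_model, lora_config)
...  # run SFT training
model.save_pretrained("qwen1.5_lora_sft")  # writes adapter_model.safetensors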

Second Stage

  1. Load the Model from the First Stage: I load both qwen1.5 32B and qwen1.5_lora_sft. It's crucial that qwen1.5_lora_sft integrates correctly with the base model qwen1.5 32B.
  2. Continue Fine-Tuning: On this model, which already includes the LoRA adapter, I continue fine-tuning with LoRA and DPO.
  3. Save the New Adapter Model: After fine-tuning, I need to save the new adapter state, which includes adjustments from both the original LoRA and the new DPO.

My questions are:

  1. How do I load the model from the base model (qwen1.5 32B) together with the LoRA module qwen1.5_lora_sft?
  2. How do I continue fine-tuning from the first-stage model, and then save the LoRA model after DPO training so that I end up with the base model (qwen1.5 32B) and only one qwen1.5_lora_sft_dpo module (adapter_model_sft_dpo.safetensors)?

What I have now

  1. base model, qwen1.5 32B model path
  2. qwen1.5_lora_sft module path: adapter_model.safetensors

What I Need

  1. qwen1.5_lora_sft_dpo module: adapter_model_sft_dpo.safetensors

In other words:

train a base_model to get LoRA_weights_1
base_model_1 = merge(base_model and LoRA_weights_1)
train base_model_1 to get LoRA_weights_2
base_model_2 = merge(base_model_1 and LoRA_weights_2)

How can I split base_model_2 back into base_model and a combined LoRA_weights_1_2?

Thanks!

none0663 (Author) commented Dec 6, 2024

https://github.com/kohya-ss/sd-scripts/blob/main/networks/merge_lora.py
Maybe I can use this to merge the two LoRA modules from the two stages into one?

BenjaminBossan (Member) commented

The answer to your question depends a little bit on what you want to do afterwards. Let me sketch a response.

Let's call the base model M, the first LoRA adapter L1 and the second LoRA adapter L2.

After you've trained L1 and saved it, you have the adapter_model.safetensors file for L1, as you mentioned. From here, we have a few possibilities; which one is best for you depends on your use case.

1 Continue training on the same adapter

from peft import PeftModel

base_model = ...  # load the base model M as normal
model = PeftModel.from_pretrained(base_model, <path-to-L1>, is_trainable=True)  # <= important
...  # continue finetuning
model.save_pretrained(<path-to-L2>)

After doing this, you have a second LoRA adapter, L2, which is based on the finetuning knowledge from L1 and additionally has been trained for DPO.

For inference, you can load the 2nd adapter the same way as the 1st one shown above, but you don't need to set is_trainable=True, since this is for pure inference.

You can also load both adapters at the same time by loading the 2nd adapter with model.load_adapter. Switch between the two adapters by calling model.set_adapter(<adapter-name>).
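
As a minimal sketch of that inference setup (the adapter names here are arbitrary placeholders):

from peft import PeftModel

base_model = ...  # load the base model M as normal
# load L1 for pure inference, so no is_trainable=True needed
model = PeftModel.from_pretrained(base_model, <path-to-L1>, adapter_name="sft")
# additionally load L2 and switch between the two as needed
model.load_adapter(<path-to-L2>, adapter_name="sft_dpo")
model.set_adapter("sft_dpo")  # activate L2
model.set_adapter("sft")      # switch back to L1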

2 Merging the first adapter into the base model

The second approach would work like this:

from peft import PeftModel

# first merge L1 into M
base_model = ...  # load the base model M as normal
model = PeftModel.from_pretrained(base_model, <path-to-L1>)
merged = model.merge_and_unload()
merged.save_pretrained(<path-to-merged>)

The merged model combines the weights of M and L1. The checkpoint of merged will be quite big, the same size as the base model's checkpoint. Now for further training:

from peft import LoraConfig, get_peft_model

merged = ...  # load the merged model from <path-to-merged>
lora_config2 = LoraConfig(...)  # config for L2
model = get_peft_model(merged, lora_config2)
...  # continue finetuning
model.save_pretrained(<path-to-L2>)

Here the big difference to the 1st approach is that instead of using the 1st LoRA adapter for further finetuning, we merge it into the base model and create a 2nd LoRA adapter. This 2nd LoRA adapter requires you to load the merged model as the base model instead of loading the original base model as in the 1st approach.

One advantage of the 2nd approach is that you can define a completely new LoraConfig if your 2nd step requires different hyper-parameters. However, when you want to do inference, you need to use the merged model as the base model, which means you can't run inference with the original base model anymore (unless you load it as a completely separate copy). I don't know if you need this or not.
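
As a minimal sketch of inference with this approach (paths are placeholders, and loading the merged model with AutoModelForCausalLM is an assumption about your setup):

from transformers import AutoModelForCausalLM
from peft import PeftModel

# the merged checkpoint (M + L1) now plays the role of the base model
merged = AutoModelForCausalLM.from_pretrained(<path-to-merged>)
model = PeftModel.from_pretrained(merged, <path-to-L2>)  # attach L2 for inference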

Overall, what approach works better for you depends on your use case, hopefully this explanation makes the decision easier for you.

none0663 (Author) commented Dec 6, 2024

> 1 Continue training on the same adapter
>
> from peft import PeftModel
>
> base_model = ...  # load the base model M as normal
> model = PeftModel.from_pretrained(base_model, <path-to-L1>, is_trainable=True)  # <= important
> ...  # continue finetuning
> model.save_pretrained(<path-to-L2>)
>
> After doing this, you have a second LoRA adapter, L2, which is based on the finetuning knowledge from L1 and additionally has been trained for DPO.
>
> For inference, you can load the 2nd adapter the same way as the 1st one shown above, but you don't need to set is_trainable=True, since this is for pure inference.
>
> You can also load both adapters at the same time by loading the 2nd adapter with model.load_adapter. Switch between the two adapters by calling model.set_adapter(<adapter-name>).

Thank you very much for your reply; this part is exactly what I needed.
This means that the LoRA adapter L2 is fine-tuned from model = PeftModel.from_pretrained(base_model, <path-to-L1>), which has already loaded the L1 parameters. And for the LoRA adapter L2, the base model is still the original base model M, so L1 and L2 can share the same base model.

Thanks.
