
TypeError: LoraConfig.__init__() got an unexpected keyword argument 'eva_config' #2275

Open

Mohankrish08 opened this issue Dec 11, 2024 · 7 comments

@Mohankrish08
Mohankrish08 commented Dec 11, 2024

System Info

Name: Peft

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

model_name = "Mohan-08/math-dataset-deepmind-FT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", trust_remote_code=True)

Expected behavior

Facing this error:

TypeError: LoraConfig.__init__() got an unexpected keyword argument 'eva_config'


@BenjaminBossan
Member

This means that the model was trained with a more recent PEFT version than what you're using. Either upgrade to the latest PEFT version or manually edit the adapter_config.json and remove the entry:

https://huggingface.co/Mohan-08/math-dataset-deepmind-FT/blob/main/adapter_config.json#L6
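The manual-edit route can be scripted. A minimal sketch (the list of keys and the demo file below are assumptions based on this thread, not an official PEFT utility):

```python
import json
import os
import tempfile

# Keys added by newer PEFT releases that an older LoraConfig rejects
# (names taken from this thread; extend the list if other TypeErrors appear)
UNKNOWN_KEYS = ["eva_config", "exclude_modules", "lora_bias"]

def strip_unknown_keys(adapter_config_path):
    """Drop keys an older PEFT version doesn't recognise, in place."""
    with open(adapter_config_path) as f:
        config = json.load(f)
    for key in UNKNOWN_KEYS:
        config.pop(key, None)
    with open(adapter_config_path, "w") as f:
        json.dump(config, f, indent=2)
    return config

# Demonstration on a throwaway file standing in for a real adapter_config.json:
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "adapter_config.json")
with open(path, "w") as f:
    json.dump({"peft_type": "LORA", "r": 8, "eva_config": None}, f)

cleaned = strip_unknown_keys(path)
assert "eva_config" not in cleaned
```

Point it at the `adapter_config.json` inside your downloaded adapter directory before loading.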

@Mohankrish08
Author

This is my fine-tuned model:

model_name = "Mohan-08/math-dataset-deepmind-FT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

but this model is behaving like the base model (Gemma). What is the problem with this?

I have these files in the repo (screenshot of the file listing):

When I load my model, it is also loading the Gemma weights!

@BenjaminBossan
Member

Could you show me the code that produces the same results as the base model? When I tested it, I got different logits from the base model vs. the fine-tuned model.

@Mohankrish08
Author

Mohankrish08 commented Dec 12, 2024

model_name = "Mohan-08/math-dataset-deepmind-FT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", trust_remote_code=True)

input_text = "Solve 273*o + 19 = 272*o - 2*t, -2*o + 5*t + 34 = 0 for o."
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(inputs.input_ids, max_length=1000)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

I just changed the model name; everything else is the same.

Fine-tuned output:

Solve 273*o + 19 = 272*o - 2*t, -2*o + 5*t + 34 = 0 for o.

Here's how to solve the system of equations:

Step 1: Isolate 'o' in one of the equations.

Let's isolate 'o' in the second equation:

-2*o + 5*t + 34 = 0
-2*o = -5*t - 34
o = (5*t + 34) / 2

Step 2: Substitute the value of 'o' into the other equation.

Substitute the expression for 'o' that we just found into the first equation:

273 * ((5*t + 34) / 2) + 19 = 272 * (5*t + 34) / 2 - 2*t

Step 3: Simplify and solve for 't'.

  • Multiply both sides of the equation by 2 to get rid of the fractions.
  • Expand the terms.
  • Combine like terms.
  • Isolate 't'.

This will give you a value for 't'.

Step 4: Substitute the value of 't' back into the equation for 'o'.

Once you have the value of 't', substitute it back into the equation you isolated 'o' in. This will give you the value of 'o'.

Let me know if you'd like me to walk through the full simplification and solving process!

Base model:

Solve 273*o + 19 = 272*o - 2*t, -2*o + 5*t + 34 = 0 for o.

Here's how to solve the system of equations:

Step 1: Isolate 'o' in one of the equations.

Let's isolate 'o' in the first equation:

273*o + 19 = 272*o - 2*t
273*o - 272*o = -2*t - 19
o = -2*t - 19 / 273

Step 2: Substitute the value of 'o' into the second equation.

Substitute the expression for 'o' that we just found into the second equation:

-2*(-2*t - 19 / 273) + 5*t + 34 = 0

Step 3: Simplify and solve for 't'.

  • 4*t + 38 / 273 + 5*t + 34 = 0
  • 9*t + 38 / 273 = -34
  • 9*t = -34 * 273 - 38
  • 9*t = -9282 - 38
  • 9*t = -9320
  • t = -9320 / 9
  • t = -1036.67 (approximately)

Step 4: Substitute the value of 't' back into the equation for 'o'.

Now that you know the value of 't', substitute it back into the equation we isolated 'o' in:

o = -2*t - 19 / 273
o = -2 * (-1036.67) - 19 / 273
o = 2073.34 - 19 / 273
o = 2071.45 (approximately)

Therefore, the solution for 'o' is approximately 2071.45.

My doubt was: does the fine-tuned model also load the same weights? But the output seems different.

@BenjaminBossan
Member

My doubt was: does the fine-tuned model also load the same weights? But the output seems different.

The fine-tuned model will load the base weights and then add the LoRA weights on top, so this is expected. As you mentioned, the outputs are different, so I think everything works as expected.
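"Base weights plus LoRA on top" can be illustrated with plain matrix arithmetic. A hypothetical sketch of the LoRA forward pass, y = W0·x + (alpha/r)·B·A·x, with made-up dimensions and values (not taken from this model):

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, r = 8, 8, 2      # layer dims and LoRA rank (illustrative)
alpha = 16             # lora_alpha; scaling = alpha / r
scaling = alpha / r

W0 = rng.normal(size=(d, k))   # frozen base weight
A = rng.normal(size=(r, k))    # LoRA A matrix
B = np.zeros((d, r))           # LoRA B starts at zero in PEFT

x = rng.normal(size=k)

base_out = W0 @ x
lora_out = W0 @ x + scaling * (B @ (A @ x))

# Before training (B == 0) the adapter is a no-op, so outputs match the base:
assert np.allclose(base_out, lora_out)

# After training B is non-zero and outputs diverge from the base model:
B_trained = rng.normal(size=(d, r))
tuned_out = W0 @ x + scaling * (B_trained @ (A @ x))
assert not np.allclose(base_out, tuned_out)
```

This is why loading the adapter also pulls in the Gemma weights: W0 is always needed, and the small A/B matrices only perturb its outputs.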

@goin2crazy

This may be because you are trying to use an old version of the peft library with a checkpoint from a newer version. I solved it by just removing the eva_config entry.

Here is how you can do that:

import shutil
import os
import json
from peft import LoraConfig, PeftModel

# Define the path to the adapter_config.json file
# (cfg.lora_dir is the directory holding the original adapter checkpoint)
adapter_config_path = f"{cfg.lora_dir}/adapter_config.json"

# Step 1: Read the adapter_config.json file
with open(adapter_config_path, 'r') as file:
    adapter_config = json.load(file)

# Step 2: Remove the eva_config key if it exists
adapter_config.pop('eva_config', None)
adapter_config.pop('exclude_modules', None)
adapter_config.pop('lora_bias', None)


# Step 3: Create the checkpoint_save_dir folder if it doesn't exist
checkpoint_save_dir = cfg.lora_dir.split("/")[-1]
if not os.path.exists(checkpoint_save_dir):
    os.makedirs(checkpoint_save_dir)

# Step 4: Define the path to save the modified adapter_config.json
checkpoint_config_save_path = f'{checkpoint_save_dir}/adapter_config.json'

# Step 5: Save the changes back to the adapter_config.json file
with open(checkpoint_config_save_path, 'w') as file:
    json.dump(adapter_config, file, indent=4)


for file_name in os.listdir(cfg.lora_dir):
    source_file_path = os.path.join(cfg.lora_dir, file_name)
    destination_file_path = os.path.join(checkpoint_save_dir, file_name)
    
    # Skip copying adapter_config.json since we already modified and saved it
    if file_name != "adapter_config.json":
        if os.path.isfile(source_file_path):
            shutil.copy2(source_file_path, destination_file_path)
        elif os.path.isdir(source_file_path):
            shutil.copytree(source_file_path, destination_file_path)

# Step 6: Load the LoraConfig using the modified configuration
lora_config = LoraConfig.from_pretrained(checkpoint_save_dir)
print(lora_config)

# model_0 / model_1 are the base models the adapter should attach to
model_0 = PeftModel.from_pretrained(model_0, checkpoint_save_dir)
model_1 = PeftModel.from_pretrained(model_1, checkpoint_save_dir)

@hunter2009pf

This means that the model was trained with a more recent PEFT version than what you're using. Either upgrade to the latest PEFT version or manually edit the adapter_config.json and remove the entry:

https://huggingface.co/Mohan-08/math-dataset-deepmind-FT/blob/main/adapter_config.json#L6

Awesome, man! I merged the original model weights with the adapter successfully.
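Merging folds the adapter into the base weight, W_merged = W0 + (alpha/r)·B·A, which is what PEFT's merge step does layer by layer. A small numeric sketch (dimensions and values are made up) showing the merged weight reproduces the adapter's forward pass exactly:

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, r = 6, 6, 2
scaling = 16 / r                       # lora_alpha / r

W0 = rng.normal(size=(d, k))           # base weight
A = rng.normal(size=(r, k))            # trained LoRA A
B = rng.normal(size=(d, r))            # trained LoRA B

# Merging folds the low-rank update into the dense weight:
W_merged = W0 + scaling * (B @ A)

x = rng.normal(size=k)
adapter_out = W0 @ x + scaling * (B @ (A @ x))
merged_out = W_merged @ x

# The merged model needs no adapter machinery but behaves identically:
assert np.allclose(adapter_out, merged_out)
```

After merging, the result is a plain checkpoint with no adapter_config.json, so version-mismatch errors like the one above can no longer occur at load time.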
