Cannot run llama3 8b instruct: AssertionError: Fail to convert pytorch model
#1522
Comments
Thanks for reporting it, we will check the issue.
@N3RDIUM Hi, according to the errors, it seems you didn't download the model successfully. Please download the model from HF to local disk and try again, setting the model_id to the local path.
Another issue: the variable model.device is never defined.
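A minimal sketch of the "download locally, then point model_id at the directory" advice above. The helper and the path are hypothetical (not part of ITREX or Neural Speed); it just sanity-checks that the local copy has the files a HF model directory normally contains before you try to convert it:

```python
import os

# Hypothetical helper: verify a locally downloaded HF model directory
# looks complete (a config plus at least one weights file) before
# pointing model_id at it.
def looks_like_hf_model(path):
    if not os.path.isdir(path):
        return False
    files = os.listdir(path)
    has_config = "config.json" in files
    has_weights = any(
        f.endswith((".safetensors", ".bin", ".pth")) for f in files
    )
    return has_config and has_weights

# Assumed download location; replace with wherever you saved the model.
local_path = "/models/Meta-Llama-3-8B-Instruct"
if looks_like_hf_model(local_path):
    model_id = local_path
else:
    print(f"{local_path} looks incomplete; re-download it from HF first")
```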
I tried downloading the model again and using the local path as the model ID, but it gives me this error now:
Does this lib support *.pth models? I could go for the original/ dir: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/tree/main/original
Hi,
The code you provided may be incompatible, which means your ITREX or Neural Speed version is a little old. https://github.com/intel/neural-speed/blob/main/neural_speed/convert/convert_llama.py I ran the code successfully last time I replied to you. Please try to reinstall the latest main branch of ITREX and Neural Speed from source.
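For reference, a sketch of what "reinstall both from source" could look like (illustrative commands, not an official install script; clone locations and extra build dependencies may differ on your system):

```shell
# Remove the packaged versions first.
pip uninstall -y intel-extension-for-transformers neural-speed

# Build and install the latest main branches from source.
git clone https://github.com/intel/intel-extension-for-transformers.git
pip install ./intel-extension-for-transformers

git clone https://github.com/intel/neural-speed.git
pip install ./neural-speed
```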
Okay, will try. Thanks for the quick reply!
It's running out of memory on
Whoops! Closed it by mistake. Anyway, is there any way to reduce memory usage when loading the model from HF? I tried without ITREX and it runs just fine :(
Great, now I get
Hi, @N3RDIUM
Everyone uses the same function to load the model from HF. The possible difference is here: https://github.com/intel/neural-speed/blob/main/neural_speed/convert/convert_llama.py#L1485 Please set low_cpu_mem_usage=False before conversion. According to my earlier tests, it can sometimes reduce virtual memory usage.
No worries. Just set up a new conda env and reinstall requirements.txt plus ITREX and Neural Speed from source; these issues should disappear, I think. I have checked the installation pipeline again using the latest ITREX and NS branches, and it works. Successful installation screenshots (check whether you installed successfully):
I have the same versions as you, yet it gives me the same error:
Oops, did it again, extremely sorry
I'm not using
Here is the error now:
Which versions of transformers and pytorch are you on?
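A quick stdlib way to answer that question, useful to paste into a bug report. Package names are the PyPI distribution names (assumed here); anything not installed prints a placeholder instead of raising:

```python
from importlib.metadata import version, PackageNotFoundError

def pkg_version(name):
    # Return the installed version string, or a placeholder if absent.
    try:
        return version(name)
    except PackageNotFoundError:
        return "not installed"

for pkg in ("transformers", "torch",
            "intel-extension-for-transformers", "neural-speed"):
    print(f"{pkg}: {pkg_version(pkg)}")
```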
Facing the same issue for the given Dockerfile.
Hey there! I'm trying to run llama3-8b-instruct with intel extension for transformers.
Here's my code:
Here's the error: