[BUG/Help] TypeError reported when model.chat() called #1498

Open · 1 task done
lixjohn opened this issue Nov 18, 2024 · 0 comments

lixjohn commented Nov 18, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

To run chatglm-6b-int4 on my local machine, I use the example code as follows:
```python
from transformers import AutoTokenizer, AutoModel
import torch

modelname = "D:\\models\\zhipu\\chatglm-6b-int4"

tokenizer = AutoTokenizer.from_pretrained(modelname, trust_remote_code=True)
model = AutoModel.from_pretrained(modelname, trust_remote_code=True).float()
model = model.quantize(bits=4, kernel_file="D:\\models\\zhipu\\chatglm-6b-int4\\quantization_kernels_parallel.so")

message = '你好'  # "Hello"
response, history = model.chat(tokenizer, message, history=[])
print(response)

# "What should I do if I can't sleep at night?"
response, history = model.chat(tokenizer, "晚上睡不着应该怎么办", history=history)
print(response)
```

However, when I run this code I get `TypeError: expected Tensor as element 0 in argument 0, but got tuple`; please see the full error message in the attached file Error Message.txt.

I traced the code and found that the error is raised at line 254 in modeling_chatglm.py:
```python
if layer_past is not None:
    past_key, past_value = layer_past[0], layer_past[1]
    if (type(layer_past) != str):
        key_layer = torch.cat((past_key, key_layer), dim=0)
        value_layer = torch.cat((past_value, value_layer), dim=0)
```

For some unknown reason, layer_past was set to the string 'past_key_values' (a debugger screenshot showing this was attached). So past_key = 'p' and past_value = 'a', which caused the argument-type mismatch error in torch.cat.
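
For reference, here is a minimal, self-contained sketch of this failure mode outside the model (the tensor shapes and the isinstance guard are my own illustration, not code from the repository):

```python
import torch

# Hypothetical stand-ins for the attention tensors in modeling_chatglm.py.
key_layer = torch.zeros(1, 1, 4)
value_layer = torch.zeros(1, 1, 4)

# What the model unexpectedly received instead of a (key, value) tensor pair.
layer_past = 'past_key_values'
past_key, past_value = layer_past[0], layer_past[1]  # 'p', 'a'

try:
    torch.cat((past_key, key_layer), dim=0)
except TypeError as exc:
    # e.g. "expected Tensor as element 0 in argument 0, but got str"
    print(exc)

# A guard like this avoids the crash (my own sketch, not the upstream fix):
# only unpack layer_past when it is not a stray string.
if layer_past is not None and not isinstance(layer_past, str):
    past_key, past_value = layer_past[0], layer_past[1]
    key_layer = torch.cat((past_key, key_layer), dim=0)
    value_layer = torch.cat((past_value, value_layer), dim=0)
```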

Additional information on file modifications:

  1. To make the code run on the CPU, I changed line 161 in quantization.py
     from `kernels = ctypes.cdll.LoadLibrary(kernel_file)`
     to `kernels = ctypes.CDLL(kernel_file, winmode=0)` (see the load-check sketch after this list).
  2. To avoid an `sp_tokenizer is not defined` error, I moved the line
     `self.sp_tokenizer = SPTokenizer(vocab_file, num_image_tokens=num_image_tokens)`
     from line 205 to line 182 in tokenization_chatglm.py.
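
For completeness, a small standalone sketch of the change in item 1, to check that the kernel library loads at all on Windows (the path is from my setup; this is my own way of verifying it, not code from quantization.py):

```python
import ctypes

# Standalone check (my own sketch) that the quantization kernel library
# can be loaded on Windows via ctypes.CDLL. winmode=0 selects the legacy
# DLL search behaviour (the winmode keyword exists on Python 3.8+).
kernel_file = "D:\\models\\zhipu\\chatglm-6b-int4\\quantization_kernels_parallel.so"

try:
    kernels = ctypes.CDLL(kernel_file, winmode=0)
    print("kernel library loaded:", kernels)
except OSError as exc:
    print("failed to load kernel library:", exc)
```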

Expected Behavior

No error when model.chat() is called.

Steps To Reproduce

Please see the Current Behavior section above.

Environment

- OS: Windows 11 Home (Chinese edition)
- Python: 3.12.7
- Transformers: 4.42.0
- PyTorch: 2.5.1
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): False

Anything else?

No response
