[BUG/Help] TypeError reported when model.chat() called #1498

Open · 1 task done
lixjohn opened this issue Nov 18, 2024 · 0 comments

lixjohn commented Nov 18, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

To run chatglm-6b-int4 on my local machine, I use the example code as follows:
```python
from transformers import AutoTokenizer, AutoModel
import torch

modelname = "D:\\models\\zhipu\\chatglm-6b-int4"

tokenizer = AutoTokenizer.from_pretrained(modelname, trust_remote_code=True)
model = AutoModel.from_pretrained(modelname, trust_remote_code=True).float()
model = model.quantize(bits=4, kernel_file="D:\\models\\zhipu\\chatglm-6b-int4\\quantization_kernels_parallel.so")

message = '你好'  # "Hello"
response, history = model.chat(tokenizer, message, history=[])
print(response)

# "What should I do if I can't sleep at night?"
response, history = model.chat(tokenizer, "晚上睡不着应该怎么办", history=history)
print(response)
```

However, when I run this code I get `TypeError: expected Tensor as element 0 in argument 0, but got tuple`; please see the full error message in the attached file Error Message.txt.

I traced the code and found that the error is raised at line 254 in modeling_chatglm.py:
```python
if layer_past is not None:
    past_key, past_value = layer_past[0], layer_past[1]
    if (type(layer_past) != str):
        key_layer = torch.cat((past_key, key_layer), dim=0)
        value_layer = torch.cat((past_value, value_layer), dim=0)
```

For some unknown reason, layer_past was set to the string 'past_key_values' (a debugger screenshot showing this was attached). So past_key = 'p' and past_value = 'a', which caused the argument-type mismatch error in torch.cat.
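
For reference, here is a minimal, self-contained sketch of this failure mode outside the model (the tensor shapes and the isinstance guard are my own illustration, not code from the repository):

```python
import torch

# Hypothetical stand-ins for the attention tensors in modeling_chatglm.py.
key_layer = torch.zeros(1, 1, 4)
value_layer = torch.zeros(1, 1, 4)

# What the model unexpectedly received instead of a (key, value) tensor pair.
layer_past = 'past_key_values'
past_key, past_value = layer_past[0], layer_past[1]  # 'p', 'a'

try:
    torch.cat((past_key, key_layer), dim=0)
except TypeError as exc:
    # e.g. "expected Tensor as element 0 in argument 0, but got str"
    print(exc)

# A guard like this avoids the crash (my own sketch, not the upstream fix):
# only unpack layer_past when it is not a stray string.
if layer_past is not None and not isinstance(layer_past, str):
    past_key, past_value = layer_past[0], layer_past[1]
    key_layer = torch.cat((past_key, key_layer), dim=0)
    value_layer = torch.cat((past_value, value_layer), dim=0)
```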

Additional information on file modifications:

  1. To make the code run on the CPU, I changed line 161 in quantization.py
     from `kernels = ctypes.cdll.LoadLibrary(kernel_file)`
     to `kernels = ctypes.CDLL(kernel_file, winmode=0)` (see the load-check sketch after this list).
  2. To avoid an `sp_tokenizer is not defined` error, I moved the line
     `self.sp_tokenizer = SPTokenizer(vocab_file, num_image_tokens=num_image_tokens)`
     from line 205 to line 182 in tokenization_chatglm.py.
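
For completeness, a small standalone sketch of the change in item 1, to check that the kernel library loads at all on Windows (the path is from my setup; this is my own way of verifying it, not code from quantization.py):

```python
import ctypes

# Standalone check (my own sketch) that the quantization kernel library
# can be loaded on Windows via ctypes.CDLL. winmode=0 selects the legacy
# DLL search behaviour (the winmode keyword exists on Python 3.8+).
kernel_file = "D:\\models\\zhipu\\chatglm-6b-int4\\quantization_kernels_parallel.so"

try:
    kernels = ctypes.CDLL(kernel_file, winmode=0)
    print("kernel library loaded:", kernels)
except OSError as exc:
    print("failed to load kernel library:", exc)
```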

Expected Behavior

No error when model.chat() is called.

Steps To Reproduce

Please see the Current Behavior section above.

Environment

- OS: Windows 11 Home (Chinese edition)
- Python: 3.12.7
- Transformers: 4.42.0
- PyTorch: 2.5.1
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): False

Anything else?

No response
