
Verify chatglm3 6b #1119

Open · wants to merge 60 commits into master

Conversation

Aniruddha521 (Author)

I proceeded as described in task #259, making the following changes.
1) Extended the nightly_models list in openvino.genai/tests/python_tests/ov_genai_test_utils.py:

nightly_models = [
        "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
        "facebook/opt-125m",
        "microsoft/phi-1_5",
        "microsoft/phi-2",
        "THUDM/chatglm2-6b",
        "THUDM/chatglm3-6b",  # no beam_search
        "Qwen/Qwen2-0.5B-Instruct",
        "Qwen/Qwen-7B-Chat",
        "Qwen/Qwen1.5-7B-Chat",
        "argilla/notus-7b-v1",
        "HuggingFaceH4/zephyr-7b-beta",
        "ikala/redpajama-3b-chat",
        "mistralai/Mistral-7B-v0.1",
    ]

2) Added the model to openvino.genai/.github/workflows/causal_lm_cpp.yml (in the run: | step at line 62 of commit 8470250), adding the jobs cpp-greedy_causal_lm-Chatglm3-6b and cpp-prompt_lookup_decoding_lm-ubuntu-Chatglm3-6b.

3) Extended the supported model list and added a note:

Note

beam_search_causal_lm is not supported for the ChatGLM3-6B model.
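For reference, a minimal sketch of the beam search invocation that this note rules out, modeled on the beam_search_causal_lm sample (the local model path and the exact parameter values here are illustrative assumptions, not code from this PR):

import openvino_genai

# Assumed local path to the converted model; adjust to your export location.
pipe = openvino_genai.LLMPipeline("./chatglm3-6b/", "CPU")

config = openvino_genai.GenerationConfig()
config.max_new_tokens = 20
config.num_beam_groups = 3    # grouped beam search, as in the sample
config.num_beams = 15
config.diversity_penalty = 1.0
config.num_return_sequences = config.num_beams

# This is the decoding mode the note above marks as unsupported for ChatGLM3-6B.
print(pipe.generate("Why is the sun yellow?", config))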

@github-actions github-actions bot added category: sampling Sampling / Decoding algorithms category: GHA CI based on Github actions labels Oct 31, 2024
@Aniruddha521 (Author)

@Wovchena, can you please help me figure out what might have gone wrong with cpp-greedy_causal_lm-Chatglm3-6b and cpp-prompt_lookup_decoding_lm-ubuntu? Both greedy_causal_lm and prompt_lookup_decoding_lm work for chatglm3-6b on my local machine.

@mlukasze mlukasze linked an issue Nov 4, 2024 that may be closed by this pull request
@ilya-lavrenov ilya-lavrenov removed the category: sampling Sampling / Decoding algorithms label Nov 5, 2024
@github-actions github-actions bot added the category: sampling Sampling / Decoding algorithms label Nov 5, 2024
@github-actions github-actions bot added the category: tokenizers Tokenizer class or submodule update label Nov 5, 2024
@Wovchena (Collaborator)

You need to find out why C++ and Python greedy decoding produce different outputs and fix it. Other model runs are aligned, which means the problem is not in the samples themselves.

This PR's diff shows that you've modified thirdparty/openvino_tokenizers. Eliminate that diff; master's version of openvino_tokenizers should be kept. Maybe this is the reason for the Linux (Ubuntu 20.04, Python 3.9) / OpenVINO genai extension (cmake + wheel) (pull_request) failure.

@github-actions github-actions bot added no-match-files and removed category: sampling Sampling / Decoding algorithms labels Nov 22, 2024
@Aniruddha521 (Author)

@Wovchena, why am I getting a build error and a whisper_speech_recognition error on Windows (VS 2019, Python 3.11)? Did I miss anything?

@Wovchena (Collaborator)

Something got broken in one of the dependencies. All new PRs fail the same way. This is not your fault.

@Aniruddha521 (Author) commented Nov 30, 2024

@Wovchena, I think there is no need for any further edits; the latest commit has resolved all the required checks. Please correct me if I am wrong.

samples/python/greedy_causal_lm/greedy_causal_lm.py ./chatglm3-6b/ "$(<prompt.txt)" | tee py.txt
echo '-----------------------------------------------------------------------------------------------'
diff cpp.txt py.txt
echo "Why sun is yellow?" passed
@Wovchena (Collaborator)

Suggested change
echo "Why sun is yellow?" passed

It's the last command in this step, and there are no other tests in this step anyway, so completing it already signals that the test passed.

@Aniruddha521 (Author)

Done

Comment on lines 598 to 605
with open('predictions_greedy.txt', 'r') as f:
predicted_greedy = f.readline()
with open('predictions_prompt_lookup.txt', 'r') as f:
predicted_prompt_lookup = f.readline()
assert predicted_greedy == predicted_prompt_lookup
print('Passes')
"
echo "Prompt lookup" passed
@Wovchena (Collaborator)

Suggested change
with open('predictions_greedy.txt', 'r') as f:
predicted_greedy = f.readline()
with open('predictions_prompt_lookup.txt', 'r') as f:
predicted_prompt_lookup = f.readline()
assert predicted_greedy == predicted_prompt_lookup
print('Passes')
"
echo "Prompt lookup" passed
diff predictions_greedy.txt predictions_prompt_lookup.txt

@Aniruddha521 (Author)

I tried the same approach earlier, but it ended up giving an error, because diff predictions_greedy.txt predictions_prompt_lookup.txt compares the whole contents of the files, which are not the same, as you can see:

-------------------------------Prompt lookup Generated-----------------------------------------
Yes, I can add 2 and 3
B: The sum of 2 and 3 is 5

Answer: B

Explanation: The function `add` takes two arguments `a` and `b` and returns their sum. When we call the
-------------------------------Greedy Generated------------------------------------------------
Yes, I can add 2 and 3
B: The sum of 2 and 3 is 5

Answer: B

Explanation: The function `add` takes two arguments `a` and `b` and returns their sum. When we call the function with arguments 2 and 3, it returns the sum of these two numbers, which is 5.
-----------------------------------------------------------------------------------------------

If I use the diff predictions_greedy.txt predictions_prompt_lookup.txt command, the error looks like:

6c6
< Explanation: The function `add` takes two arguments `a` and `b` and returns their sum. When we call the
---
> Explanation: The function `add` takes two arguments `a` and `b` and returns their sum. When we call the function with arguments 2 and 3, it returns the sum of these two numbers, which is 5.

This is expected, since the two files are not the same. The assertion passes, however, because predicted_prompt_lookup is "Yes, I can add 2 and 3" (the first line of the text generated with prompt lookup decoding) and predicted_greedy is also "Yes, I can add 2 and 3" (the first line of the text generated with greedy decoding).
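To make the difference concrete, here is a minimal sketch (same file names as in the step above; this is an illustration, not new test code from this PR):

# readline() returns only the first line, so this comparison sees
# 'Yes, I can add 2 and 3' on both sides and passes.
with open('predictions_greedy.txt') as f:
    first_greedy = f.readline()
with open('predictions_prompt_lookup.txt') as f:
    first_prompt_lookup = f.readline()
assert first_greedy == first_prompt_lookup  # passes

# diff, by contrast, compares the full contents, which diverge after
# the point where the prompt lookup output was truncated.
with open('predictions_greedy.txt') as f:
    full_greedy = f.read()
with open('predictions_prompt_lookup.txt') as f:
    full_prompt_lookup = f.read()
assert full_greedy == full_prompt_lookup  # fails for the outputs above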

Any suggestions from your side?

@Wovchena (Collaborator)

You have to come up with an explanation for why the generated text is different and probably fix it.
It should be something about the EOS token, because the beginning of the text matches.
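One hedged way to start checking the EOS hypothesis is to inspect the stop-token ids on the Hugging Face side (this snippet uses transformers directly and is only a suggestion, not part of this PR):

from transformers import AutoTokenizer

# ChatGLM3 ships custom tokenizer code, hence trust_remote_code=True.
tok = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)

# If the converted OpenVINO model and the samples disagree on any of
# these ids, one pipeline can stop generating earlier than the other.
print("eos_token_id:", tok.eos_token_id)
print("chat stop tokens:", tok.convert_tokens_to_ids(["<|user|>", "<|observation|>"]))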

@Wovchena (Collaborator)

@iefode said that she encountered a similar issue, so maybe it will be fixed.

@@ -25,6 +25,7 @@ def get_models_list():
"microsoft/phi-1_5",
"microsoft/phi-2",
"THUDM/chatglm2-6b",
"THUDM/chatglm3-6b", # no beam_search
@Wovchena (Collaborator) commented Dec 3, 2024

Does every Python test pass with THUDM/chatglm3-6b? If not, please mark the failing tests to be skipped. Skips must happen only for that particular model.
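A per-model skip could look roughly like this (the test name and model list are illustrative, not the actual structure of tests/python_tests):

import pytest

# Illustrative stand-in for the real parametrized model list.
MODELS = ["TinyLlama/TinyLlama-1.1B-Chat-v1.0", "THUDM/chatglm3-6b"]

@pytest.mark.parametrize("model_id", MODELS)
def test_beam_search(model_id):
    if model_id == "THUDM/chatglm3-6b":
        # Skip only this model; all others still run the check.
        pytest.skip("beam_search is not supported for THUDM/chatglm3-6b")
    # ... actual beam search assertions would go here ...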

@Aniruddha521 (Author)

For the Python tests (including beam search), the output shows Killed, maybe due to resource constraints.

[screenshots of the Killed test output]

@Wovchena (Collaborator)

I was referring to the Python tests, not the Python samples.

@github-actions github-actions bot removed the category: tokenizers Tokenizer class or submodule update label Dec 6, 2024
Labels: category: GHA CI based on Github actions, no-match-files
Development

Successfully merging this pull request may close these issues.

[Good First Issue]: Verify chatglm3-6b with GenAI text_generation
3 participants