Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use sequence length axis in tensor trim #723

Conversation

as-suvorov
Copy link
Contributor

@as-suvorov as-suvorov commented Aug 1, 2024

  • Use sequence length axis in trimm_tensor

@as-suvorov as-suvorov added bug Something isn't working enhancement New feature or request labels Aug 1, 2024
Copy link
Contributor

@ilya-lavrenov ilya-lavrenov Aug 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add such models with different sequence length dimension ID to GHA CI validation ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we keep stateful mode after CB merge? If yes, then I think it make sense to add tests.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, CB is already merged and we keep stateful mode.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests can be added separately

@ilya-lavrenov ilya-lavrenov self-requested a review August 1, 2024 17:21
@ilya-lavrenov ilya-lavrenov added this pull request to the merge queue Aug 2, 2024
Comment on lines 37 to 40
const std::map<std::string, size_t> model_type_to_seq_len_axis{
{"chatglm", 0},
{"llama", 2},
};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ilya-lavrenov , @Wovchena , This approach is not reliable, chatglm3 and chatglm2 have seq_len_axis=0 but glm-4-9b-chat has seq_len_axis=2. All of them have same model_type='chatglm'.
I'll try to detect seq_len_axis by comparing kv tensor shape dimensions with generated tokens length.

@as-suvorov as-suvorov removed this pull request from the merge queue due to a manual request Aug 2, 2024
@as-suvorov
Copy link
Contributor Author

Removed from merge queue due to: #723 (comment)

@as-suvorov as-suvorov marked this pull request as draft August 2, 2024 11:55
@as-suvorov as-suvorov marked this pull request as ready for review August 2, 2024 16:01
@as-suvorov as-suvorov changed the title Set sequence length axis based on model type Use sequence length axis in tensor trim Aug 5, 2024
@ilya-lavrenov ilya-lavrenov added this pull request to the merge queue Aug 5, 2024
Merged via the queue into openvinotoolkit:master with commit 4e1e755 Aug 6, 2024
36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants