[Bug]: OpenVINO Server does not override the default max_context_length=512 with the value specified in the model.xml file #2923

Open
3 tasks done
fedecompa opened this issue Dec 17, 2024 · 2 comments

@fedecompa

fedecompa commented Dec 17, 2024

OpenVINO Version

2024.5

Operating System

Windows 11

Device used for inference

GPU

Framework

None

Model used

BAAI/bge-m3

Issue description

OpenVINO Server does not override the default max_context_length=512 with the value contained in the model.xml file.

Step-by-step reproduction

Convert the embedding model to the OpenVINO IR format using the OpenVINO Toolkit and serve it with the openvino/model_server:latest-gpu Docker image.

When calling the http://localhost:8001/v3/embeddings API, the server throws the following error, even though the model has a max_position_embeddings value of 8194:

calculator_graph.cc:853] INVALID_ARGUMENT: CalculatorGraph::Run() failed in Run:
Calculator::Process() for node "EmbeddingsCalculator" failed: Input length 617 longer than allowed 512
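
For anyone trying to reproduce this, here is a minimal sketch of a request to the endpoint above. It assumes the server accepts an OpenAI-style embeddings body (`model` and `input` fields) and that the servable is registered under the name `BAAI/bge-m3`; that name depends on your server configuration and is only a guess here. The helper names are hypothetical.

```python
import json
import urllib.error
import urllib.request

# URL taken from the report; the model name is an assumption and must
# match whatever name the servable has in your server configuration.
OVMS_URL = "http://localhost:8001/v3/embeddings"
MODEL_NAME = "BAAI/bge-m3"


def build_embeddings_request(text, model=MODEL_NAME, url=OVMS_URL):
    """Build a POST request with an OpenAI-style embeddings payload."""
    payload = {"model": model, "input": text}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Repeat a sentence so the input tokenizes to well over 512 tokens.
    long_text = "OpenVINO model server embeddings test. " * 200
    try:
        result = send(build_embeddings_request(long_text))
        print("embedding size:", len(result["data"][0]["embedding"]))
    except urllib.error.HTTPError as err:
        # On an affected server the body should carry the length error.
        print(err.code, err.read().decode("utf-8", errors="replace"))
    except OSError as err:
        print("could not reach the server:", err)
```

On an affected build, any input that tokenizes to more than 512 tokens should trigger the error shown above.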

Relevant log output

calculator_graph.cc:853] INVALID_ARGUMENT: CalculatorGraph::Run() failed in Run: 
Calculator::Process() for node "EmbeddingsCalculator" failed: Input length 617 longer than allowed 512
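
To confirm from a client or a log scraper that a failure is this specific limit and not some other error, the two numbers can be pulled out of the message. This is only a sketch: the pattern is derived from the single log line quoted above, and the function name is hypothetical.

```python
import re

# Pattern based on the one message quoted in this issue; other server
# versions may word the error differently.
LENGTH_ERROR = re.compile(r"Input length (\d+) longer than allowed (\d+)")


def parse_length_error(message):
    """Return (input_length, allowed_length), or None if no match."""
    match = LENGTH_ERROR.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

If the second number comes back as 512 for a model whose configuration declares a longer context, the server is most likely still applying its default.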

Issue submission checklist

  • I'm reporting an issue. It's not a question.
  • I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
  • There is reproducer code and related data files such as images, videos, models, etc.
@fedecompa fedecompa added the bug Something isn't working label Dec 17, 2024
@fedecompa fedecompa changed the title [Bug]: Openvino Server does not ovverride the default max_context_length=512 with the one contained in the model.xml file [Bug]: OpenVINO Server does not override the default max_context_length=512 with the value specified in the model.xml file Dec 17, 2024
@ilya-lavrenov ilya-lavrenov transferred this issue from openvinotoolkit/openvino Dec 17, 2024
@dtrawins dtrawins self-assigned this Dec 18, 2024
@dtrawins
Collaborator

@fedecompa The Windows version of the server was not correctly retrieving the context length from the model. This is now fixed on the main branch. Note that the generative endpoints on the Windows server are still under development and testing. A production-ready Windows version of the server will be published in the coming release.

@fedecompa
Author

@dtrawins Thanks for the reply. So to test it, I need to wait for openvino/model_server:2024.6-gpu, right?
