OpenVINO Version
2024.5

Operating System
Windows 11

Device used for inference
GPU

Framework
None

Model used
BAAI/bge-m3

Issue description
OpenVINO Model Server does not override the default max_context_length=512 with the value specified in the model.xml file.
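As a cross-check of the 8194 figure reported below, the exported model's configuration can be inspected directly. A minimal sketch, assuming the IR export also produced a config.json next to openvino_model.xml and that the output directory is ./bge-m3-ov (both the file's presence and the path are assumptions):

```python
import json
from pathlib import Path

# Placeholder path to the directory produced by the OpenVINO IR export.
model_dir = Path("./bge-m3-ov")

# config.json is usually written alongside openvino_model.xml during export;
# for BAAI/bge-m3 it is expected to report max_position_embeddings = 8194.
config = json.loads((model_dir / "config.json").read_text())
print(config["max_position_embeddings"])
```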
Step-by-step reproduction
Convert the embedding model to the OpenVINO IR format using the OpenVINO Toolkit and serve it with the openvino/model_server:latest-gpu Docker image.
When calling the http://localhost:8001/v3/embeddings API, the server throws the following error, even though the model has a max_position_embeddings value of 8194:
calculator_graph.cc:853] INVALID_ARGUMENT: CalculatorGraph::Run() failed in Run:
Calculator::Process() for node "EmbeddingsCalculator" failed: Input length 617 longer than allowed 512
Relevant log output
calculator_graph.cc:853] INVALID_ARGUMENT: CalculatorGraph::Run() failed in Run:
Calculator::Process() for node "EmbeddingsCalculator" failed: Input length 617 longer than allowed 512
Issue submission checklist
I'm reporting an issue. It's not a question.
I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
There is reproducer code and related data files such as images, videos, models, etc.
fedecompa changed the title from "[Bug]: Openvino Server does not ovverride the default max_context_length=512 with the one contained in the model.xml file" to "[Bug]: OpenVINO Server does not override the default max_context_length=512 with the value specified in the model.xml file" on Dec 17, 2024.
@fedecompa The Windows version of the server was not correctly retrieving the context-length information from the model. This is now fixed on the main branch. Note that the generative endpoints on the Windows server are still under development and testing; a production-ready Windows version of the server will be published in an upcoming release.