[BUG] Summarize fails even when a model response is generated with the error "HTTP request failed: POST predict: Post "http://127.0.0.1:36333/completion": EOF" #2131
Comments
Adding my changes.
So, the `SummarizeService` is retrieved from the request state using `SummarizeService = request.state.injector.get(SummarizeService)`, and the streaming response uses `to_openai_stream` to convert the response to an SSE stream. The issue is somewhere here; a rough sketch of that wiring is below.
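For reference, a minimal sketch of what I mean (the import path for `to_openai_stream` and the request-body shape are my assumptions; only `SummarizeService`, the injector lookup, and the `to_openai_stream` name are taken from the actual code):

```python
# Sketch only: the module path for SummarizeService is inferred from the log
# output below; the body shape and router details are illustrative.
from fastapi import APIRouter, Request
from fastapi.responses import StreamingResponse

from private_gpt.server.recipes.summarize.summarize_service import SummarizeService

summarize_router = APIRouter()

@summarize_router.post("/summarize")
def summarize(request: Request, body: dict) -> StreamingResponse:
    # The service is resolved from the per-request injector, as described above.
    service: SummarizeService = request.state.injector.get(SummarizeService)
    # stream_summarize yields chunks; to_openai_stream (name taken from the
    # code above, import omitted since I'm unsure of its module) wraps them
    # as an SSE event stream for the client.
    chunks = service.stream_summarize(text=body.get("text", ""))
    return StreamingResponse(
        to_openai_stream(chunks), media_type="text/event-stream"
    )
```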
I also tested increasing the timeout value here, which had no effect (sketch below).
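For context, the kind of timeout change I mean is roughly the following; this is a sketch assuming the llama-index Ollama client that the ollama profile uses, and the exact location and value in the real code base may differ:

```python
# Sketch of the timeout increase mentioned above; assumes the llama-index
# Ollama client. Raising request_timeout did not change the EOF failure.
from llama_index.llms.ollama import Ollama

llm = Ollama(
    model="llama3.1",                   # illustrative model name
    base_url="http://localhost:11434",  # default Ollama endpoint
    request_timeout=300.0,              # raised well above the default
)
```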
PART OF THE MODEL RESPONSE: ... 11:37:04.888 [INFO ] httpx - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 500 Internal Server Error"
Can you check the Ollama server logs? The problem is Ollama-related, since it's throwing a 500. Something with the context window, computer resources, etc.
Pre-check
Description
While using summarize I keep getting the error below. I had to fix summarize_service.py and ui.py to catch it cleanly.
Here is the returned error:
10:30:57.046 [ERROR ] private_gpt.server.recipes.summarize.summarize_service - HTTP request failed: POST predict: Post "http://127.0.0.1:36333/completion": EOF
It looked like a backend issue, but we can clearly see that the response is created correctly by the Ollama backend:
Given the information from multiple sources and not prior knowledge, answer the query.
Query: Provide a comprehensive summary of the provided context information. The summary should cover all the key points and main ideas presented in the original text, while also condensing the information into a concise and easy-to-understand format. Please ensure that the summary includes relevant details and examples that support the main ideas, while avoiding any unnecessary information or repetition.
Answer:
**Response:**
assistant: The provided context information outlines various aspects of user management, software usage tracking, and computer inventory in an IT system. Here's a comprehensive summary:
Now, I had to fix a lot in the code. I fixed all of these:
Summary of the Main Problem
The main problem involved handling asynchronous operations correctly in the summarization service and the UI. Specifically, the issues were:
- Added `nest_asyncio` to allow nested event loops.
- The `_summarize` method needed to be converted to an async generator to handle streaming responses correctly.
- For that, the `stream_summarize` and `summarize` methods needed to use the async generator correctly and to handle `asyncio.CancelledError`, `ResponseError`, and `StopAsyncIteration`.
- The `_chat` method in the UI needed to handle async streaming responses correctly, making sure that `stream_summarize` was consumed in an `async for` loop in the UI.

A rough sketch of these fixes follows below. As I see it, the error here comes from this async handling, since I have ruled out server-side and networking issues, and we can clearly see that a response is generated by the model.
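Here is the shape of those fixes as a self-contained sketch (names not in the list above, such as `_llm_stream`, are illustrative stand-ins, and the real method bodies are more involved):

```python
import asyncio
from collections.abc import AsyncIterator

import nest_asyncio

try:
    # ResponseError is raised by the Ollama client; import path assumed.
    from ollama import ResponseError
except ImportError:
    class ResponseError(Exception):
        """Stand-in so the sketch runs without the ollama package."""

nest_asyncio.apply()  # fix 1: allow nested event loops

class SummarizeService:
    async def _llm_stream(self, text: str) -> AsyncIterator[str]:
        # Illustrative stand-in for the real streaming model call.
        for word in ("a", "short", "summary"):
            yield word

    async def _summarize(self, text: str) -> AsyncIterator[str]:
        # Fix 2: _summarize is an async generator, so chunks are streamed
        # as they arrive instead of being buffered into one response.
        try:
            async for chunk in self._llm_stream(text):
                yield chunk
        except (asyncio.CancelledError, ResponseError, StopAsyncIteration):
            # Fix 3: end the stream cleanly instead of letting a cancelled
            # request or a backend error bubble up as an opaque failure.
            return

    async def stream_summarize(self, text: str) -> AsyncIterator[str]:
        # Fix 4: consume the async generator with `async for`, exactly as
        # the UI's _chat method must do on its side.
        async for chunk in self._summarize(text):
            yield chunk

async def main() -> None:
    service = SummarizeService()
    async for chunk in service.stream_summarize("long input text"):
        print(chunk, end=" ")

asyncio.run(main())
```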
Steps to Reproduce
Expected Behavior
No errors and correct summarization
Actual Behavior
Summarize fails even when a model response is generated, with the error "HTTP request failed: POST predict: Post "http://127.0.0.1:36333/completion": EOF"
Environment
CUDA 12, Ubuntu, Ollama profile
Additional Information
No response
Version
No response
Setup Checklist
NVIDIA GPU Setup Checklist
- NVIDIA drivers are installed (run `nvidia-smi` to verify).
- Docker can access the GPU (run `sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi` to verify).