
Can't use stream function calling with a vllm hosted qwen2.5-7B model #12122

Open · 5 tasks done

ddling opened this issue Dec 26, 2024 · 1 comment

Labels: 🐞 bug Something isn't working

Comments

ddling commented Dec 26, 2024

Self Checks

  • This is only for bug reports; if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please submit issues in English, or they will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

dify cloud

Cloud or Self Hosted

Cloud

Steps to reproduce

I deployed a qwen2.5-7B model with vLLM 0.6.0. Calling the API directly works and returns the expected result; the request parameters are as follows:

{
    "model": "Qwen2.5-7B",
    "messages": [
        {
            "role": "user",
            "content": "What is weather in Beijing?"
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "City name or geographic coordinates"
                        },
                        "unit": {
                            "type": "string",
                            "enum": [
                                "celsius",
                                "fahrenheit"
                            ],
                            "description": "Temperature unit"
                        }
                    },
                    "required": [
                        "location"
                    ]
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "get_local_time",
                "description": "Get the current local time for a specific location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "City name or geographic coordinates"
                        },
                        "format": {
                            "type": "string",
                            "enum": [
                                "12h",
                                "24h"
                            ],
                            "description": "Time format preference",
                            "default": "24h"
                        }
                    },
                    "required": [
                        "location"
                    ]
                }
            }
        }
    ],
    "tool_choice": "auto",
    "stream": true
}
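
For reference, this request can be replayed outside Dify with a short script. Below is a minimal sketch, assuming a local vLLM deployment at http://localhost:8000 (the address is an assumption) and the Python requests library:

import requests

payload = {
    "model": "Qwen2.5-7B",
    "messages": [{"role": "user", "content": "What is weather in Beijing?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {"type": "string", "description": "City name or geographic coordinates"},
                        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "Temperature unit"},
                    },
                    "required": ["location"],
                },
            },
        },
        # get_local_time is defined analogously and omitted for brevity
    ],
    "tool_choice": "auto",
    "stream": True,
}

# vLLM's OpenAI-compatible endpoint streams Server-Sent Events:
# one "data: {...}" line per chunk, terminated by "data: [DONE]".
with requests.post(
    "http://localhost:8000/v1/chat/completions",  # assumed vLLM address
    json=payload,
    stream=True,
) as resp:
    for raw in resp.iter_lines():
        if raw:
            print(raw.decode("utf-8"))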

The streamed response is:

data: {"id":"chat-ea479936ac72435a9a7969c32957a634","object":"chat.completion.chunk","created":1735195666,"model":"Qwen2.5-7B","choices":[{"index":0,"delta":{"role":"assistant"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chat-ea479936ac72435a9a7969c32957a634","object":"chat.completion.chunk","created":1735195666,"model":"Qwen2.5-7B","choices":[{"index":0,"delta":{"tool_calls":[{"id":"chatcmpl-tool-9724a19a62444aa2972b0afe379825fa","type":"function","index":0}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chat-ea479936ac72435a9a7969c32957a634","object":"chat.completion.chunk","created":1735195666,"model":"Qwen2.5-7B","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"name":"get_weather"}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chat-ea479936ac72435a9a7969c32957a634","object":"chat.completion.chunk","created":1735195666,"model":"Qwen2.5-7B","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\"location\": \""}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chat-ea479936ac72435a9a7969c32957a634","object":"chat.completion.chunk","created":1735195666,"model":"Qwen2.5-7B","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"Be"}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chat-ea479936ac72435a9a7969c32957a634","object":"chat.completion.chunk","created":1735195666,"model":"Qwen2.5-7B","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"ijing"}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chat-ea479936ac72435a9a7969c32957a634","object":"chat.completion.chunk","created":1735195666,"model":"Qwen2.5-7B","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":""}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chat-ea479936ac72435a9a7969c32957a634","object":"chat.completion.chunk","created":1735195666,"model":"Qwen2.5-7B","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":""}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chat-ea479936ac72435a9a7969c32957a634","object":"chat.completion.chunk","created":1735195666,"model":"Qwen2.5-7B","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":""}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chat-ea479936ac72435a9a7969c32957a634","object":"chat.completion.chunk","created":1735195666,"model":"Qwen2.5-7B","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":""}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chat-ea479936ac72435a9a7969c32957a634","object":"chat.completion.chunk","created":1735195666,"model":"Qwen2.5-7B","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"\", \"unit\": \""}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chat-ea479936ac72435a9a7969c32957a634","object":"chat.completion.chunk","created":1735195666,"model":"Qwen2.5-7B","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"c"}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chat-ea479936ac72435a9a7969c32957a634","object":"chat.completion.chunk","created":1735195666,"model":"Qwen2.5-7B","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"elsius"}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chat-ea479936ac72435a9a7969c32957a634","object":"chat.completion.chunk","created":1735195666,"model":"Qwen2.5-7B","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":""}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chat-ea479936ac72435a9a7969c32957a634","object":"chat.completion.chunk","created":1735195666,"model":"Qwen2.5-7B","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"\"}"}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chat-ea479936ac72435a9a7969c32957a634","object":"chat.completion.chunk","created":1735195666,"model":"Qwen2.5-7B","choices":[{"index":0,"delta":{"content":""},"logprobs":null,"finish_reason":"tool_calls","stop_reason":null}]}

data: [DONE]
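
Note how the fields of a single tool call are spread across chunks: the first tool_calls delta carries only the id and type, a later one carries the function name, and the rest carry fragments of the JSON arguments (some of them empty strings). A client therefore has to accumulate deltas by index before the call is usable. A minimal aggregation sketch in Python (my own illustration, not Dify's code), assuming chunks holds the parsed JSON body of each data: line above, excluding [DONE]:

def aggregate_tool_calls(chunks):
    # tool_call index -> {"id", "name", "arguments"}
    calls = {}
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            for tc in choice.get("delta", {}).get("tool_calls", []):
                slot = calls.setdefault(
                    tc["index"], {"id": None, "name": None, "arguments": ""}
                )
                if tc.get("id"):
                    slot["id"] = tc["id"]
                fn = tc.get("function") or {}
                if fn.get("name"):
                    slot["name"] = fn["name"]
                # argument fragments concatenate into one JSON string
                slot["arguments"] += fn.get("arguments") or ""
    return calls

# For the stream above this yields a single complete call:
# {0: {"id": "chatcmpl-tool-9724a19a62444aa2972b0afe379825fa",
#      "name": "get_weather",
#      "arguments": "{\"location\": \"Beijing\", \"unit\": \"celsius\"}"}}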

But when I use this API in Dify cloud, I get an error like this:

[screenshot: "list index out of range" error in the Dify UI]

✔️ Expected Behavior

[screenshot: expected result]

Streaming function calling should behave like the non-streaming function calling API.

❌ Actual Behavior

[screenshot: "list index out of range" error]


dosubot bot commented Dec 26, 2024

@ddling Hello! I'm here to help you with any bugs, questions, or contributions. Let's get to the bottom of this together!

The "list index out of range" error you're encountering when using streaming function calling with a vLLM-hosted Qwen2.5-7B model in Dify cloud might be related to how the streaming responses are being handled. In Dify, the _invoke method manages both streaming and synchronous responses, and it uses a generator function to yield data chunks for streaming responses [1][2][3].

The error could be due to an unexpected data structure or a mismatch in the expected response format when streaming. Here are a few things you might consider checking or trying:

  1. Response Handling: Ensure that the client-side code handling the streaming response is correctly processing the chunks of data. Each chunk should be parsed and handled incrementally.

  2. Model Configuration: Verify that the model configuration, especially the stream parameter, is correctly set up in your Dify cloud environment. The stream parameter should be set to true to enable streaming.

  3. Error Handling: Review the error handling mechanism in your implementation. Dify's API converts exceptions into structured error responses, and a "list index out of range" error might indicate an issue with how the response data is accessed or processed [4]; the sketch below shows the kind of guard that avoids this.

  4. Version Compatibility: Ensure that the versions of vLLM and the Qwen2.5-7B model you are using are compatible with the Dify cloud setup. There might be specific requirements or limitations with certain versions.

If these steps do not resolve the issue, further investigation into the specific implementation details and logs might be necessary to identify the root cause.
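
On point 3, a "list index out of range" in this context typically comes from indexing choices[0] or delta.tool_calls[0] on a chunk where the list is empty or missing. A defensive parsing sketch (an illustration of the guard, not Dify's actual code):

import json

def parse_sse_chunk(raw_line):
    """Return the delta dict of one SSE chunk, or None for unusable lines."""
    if not raw_line.startswith("data: "):
        return None
    body = raw_line[len("data: "):].strip()
    if body == "[DONE]":
        return None
    chunk = json.loads(body)
    choices = chunk.get("choices") or []
    if not choices:  # guard: some providers emit metadata-only chunks
        return None
    return choices[0].get("delta") or {}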

To continue talking to Dosu, mention @dosu.



dosubot added the 🐞 bug (Something isn't working) label on Dec 26, 2024