[BUG] OpenAI: chatClient.CompleteChatAsync(..) keeps waiting after the TPM token limit is triggered #47640

Open
qideqian opened this issue Dec 23, 2024 · 8 comments
Labels
Client: This issue points to a problem in the data-plane of the library.
customer-reported: Issues that are reported by GitHub users external to the Azure organization.
needs-team-attention: Workflow: This issue needs attention from Azure service team or SDK team
OpenAI
question: The issue doesn't require a change to the product in order to be resolved. Most issues start as that
Service Attention: Workflow: This issue is responsible by Azure service team.

Comments

@qideqian

Library name and version

Azure.AI.OpenAI 2.1.0

Describe the bug

ClientResult<ChatCompletion> completion = await chatClient.CompleteChatAsync(chatMessages, new ChatCompletionOptions() { }, cancellationTokenSource.Token);

The call keeps waiting: no exception is thrown and no result is returned.

Expected behavior

An exception indicating that the TPM limit was triggered should be thrown, instead of the call waiting with no exception and no result.

Actual behavior

The call waits indefinitely, with no exception and no result.
API response obtained through a Microsoft support case:

{
  "error": {
    "code": "429",
    "message": "Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-02-15-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 49 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit."
  }
}

Reproduction Steps

ClientResult<ChatCompletion> completion = await chatClient.CompleteChatAsync(
    chatMessages,
    new ChatCompletionOptions() { },
    cancellationTokenSource.Token);

Environment

No response

@github-actions github-actions bot added the Client, customer-reported, needs-team-attention, OpenAI, question, and Service Attention labels Dec 23, 2024

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jpalvarezl @ralph-msft @trrwilson.

@ArthurMa1978
Member

Thanks for your feedback, @qideqian. @chunyu3, please look into this issue.

@chunyu3
Member

chunyu3 commented Dec 25, 2024

@qideqian When the token limit is triggered, the service returns a 429 response. The .NET client treats 429 as a retriable response and resends the request, retrying 3 times by default.

Could you check whether the retries are what cause the delay and the apparent hang?
You can skip retries by providing a retry policy in the client options, as follows, and see whether the request then returns as expected. Thanks.

AzureOpenAIClient azureClient = new(
    new Uri("<endpoint>"),
    credential,
    new AzureOpenAIClientOptions()
    {
        RetryPolicy = new ClientRetryPolicy(0) // retry 0 
    });

@qideqian
Author

qideqian commented Dec 25, 2024

@chunyu3 Yes, after adding RetryPolicy = new ClientRetryPolicy(0) the 429 exception can be caught directly. I have also set a 90-second timeout task. If the retry count is not set to 0, the 90-second timeout fires first and the client gets no result. How does the current 3-retry policy work? Its total time should not exceed 90 seconds. Is there a problem with my settings?

@chunyu3
Member

chunyu3 commented Dec 25, 2024

@qideqian You can set the timeout when you create the client. It applies to each individual HTTP request attempt.

AzureOpenAIClient azureClient = new(
    new Uri("<endpoint>"),
    credential,
    new AzureOpenAIClientOptions()
    {
        // 90 seconds, matching the 90s limit discussed above; applies to each HTTP attempt
        NetworkTimeout = TimeSpan.FromSeconds(90)
    });
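
The two options suggested in this thread can be combined with an overall deadline on the caller's side. The following is only a sketch, not a confirmed fix: the environment variable names, the deployment name placeholder, and the "Hello" prompt are assumptions made for illustration, and running it requires the Azure.AI.OpenAI package and a real resource.

```csharp
using System;
using System.ClientModel;
using System.ClientModel.Primitives;
using System.Threading;
using Azure.AI.OpenAI;
using OpenAI.Chat;

// Assumption: endpoint and key come from these (hypothetical) environment variables.
AzureOpenAIClient azureClient = new(
    new Uri(Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!),
    new ApiKeyCredential(Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY")!),
    new AzureOpenAIClientOptions
    {
        RetryPolicy = new ClientRetryPolicy(0),    // no retries: a 429 surfaces immediately
        NetworkTimeout = TimeSpan.FromSeconds(90), // limit for each HTTP attempt
    });

// Assumption: replace the placeholder with your own deployment name.
ChatClient chatClient = azureClient.GetChatClient("<deployment-name>");

// Overall deadline for the whole call, independent of the retry settings.
using CancellationTokenSource cancellationTokenSource = new(TimeSpan.FromSeconds(90));

try
{
    ClientResult<ChatCompletion> completion = await chatClient.CompleteChatAsync(
        new ChatMessage[] { new UserChatMessage("Hello") },
        new ChatCompletionOptions(),
        cancellationTokenSource.Token);
    Console.WriteLine(completion.Value.Content[0].Text);
}
catch (ClientResultException ex) when (ex.Status == 429)
{
    // The service rejected the request because the rate limit was exceeded.
    Console.WriteLine($"Rate limited: {ex.Message}");
}
catch (OperationCanceledException)
{
    // The token was cancelled or the attempt timed out before a response arrived.
    Console.WriteLine("The call was cancelled or timed out.");
}
```

With retries disabled the caller decides whether and when to try again; keeping the default retry count instead means the deadline token is what bounds the total wait.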

@qideqian
Author

@chunyu3 OK, thank you. I will try setting the timeout the way you described. As for the indefinite waiting: when a retry policy is in place, can an exception be thrown after several failed attempts, or is there a default expiration time? Otherwise it is hard to find out what the problem is.

@chunyu3
Member

chunyu3 commented Dec 25, 2024

@qideqian The default retry policy retries a predefined number of times (3 by default), with a delay between retries. The retry policy has no overall expiration time.
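
A rough worst-case estimate shows why a 90-second caller-side timeout can fire before the retries finish. The helper below is hypothetical and not part of any SDK. It assumes, purely for illustration, that each delay is as long as the 49 seconds suggested in the error message above and that each attempt itself takes about one second; the actual delay the client uses may differ.

```csharp
using System;

// Hypothetical helper: longest total time if every attempt fails and the
// client waits delayPerRetry before each of the maxRetries retries.
static TimeSpan WorstCaseWait(int maxRetries, TimeSpan delayPerRetry, TimeSpan perAttempt)
{
    int attempts = maxRetries + 1; // the initial attempt plus the retries
    return perAttempt * attempts + delayPerRetry * maxRetries;
}

TimeSpan estimate = WorstCaseWait(
    maxRetries: 3,
    delayPerRetry: TimeSpan.FromSeconds(49), // assumed from "Please retry after 49 seconds"
    perAttempt: TimeSpan.FromSeconds(1));    // assumed request duration

// 3 * 49 + 4 * 1 = 151 seconds, which is longer than a 90-second timeout.
Console.WriteLine($"Worst case: {estimate.TotalSeconds} seconds");
```

Under these assumptions the call would still be retrying when the 90-second timeout task completes, which matches the behavior reported in this issue.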

@qideqian
Author

@chunyu3 OK, so with the default retry settings left unchanged, the current behavior is that the call just keeps waiting, and it is hard to see what is causing it.


3 participants