
Output is very slow when using the built-in chat page or after integrating with Dify #2690

Open
1 of 3 tasks
congge27 opened this issue Dec 19, 2024 · 3 comments

Comments

@congge27

System Info / 系統信息

Ubuntu 24.04
Single RTX 4090 GPU

Running Xinference with Docker? / 是否使用 Docker 运行 Xinference?

  • docker
  • pip install
  • installation from source

Version info / 版本信息

Latest version

The command used to start Xinference / 用以启动 xinference 的命令

Reproduction / 复现过程

I'm not sure whether anyone has compared this with Ollama. In my tests, the same model runs noticeably faster under Ollama, and I don't know whether the gap comes from my deployment. Is there any documentation about performance?
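
For what it's worth, here is a minimal sketch of how the comparison could be made reproducible. It assumes Xinference's OpenAI-compatible endpoint at http://127.0.0.1:9997/v1 and a placeholder model UID `my-model-uid`; both are assumptions, adjust them to your deployment.

```python
# Rough chunks-per-second probe against a streaming chat endpoint.
# Assumptions: Xinference's OpenAI-compatible API on port 9997 and a
# placeholder model UID -- replace both with your actual values.
import time
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:9997/v1", api_key="not-used")

start = time.perf_counter()
chunks = 0
stream = client.chat.completions.create(
    model="my-model-uid",  # placeholder model UID
    messages=[{"role": "user", "content": "Write a short 200-word story."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        chunks += 1
elapsed = time.perf_counter() - start
print(f"{chunks} content chunks in {elapsed:.1f}s (~{chunks / elapsed:.1f} chunks/s)")
```

Pointing the same script at Ollama's OpenAI-compatible endpoint (typically http://localhost:11434/v1) should give a roughly like-for-like number for the same model.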

Expected behavior / 期待表现

Is there any documentation on performance, for example benchmark results?

@XprobeBot XprobeBot added this to the v1.x milestone Dec 19, 2024
@congge27 congge27 changed the title from "Performance comparison with Ollama for the same model" to "Output is very slow when using the built-in chat page or after integrating with Dify" Dec 19, 2024
@qinxuye
Contributor

qinxuye commented Dec 20, 2024

Is the model deployed on the GPU?

@congge27
Author

> Is the model deployed on the GPU?

Yes, it is running on the GPU.
The symptom is that the output stalls. I looked into the API: the streaming endpoint emits output in discrete bursts. For example, it produces five chunks of text at 16:02:29, and the next output does not arrive until 16:02:34, so there is a stall of roughly 5 seconds in between.
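
A small sketch, under the same assumptions as above (Xinference's OpenAI-compatible endpoint, placeholder model UID), that timestamps every streamed chunk and flags gaps longer than one second, which should make the ~5 s stalls visible:

```python
# Print a marker whenever the gap between two streamed chunks exceeds 1 s.
# Assumptions: Xinference's OpenAI-compatible API on port 9997 and a
# placeholder model UID -- replace both with your actual values.
import time
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:9997/v1", api_key="not-used")

stream = client.chat.completions.create(
    model="my-model-uid",  # placeholder model UID
    messages=[{"role": "user", "content": "Count from 1 to 50, one number per line."}],
    stream=True,
)
last = time.perf_counter()
for chunk in stream:
    now = time.perf_counter()
    if now - last > 1.0:  # flag stalls like the ~5 s pauses described above
        print(f"--- stalled for {now - last:.1f}s ---")
    last = now
print("stream finished")
```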


This issue is stale because it has been open for 7 days with no activity.

@github-actions github-actions bot added the stale label Dec 27, 2024