System Info / 系統信息

Ubuntu 22.04.5 LTS

Running Xinference with Docker? / 是否使用 Docker 运行 Xinference?

Yes, via Docker (see the command below).

Version info / 版本信息

xprobe/xinference:v1.0.1
sha256:5934a612a67108569a576ec1546e1d6ad17e510bd5624b95eaebb981400fd12f

The command used to start Xinference / 用以启动 xinference 的命令

docker run -e XINFERENCE_MODEL_SRC=huggingface -e HF_ENDPOINT=https://hf-mirror.com -p 9997:9997 --gpus all xprobe/xinference:v1.0.1 xinference-local --host 0.0.0.0 --port 9997

Reproduction / 复现过程

allocate_devices_with_gpu_idx (xinference/core/worker.py:L466), while checking whether a vLLM model is already deployed on the selected GPU devices, calls the supervisor's get_model function, which causes the launch to fail.

Expected behavior / 期待表现

The model instance starts normally.
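For reference, a minimal reproduction sketch via the Python client. Assumptions: the server started by the docker command above is reachable at localhost:9997, launch_model in v1.0.1 accepts the replica and gpu_idx parameters, and the model name is only illustrative.

```python
# Minimal reproduction sketch (model name and GPU indices are
# illustrative; adjust to your deployment).
from xinference.client import Client

client = Client("http://localhost:9997")

# The combination reported to misbehave: an explicit gpu_idx together
# with replica > 1. Each replica goes through
# allocate_devices_with_gpu_idx on the worker, which round-trips to the
# supervisor's get_model while the launch is still in progress.
client.launch_model(
    model_name="qwen2-instruct",  # illustrative
    model_engine="vllm",
    replica=2,
    gpu_idx=[0, 1],
)
```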
gpu_idx combined with replica may not be well supported yet; if you have a fix, feel free to submit a PR.
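For anyone picking this up, one possible direction, sketched below with hypothetical names (GpuAllocator and everything in it are illustrative, not the actual worker internals): keep a worker-local record of which models occupy which GPUs, so the vLLM check does not need a supervisor round-trip mid-launch.

```python
# Hypothetical fix sketch -- NOT the actual Xinference code. The idea:
# the worker tracks its own GPU placements, so a function like
# allocate_devices_with_gpu_idx can answer "is a vLLM model already on
# this GPU?" locally instead of calling supervisor.get_model mid-launch.
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class GpuAllocator:
    """Worker-local bookkeeping of model placements per GPU (illustrative)."""

    def __init__(self) -> None:
        # gpu index -> {(model_uid, is_vllm), ...}
        self._placements: Dict[int, Set[Tuple[str, bool]]] = defaultdict(set)

    def allocate_with_gpu_idx(
        self, model_uid: str, is_vllm: bool, gpu_idx: List[int]
    ) -> None:
        for idx in gpu_idx:
            occupied_by_vllm = any(v for _, v in self._placements[idx])
            # A vLLM model reserves (nearly) all GPU memory, so it can
            # neither share a device nor move onto an occupied one.
            if occupied_by_vllm or (is_vllm and self._placements[idx]):
                raise RuntimeError(
                    f"GPU {idx} cannot be shared when a vLLM model is involved"
                )
        for idx in gpu_idx:
            self._placements[idx].add((model_uid, is_vllm))

    def release(self, model_uid: str) -> None:
        # Drop every placement belonging to model_uid across all GPUs.
        for placed in self._placements.values():
            placed.difference_update({p for p in placed if p[0] == model_uid})
```

With replica > 1, each replica's allocation would then only consult this local table, so no replica depends on the supervisor having finished registering its siblings.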
This issue is stale because it has been open for 7 days with no activity.