
Deployment of local MultiLoRA model using TGI #2564

ashwincv0112 opened this issue Dec 27, 2024 · 0 comments

Hi Team,

I was trying to deploy a multi-LoRA adapter model with Starcoder2-3B as the base.

I was following this blog:
https://huggingface.co/blog/multi-lora-serving

Please correct me if I am wrong, but it appears that the Starcoder2 model is not supported for multi-LoRA deployment using TGI. We get the following error when deploying:

AttributeError: 'TensorParallelColumnLinear' object has no attribute 'base_layer' rank=0

Also, could you suggest how to deploy a model and adapters saved in a local directory using TGI?
Every time I run the docker command below, it downloads the files from the Hugging Face Hub instead of using the local copies.

docker run --gpus all --shm-size 1g -p 8080:80 -v $PWD:/data \
    ghcr.io/huggingface/text-generation-inference:3.0.1 \
    --model-id bigcode/starcoder2-3b \
    --lora-adapters=<local_adapter_path>
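
For context, this is the kind of local-path invocation I would expect to work, with the base model and adapter directories placed under the mounted /data volume. The directory names here are hypothetical, and I am not sure whether --lora-adapters accepts local paths (or in what syntax), which is part of my question:

```shell
# Hypothetical layout: base model weights and the LoRA adapter both live
# inside $PWD on the host, which is mounted into the container at /data,
# so TGI could resolve them as local paths instead of Hub repo IDs.
docker run --gpus all --shm-size 1g -p 8080:80 -v $PWD:/data \
    ghcr.io/huggingface/text-generation-inference:3.0.1 \
    --model-id /data/starcoder2-3b \
    --lora-adapters=/data/adapters/my-adapter
```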

Please let me know if any additional information is required.

Thanks,
Ashwin.
