Replies: 2 comments
Making QAT work with other models should be pretty straightforward: you just swap out the model and tokenizer parts of the config to match the config options for those other models. We may offer a single-device recipe for QAT in the future, but for now we haven't been prioritizing it. You can always run a distributed recipe on a single device, though, by setting "--nproc_per_node 1".
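As a rough sketch (the recipe and config names below come from a recent torchtune release and may differ in yours; run "tune ls" to confirm), swapping the model/tokenizer and running on a single GPU could look like:

    # Copy the distributed QAT config locally, then edit its model and
    # tokenizer sections to point at the model you want to train
    tune cp llama3/8B_qat_full my_qat_config.yaml

    # Run the distributed QAT recipe on a single device
    tune run --nproc_per_node 1 qat_distributed --config my_qat_config.yaml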
Just to follow up on this: while we don't have a separate recipe for single-device QAT, we do now provide a recipe for QAT + LoRA. This is similar to how the quantized 1B and 3B Llama models were trained, and you should be able to train using much less memory than the previous QAT full finetune recipe. You can see an example config here.
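For reference, launching it might look like the following (the "qat_lora_finetune_distributed" recipe and "llama3_2/3B_qat_lora" config names are assumptions based on a recent torchtune release, so confirm them against your install first):

    # Confirm the exact recipe and config names available in your install
    tune ls

    # Launch QAT + LoRA on a single GPU (names shown are illustrative)
    tune run --nproc_per_node 1 qat_lora_finetune_distributed --config llama3_2/3B_qat_lora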
I'd like to experiment with QAT. I see "tune ls" shows there is a QAT recipe available for the Llama3 model, but only distributed and only for full fine-tuning. Any chance of adding recipes for Llama 3.1 or 3.2 on a single GPU?
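(For anyone checking what ships with their own install: one quick way to see the available QAT recipes and configs is to filter the listing, e.g.

    # Output varies by torchtune version
    tune ls | grep -i qat

and then pick from whatever names appear there.)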