Change post training run.yaml inference config (#710)
## Context
The Colab notebook provides a limited free T4 GPU.

Making the post training template work end-to-end on a Colab T4 notebook is
critical for early adoption of the stack's post training APIs. However, we
found that the existing LlamaModelParallelGenerator
(https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/inline/inference/meta_reference/inference.py#L82)
in the meta-reference inference implementation isn't compatible with a T4
machine.

In this PR, we disable create_distributed_process_group for the
inference API in the post training run.yaml config and set up the
distributed environment variables in the notebook:

<img width="493" alt="Screenshot 2025-01-02 at 3 48 08 PM"
src="https://github.com/user-attachments/assets/dd159f70-4cff-475c-b459-1fc6e2c720ba"
/>

This makes meta-reference inference compatible with the free T4 machine.
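Setting the distributed environment variables in a notebook cell can be sketched roughly as below. This is a minimal single-process setup; the variable names are the standard ones `torch.distributed` reads during initialization, and the specific values here are illustrative assumptions, not copied from the screenshot:

```python
import os

# Minimal single-process distributed environment so that
# torch.distributed initialization can succeed without torchrun.
# All values below are illustrative assumptions for one T4 GPU.
os.environ["MASTER_ADDR"] = "localhost"
os.environ["MASTER_PORT"] = "12355"   # any free port
os.environ["RANK"] = "0"
os.environ["WORLD_SIZE"] = "1"
```

With `create_distributed_process_group: False` in run.yaml, the inference provider runs in-process instead of spawning a model-parallel process group, which is what makes a single free T4 viable.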

## Test
Tested with the WIP post training showcase Colab notebook:
https://colab.research.google.com/drive/1K4Q2wZq232_Bpy2ud4zL9aRxvCWAwyQs?usp=sharing
SLR722 authored Jan 3, 2025
1 parent e1f42eb commit f450a0f
Showing 1 changed file with 1 addition and 0 deletions: llama_stack/templates/experimental-post-training/run.yaml
```diff
@@ -19,6 +19,7 @@ providers:
     config:
       max_seq_len: 4096
       checkpoint_dir: null
+      create_distributed_process_group: False
   eval:
   - provider_id: meta-reference
     provider_type: inline::meta-reference
```
