Evaluate mistralai/Codestral-22B-v0.1 (FIM and quants) #202

Open
the-crypt-keeper opened this issue May 29, 2024 · 2 comments
Labels
model request Evaluate performance of a new model

Comments

@the-crypt-keeper
Owner

No description provided.

@the-crypt-keeper the-crypt-keeper added the model request Evaluate performance of a new model label May 29, 2024
@the-crypt-keeper
Owner Author

Despite being hosted on HF, this model ships no config.json and doesn't support inference with the transformers library (or, it seems, any other library) — only Mistral's own custom mistral-inference runtime.

@the-crypt-keeper
Owner Author

the-crypt-keeper commented May 29, 2024

Completed the initial instruction eval at FP16; this is an excellent model, at JavaScript especially. It used about 45GB of VRAM for inference during my testing runs, so it should work on 2x24GB setups.
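The ~45GB figure lines up with FP16 weight storage alone. A back-of-the-envelope sketch (the 22B parameter count is taken from the model name; KV cache and activations are extra on top):

```python
# Rough FP16 memory estimate for a 22B-parameter model.
# 2 bytes per FP16 parameter; KV cache and activations add overhead beyond this.
params = 22e9          # parameter count, from the model name
bytes_per_param = 2    # FP16
weights_gb = params * bytes_per_param / 1e9
print(f"{weights_gb:.0f} GB for weights alone")  # 44 GB, before KV cache
```

With ~44GB in weights plus runtime overhead, the observed ~45GB fits, and splitting across 2x24GB (48GB total) leaves only a little headroom.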

This model also supports FIM, so I will keep this issue open for that, as well as for any quants as they pop up.
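For the FIM eval, a hypothetical sketch of assembling a fill-in-the-middle prompt. The `[SUFFIX]`/`[PREFIX]` markers and suffix-first ordering are assumptions based on Mistral's published FIM format — in practice these are special tokenizer tokens, not literal text, and this is not verified against this repo's harness:

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a FIM prompt string.

    Suffix-first ordering and the [SUFFIX]/[PREFIX] markers are assumptions;
    Codestral's real tokenizer emits these as special tokens, not plain text.
    """
    return f"[SUFFIX]{suffix}[PREFIX]{prefix}"

# The model would be asked to generate the missing middle after [PREFIX]...
prompt = build_fim_prompt("def add(a, b):\n    return ", "\n\nprint(add(1, 2))")
```

The harness would feed `prompt` to the runtime and splice the completion between prefix and suffix to score the result.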

The latest interview_cuda supports torchrun and the mistral-inference runtime in an MVP capacity:

torchrun --nproc-per-node 4 ./interview_cuda.py --runtime mistral --model_name ~/models/codestral-22B-v0.1 --params params/greedy-hf.json --input results/prepare_senior_python-javascript_chat-simple.ndjson,results/prepare_junior-v2_python-javascript_chat-simple.ndjson

Adjust 4 to the number of GPUs; --model_name in this case is a local directory path, not an HF path.

@the-crypt-keeper the-crypt-keeper changed the title Evaluate mistralai/Codestral-22B-v0.1 Evaluate mistralai/Codestral-22B-v0.1 (FIM and quants) May 29, 2024