Despite being hosted on HF, this model ships no config.json and doesn't appear to support inference with the transformers library (or any other library, it seems), only Mistral's own mistral-inference runtime.
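For reference, a minimal sketch of running it through mistral-inference, following that package's documented API (the model directory, tokenizer filename, and prompt here are illustrative; note the `Transformer` import path has moved between package versions, older releases used `mistral_inference.model`):

```python
# Minimal mistral-inference sketch; model path and prompt are illustrative.
from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

model_dir = "./codestral-22B"  # assumption: local snapshot of the HF repo

tokenizer = MistralTokenizer.from_file(f"{model_dir}/tokenizer.model.v3")
model = Transformer.from_folder(model_dir)

# Build an instruct-format prompt and tokenize it.
request = ChatCompletionRequest(
    messages=[UserMessage(content="Write a debounce function in JavaScript.")]
)
tokens = tokenizer.encode_chat_completion(request).tokens

# Greedy decode; eos_id comes from the underlying tokenizer.
out_tokens, _ = generate(
    [tokens], model, max_tokens=512, temperature=0.0,
    eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id,
)
print(tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0]))
```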
Completed the initial instruction eval at FP16; this is an excellent model, especially at JavaScript. It used about 45GB of VRAM for inference during my test runs, so it should work on 2x24GB setups.
This model also supports FIM, so I'll keep this issue open for that, as well as for any quants as they pop up. A rough FIM sketch follows.
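Something like the sketch below should drive FIM through mistral-common's tokenizer, which wraps the prefix/suffix in the model's fill-in-the-middle control tokens. Hedged heavily: the `FIMRequest` import path has moved between mistral-common versions, and the model directory and snippet are illustrative.

```python
# FIM (fill-in-the-middle) sketch; the FIMRequest import path may differ
# across mistral-common versions.
from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.tokens.instruct.request import FIMRequest

model_dir = "./codestral-22B"  # assumption: local snapshot of the HF repo
tokenizer = MistralTokenizer.from_file(f"{model_dir}/tokenizer.model.v3")
model = Transformer.from_folder(model_dir)

prefix = "def fibonacci(n):\n"
suffix = "\n    return result"

# encode_fim wraps prefix/suffix in the model's FIM control tokens.
tokens = tokenizer.encode_fim(FIMRequest(prompt=prefix, suffix=suffix)).tokens

out_tokens, _ = generate(
    [tokens], model, max_tokens=256, temperature=0.0,
    eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id,
)
print(tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0]))
```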
The latest interview_cuda supports torchrun and the mistral-inference runtime in an MVP capacity.
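For anyone curious how the torchrun path fits together, here is a rough sketch, not interview_cuda's actual code: the script name and model directory are hypothetical, and `num_pipeline_ranks` is mistral-inference's knob for splitting the model's layers across GPUs under a torchrun launch.

```python
# run_codestral.py (hypothetical name) -- launch with:
#   torchrun --nproc-per-node 2 run_codestral.py
import os
import torch
import torch.distributed as dist
from mistral_inference.transformer import Transformer
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

# torchrun provides RANK/WORLD_SIZE/LOCAL_RANK in the environment.
torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
if not dist.is_initialized():
    dist.init_process_group(backend="nccl")

model_dir = "./codestral-22B"  # assumption: local snapshot of the HF repo
tokenizer = MistralTokenizer.from_file(f"{model_dir}/tokenizer.model.v3")
# num_pipeline_ranks=2 pipelines the model across the two GPU ranks,
# which is what makes the 2x24GB split workable at FP16.
model = Transformer.from_folder(model_dir, num_pipeline_ranks=2)
```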