
Request to use Phi-3.5-MoE-instruct #9168

Closed
4 tasks done
KarlHeinzMali opened this issue Aug 25, 2024 · 15 comments · May be fixed by #11003
Labels
enhancement New feature or request stale

Comments

@KarlHeinzMali

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

I would like to use Phi-3.5-MoE-instruct, but it seems it is not supported:

python convert_hf_to_gguf.py ~/.cache/huggingface/hub/models--microsoft--Phi-3.5-MoE-instruct/snapshots/482a9ba0eb0e1fa1671e3560e009d7cec2e5147c --outfile ../Phi-3.5-bf16.GGUF --outtype bf16
INFO:hf-to-gguf:Loading model: 482a9ba0eb0e1fa1671e3560e009d7cec2e5147c
ERROR:hf-to-gguf:Model PhiMoEForCausalLM is not supported
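For context on the error: the converter dispatches on the `architectures` field of the model's config.json, and an architecture with no registered converter class is rejected. A minimal sketch of that dispatch (hypothetical names, not llama.cpp's actual code):

```python
# Sketch of how an HF-to-GGUF converter might dispatch on the
# "architectures" field of config.json. SUPPORTED is an illustrative
# subset; an architecture missing from it triggers an error like the
# one above.
import json

SUPPORTED = {"LlamaForCausalLM", "Phi3ForCausalLM"}  # no PhiMoEForCausalLM

def check_supported(config_path: str) -> str:
    """Return the architecture name, or raise if it is unsupported."""
    with open(config_path) as f:
        arch = json.load(f)["architectures"][0]
    if arch not in SUPPORTED:
        raise NotImplementedError(f"Model {arch} is not supported")
    return arch
```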

Motivation

Phi-3.5-MoE-instruct is a brand-new, advanced model.

Possible Implementation

No response

@KarlHeinzMali KarlHeinzMali added the enhancement New feature or request label Aug 25, 2024
@linkage001

We already have #9119 open on the same topic. If you are on Metal, mlx-llm just merged support for it.

@gardner

gardner commented Aug 26, 2024

MLX just merged support for it on mlx_llm.

It looks like they just merged support for 3.5. Does that include MoE?
riccardomusmeci/mlx-llm@1b6fe71

@linkage001


@gardner yes, it includes MoE. It's running at 35 tk/s on my M1 MacBook Pro. ml-explore/mlx-examples#946

@KarlHeinzMali
Author

I am using Ubuntu 23.10 with 2xEPYC 9654 12 channel DDR5.

@ayttop

ayttop commented Aug 27, 2024

Thank you, but where is the GGUF file?

@ayttop

ayttop commented Aug 27, 2024

[Screenshot attached: 2024-08-27 162503]

@PGTBoos

PGTBoos commented Aug 28, 2024

Thanks for the site, but it fails to convert.

@ayttop

ayttop commented Aug 28, 2024

https://github.com/foldl/chatllm.cpp

What's New:

2024-08-28: Phi-3.5 Mini & MoE

Inference of a bunch of models from less than 1B to more than 300B, for real-time chatting with RAG on your computer (CPU), pure C++ implementation based on @ggerganov's ggml.

@ayttop

ayttop commented Aug 28, 2024

It does not run on Colab.

@ThiloteE

There is a PR in transformers; it may be a prerequisite for llama.cpp to support the conversion: huggingface/transformers#33363.

@github-actions github-actions bot added the stale label Oct 11, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

@nonetrix

nonetrix commented Oct 25, 2024

plox reopen mr bot

@ThiloteE

The PR in the transformers repo has been merged and is featured in release v4.46.0.
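(As a trivial sketch of that version gate: `supports_phimoe` below is an illustrative helper, not part of transformers, and the naive tuple parse ignores pre-release suffixes.)

```python
# Illustrative helper (not part of transformers): PhiMoE support landed
# in transformers v4.46.0, so gate on that version. Assumes a plain
# "X.Y.Z" version string with no dev/rc suffix.
def supports_phimoe(tf_version: str) -> bool:
    parse = lambda v: tuple(int(x) for x in v.split("."))
    return parse(tf_version) >= (4, 46, 0)
```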

@ThiloteE

Here is the documentation for how to add a new model to llama.cpp: https://github.com/ggerganov/llama.cpp/blob/master/docs/development/HOWTO-add-model.md

@nonetrix

nonetrix commented Oct 25, 2024

I wish I knew enough about C++; maybe I'll try it for the lols, not that I'll get far 😅

I don't know anything about AI/ML either, outside of the basics lol

7 participants