Request to use Phi-3.5-MoE-instruct #9168

KarlHeinzMali · 2024-08-25T14:55:32Z

Prerequisites

I am running the latest code. Mention the version if possible as well.
I carefully followed the README.md.
I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

I would like to use Phi-3.5-MoE-instruct, but it does not seem to be supported:

python convert_hf_to_gguf.py ~/.cache/huggingface/hub/models--microsoft--Phi-3.5-MoE-instruct/snapshots/482a9ba0eb0e1fa1671e3560e009d7cec2e5147c --outfile ../Phi-3.5-bf16.GGUF --outtype bf16
INFO:hf-to-gguf:Loading model: 482a9ba0eb0e1fa1671e3560e009d7cec2e5147c
ERROR:hf-to-gguf:Model PhiMoEForCausalLM is not supported
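
Editor's note: the name in that error appears to come from the `architectures` field of the model's `config.json`. A minimal sketch for checking that field before attempting a conversion follows; the helper name `read_architectures` is hypothetical and is not part of llama.cpp.

```python
import json
from pathlib import Path


def read_architectures(model_dir):
    """Return the `architectures` list from a Hugging Face model directory.

    Hypothetical helper for illustration only. It assumes the directory
    holds a standard `config.json`.
    """
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    # A missing key yields an empty list instead of raising.
    return config.get("architectures", [])
```

Calling `read_architectures` on the snapshot directory from the command above would be expected to return a list containing `PhiMoEForCausalLM`, the architecture named in the error.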

Motivation

Phi-3.5-MoE-instruct is a brand new advanced model.

Possible Implementation

No response

linkage001 · 2024-08-25T16:17:07Z

We already have #9119 open with the same topic. If you are on metal, MLX just merged support for it on mlx_llm.

gardner · 2024-08-26T02:03:16Z

MLX just merged support for it on mlx_llm.

It looks like they just merged support for 3.5. Does that include MoE?
riccardomusmeci/mlx-llm@1b6fe71

@linkage001

linkage001 · 2024-08-26T09:06:04Z

@gardner yes, it includes MoE. It's running at 35 tok/s on my M1 MacBook Pro. ml-explore/mlx-examples#946

KarlHeinzMali · 2024-08-26T11:10:36Z

I am using Ubuntu 23.10 with 2x EPYC 9654 and 12-channel DDR5.

ayttop · 2024-08-27T23:21:30Z

Thank you, but where is the GGUF file?

ayttop · 2024-08-27T23:26:02Z

PGTBoos · 2024-08-28T12:17:00Z

Thanks for the site, but it fails to convert.

ayttop · 2024-08-28T21:06:14Z

https://github.com/foldl/chatllm.cpp

| Supported Models | Download Quantized Models |

What's New:

2024-08-28: Phi-3.5 Mini & MoE

Inference of a bunch of models from less than 1B to more than 300B, for real-time chatting with RAG on your computer (CPU), pure C++ implementation based on @ggerganov's ggml.

ayttop · 2024-08-28T23:17:08Z

It does not run on Colab.

ThiloteE · 2024-09-10T22:08:16Z

There is a PR in transformers. Maybe this is a requirement for llama.cpp to support the conversion? huggingface/transformers#33363.

github-actions · 2024-10-25T01:28:12Z

This issue was closed because it has been inactive for 14 days since being marked as stale.

nonetrix · 2024-10-25T01:57:31Z

plox reopen mr bot

ThiloteE · 2024-10-25T07:18:13Z

The PR in the transformers repo has been merged and is featured in release v4.46.0.
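
Editor's note: a small sketch for checking whether an installed transformers version is at least the v4.46.0 release mentioned above. The function names are hypothetical, and pre-release suffixes such as `rc1` or `.dev0` are simply ignored, which is an assumption made for brevity.

```python
import re


def version_tuple(version):
    """Turn a string such as '4.46.0' into (4, 46, 0).

    Only the first three dot-separated components are used; any
    non-numeric suffix on a component is dropped.
    """
    numbers = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    # Pad short versions such as '4.46' to three components.
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def has_phimoe_release(installed, minimum="4.46.0"):
    """Return True if `installed` is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)
```

The installed version string can be obtained with `importlib.metadata.version("transformers")` and passed to `has_phimoe_release`.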

ThiloteE · 2024-10-25T07:30:17Z

Here is the documentation for how to add a new model to llama.cpp: https://github.com/ggerganov/llama.cpp/blob/master/docs/development/HOWTO-add-model.md
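
Editor's note: part of adding a model is teaching the conversion script about the new architecture name. The toy registry below illustrates the general idea of mapping architecture names to converter classes, and why an unregistered name produces a "not supported" error. Every name in it is a stand-in for illustration; this is not llama.cpp's actual code.

```python
# Toy registry: maps Hugging Face architecture names to converter classes.
_CONVERTERS = {}


def register(*architectures):
    """Class decorator that records a converter under one or more names."""
    def decorator(cls):
        for name in architectures:
            _CONVERTERS[name] = cls
        return cls
    return decorator


@register("Phi3ForCausalLM")
class Phi3Converter:
    """Placeholder converter; a real one would map tensors and metadata."""


def lookup(architecture):
    """Return the converter class for `architecture`, or fail loudly."""
    try:
        return _CONVERTERS[architecture]
    except KeyError:
        raise NotImplementedError(
            f"Model {architecture} is not supported"
        ) from None
```

In this sketch, `lookup("PhiMoEForCausalLM")` raises until some class is registered under that name, which mirrors the error reported at the top of the thread.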

nonetrix · 2024-10-25T07:36:45Z

I wish I knew enough about C++. Maybe I'll try it for lols, not that I'll get far 😅

I don't know anything about AI/ML either outside of basics lol

KarlHeinzMali added the enhancement (New feature or request) label Aug 25, 2024

github-actions bot added the stale label Oct 11, 2024

github-actions bot closed this as completed Oct 25, 2024

phymbert mentioned this issue Dec 28, 2024

model: Add support for PhiMoE arch #11003

Open
