
use last path component for unnamed ggufs #1281

Open: wants to merge 2 commits into base: concedo_experimental
Conversation

@kallewoof commented Dec 24, 2024

Usually, a ggml-model-XXX.gguf file will reside in the directory named after the model, e.g. if the user generated the quant themselves and didn't move it. Instead of displaying models as koboldcpp/ggml-model-xxx, this picks the last directory component and tacks the quant info on it, but only if there is at least one file with the .safetensors extension in the same directory.
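The proposed heuristic can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual code; the function name `display_name` and the exact quant-suffix handling are hypothetical.

```python
from pathlib import Path

def display_name(gguf_path: str) -> str:
    """Sketch of the proposed naming heuristic (illustration only).

    If the file has a generic ggml-model-* name AND the same directory
    contains at least one .safetensors file, use the parent directory
    name with the quant suffix tacked on; otherwise keep the file stem.
    """
    p = Path(gguf_path)
    name = p.stem  # e.g. "ggml-model-Q4_K_M"
    if name.startswith("ggml-model") and any(p.parent.glob("*.safetensors")):
        quant = name[len("ggml-model"):].lstrip("-")  # e.g. "Q4_K_M"
        return f"{p.parent.name}-{quant}" if quant else p.parent.name
    return name
```

So a self-quanted `Mistral-7B/ggml-model-Q4_K_M.gguf` sitting next to its source `.safetensors` would display as `Mistral-7B-Q4_K_M`, while a lone gguf in an unrelated directory keeps its filename.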

@LostRuins (Owner) commented
Hmm... I'm concerned about cases where the directory name is not the desired model name.

For example, people might simply download https://huggingface.co/nmerkle/Meta-Llama-3-8B-Instruct-ggml-model-Q4_K_M.gguf/tree/main to C:\Users\Bob\Desktop\ggml-model-Q4_K_M.gguf, and then the model name would be ... "Desktop".

It's also very possible that the model path doesn't contain the name directly. For example C:\Users\Bob\Desktop\Mistral-Airoboros-7B\ggufquant\ggml-model-Q4_K_M.gguf

or simply be equally unhelpful

C:\Users\Bob\Desktop\model\ggml-model-Q4_K_M.gguf
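The failure modes above are easy to reproduce: under the unrestricted proposal, the parent directory name is what would be displayed. A hypothetical illustration using `pathlib` (not koboldcpp code):

```python
from pathlib import PureWindowsPath

# The unhelpful names the parent-directory heuristic would produce
# for the example paths above (illustration only).
paths = [
    r"C:\Users\Bob\Desktop\ggml-model-Q4_K_M.gguf",
    r"C:\Users\Bob\Desktop\Mistral-Airoboros-7B\ggufquant\ggml-model-Q4_K_M.gguf",
    r"C:\Users\Bob\Desktop\model\ggml-model-Q4_K_M.gguf",
]
for p in paths:
    print(PureWindowsPath(p).parent.name)  # Desktop, ggufquant, model
```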

This approach also potentially exposes the directory structure of the host in unwanted ways.

@kallewoof (Author) commented Dec 25, 2024

Good point. Would it be acceptable if it was restricted to the case where there were .safetensors files in the same directory as the gguf file, perhaps?

Added that code and updated OP.

@LostRuins (Owner) commented

I think it's probably still not a good idea. I myself have copied the files into random directories when testing, so it would definitely break for me too.

Anyway, this is already configurable as the Horde model display name, and it would be best for the user to set that themselves, which they can. If it's set, it will override the display name; and if an API key is not set, the worker won't start. So that can be used to rename displayed model names at will.


@kallewoof (Author) commented Dec 25, 2024

> I think it's probably still not a good idea. I myself have copied the files into random directories when testing, so it would definitely break for me too.

Well, for this to take effect you would have to:

  1. Copy the ggml-model-xxx.gguf file, as is, to some random directory unrelated to the model name (now you have no idea what model it is anymore).
  2. Also copy .safetensors files into the same directory, again without the directory being related to the model in question (i.e. you actively moved them from, e.g., a Hugging Face transformers model directory into this random dir).
  3. Serve this unnamed unknown model to the public.

I see your concern, but it seems quite unlikely that this would happen very frequently.

What if there was a flag that enabled this behavior?

> Anyway, this is already configurable as the Horde model display name, and it would be best for the user to set that themselves, which they can. If it's set, it will override the display name; and if an API key is not set, the worker won't start. So that can be used to rename displayed model names at will.

This is mostly for the use case where you are jumping between models, and/or quanting things yourself (I often download the HF model, then quant it myself without moving the resulting quant out of the directory). It's particularly tiresome for model makers, who might be quanting and testing models throughout training: although e.g. SillyTavern tracks the model name used, all you get to see is ggml-model for all of them, unless you specifically rename the file each time. So I might make a testx-checkpoint1234 dir with a model, quant it, boot it up and test it, and then later on have no idea what model it was.

I can see if such a niche use case would be low priority, though. It would be cool if other model makers chimed in on this one.

@LostRuins added the "KIV for now" label (Some issues prevent this from being merged) on Dec 28, 2024
@kallewoof (Author) commented Dec 28, 2024

[Screenshot 2024-12-28 at 16:13:48]

This is llama.cpp's llama-server, by the way. It prints the model name exactly as you entered it in the command line. Here I did

$ ./build/bin/llama-server -c 16384 -ngl 90 -m ../llm/Qwen2.5-32B-Instruct-Q8_0.gguf --host 0.0.0.0

A better example (which uses ggml-model) is this, freshly quanted from a model dir:
[Screenshot 2024-12-28 at 16:42:47]
