Misc. bug: Unsupported op "CPY" / Segmentation fault on Metal #10976

Open
firelex opened this issue Dec 25, 2024 · 0 comments

firelex commented Dec 25, 2024

Name and Version

version: 4391 (9ba399d)
built with Apple clang version 16.0.0 (clang-1600.0.26.6) for arm64-apple-darwin24.1.0

Operating systems

Mac (M4 Max / 128 GB)

Which llama.cpp modules do you know to be affected?

llama-server

Problem description & steps to reproduce

./build/bin/llama-server -m /Users/mattsinalco/.cache/huggingface/hub/models--unsloth--Llama-3.3-70B-Instruct-GGUF/snapshots/0c14ebbedd129fb190c8241facca9a360e81c650/Llama-3.3-70B-Instruct-Q4_K_M.gguf -md /Users/mattsinalco/.cache/huggingface/hub/models--unsloth--Llama-3.2-1B-Instruct-GGUF/snapshots/a5594fb18df5dfc6b43281423fcce6750cd92de5/Llama-3.2-1B-Instruct-Q4_K_M.gguf -ngl 99 -ngld 99 -fa --port 8034 --ctx-size 8192 --ctx-size-draft 8192 --draft-min 0 --draft-max 16 -np 7 --host 0.0.0.0 --slots --slot-save-path /Users/mattsinalco/mathias/caching -ctk q4_1 -ctv q4_1

This command intermittently, but reproducibly, fails with:

/Users/mattsinalco/mathias/llama.cpp/ggml/src/ggml-metal/ggml-metal.m:1263: unsupported op
ggml_metal_encode_node: error: unsupported op 'CPY'

Other quantizations give me this:

zsh: segmentation fault ./build/bin/llama-server -m -md -ngl 99 -ngld 99 -fa --port 8034 --ctx-size

Related question: as long as KV cache quantization is not working reliably, can I resize the KV cache? I can't seem to load slots of 200 MB (100 MB is possible).
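
For context on the cache-size question, here is a rough back-of-the-envelope estimate of the main model's KV cache footprint at the 8192-token context used above. It is only a sketch under stated assumptions: the model dimensions (80 layers, 8 KV heads, head dimension 128) are the commonly cited Llama 3.3 70B values and were not read from the GGUF file, the block sizes (f16: 2 bytes per element, q4_1: 20 bytes per block of 32 elements) reflect my understanding of the ggml formats, and the helper name `kv_cache_bytes` is hypothetical. The draft model's cache and any per-slot overhead are not included.

```python
# Rough KV cache size estimate. All model dimensions below are assumptions
# (commonly cited Llama 3.3 70B values), not read from the GGUF file.

def kv_cache_bytes(n_tokens, n_layers, n_kv_heads, head_dim,
                   block_bytes, block_elems):
    """Bytes needed to store K and V for n_tokens across all layers."""
    elems_per_token = 2 * n_layers * n_kv_heads * head_dim  # 2 = K and V
    total_elems = elems_per_token * n_tokens
    # Quantized types store elements in fixed-size blocks.
    return total_elems // block_elems * block_bytes

# Assumed storage formats: bytes per block, elements per block.
F16 = dict(block_bytes=2, block_elems=1)
Q4_1 = dict(block_bytes=20, block_elems=32)

# Assumed dimensions for the 70B main model.
LLAMA_70B = dict(n_layers=80, n_kv_heads=8, head_dim=128)

if __name__ == "__main__":
    for name, fmt in (("f16", F16), ("q4_1", Q4_1)):
        size = kv_cache_bytes(8192, **LLAMA_70B, **fmt)
        print(f"{name}: {size / 2**20:.0f} MiB for 8192 tokens")
```

If these assumptions hold, an unquantized f16 cache is a bit over three times larger than a q4_1 cache at the same context length, which may be relevant when deciding between a smaller `--ctx-size` and a quantized cache.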

First Bad Commit

No response

Relevant log output

No response
