Misc. bug: Unsupported op "CPY" / Segmentation fault on Metal #10976

firelex · 2024-12-25T18:06:13Z

Name and Version

version: 4391 (9ba399d)
built with Apple clang version 16.0.0 (clang-1600.0.26.6) for arm64-apple-darwin24.1.0

Operating systems

Mac (M4 Max / 128 GB)

Which llama.cpp modules do you know to be affected?

llama-server

Problem description & steps to reproduce

./build/bin/llama-server -m /Users/mattsinalco/.cache/huggingface/hub/models--unsloth--Llama-3.3-70B-Instruct-GGUF/snapshots/0c14ebbedd129fb190c8241facca9a360e81c650/Llama-3.3-70B-Instruct-Q4_K_M.gguf -md /Users/mattsinalco/.cache/huggingface/hub/models--unsloth--Llama-3.2-1B-Instruct-GGUF/snapshots/a5594fb18df5dfc6b43281423fcce6750cd92de5/Llama-3.2-1B-Instruct-Q4_K_M.gguf -ngl 99 -ngld 99 -fa --port 8034 --ctx-size 8192 --ctx-size-draft 8192 --draft-min 0 --draft-max 16 -np 7 --host 0.0.0.0 --slots --slot-save-path /Users/mattsinalco/mathias/caching -ctk q4_1 -ctv q4_1

Sometimes (but reproducibly) this gives me:

/Users/mattsinalco/mathias/llama.cpp/ggml/src/ggml-metal/ggml-metal.m:1263: unsupported op
ggml_metal_encode_node: error: unsupported op 'CPY'

Other quantizations give me this:

zsh: segmentation fault ./build/bin/llama-server -m -md -ngl 99 -ngld 99 -fa --port 8034 --ctx-size

Related question: since KV cache quantization is not working reliably, can I resize the KV cache instead? I can't seem to load slots of 200 MB (100 MB is possible).
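
For a sense of the sizes involved, here is a rough back-of-the-envelope sketch of the target model's KV cache footprint with and without `q4_1`. The numbers in it are assumptions rather than values read from the server log: 80 layers, 8 KV heads and a head dimension of 128 for a Llama 3 70B-class model, 2 bytes per value for f16, 20 bytes per block of 32 values for `q4_1`, and the context divided evenly across the `-np` slots. The helper names are made up for illustration, and the draft model's cache is ignored.

```shell
#!/usr/bin/env bash
# Rough KV cache size estimate for the target model only (draft model ignored).
# Assumed dimensions: 80 layers, 8 KV heads, head dim 128 (Llama 3 70B class).
# Assumed storage: f16 = 2 bytes per value; q4_1 = 20 bytes per 32-value block.

kv_bytes_per_token() {
  local n_layer=$1 n_kv_heads=$2 head_dim=$3 block_bytes=$4 block_elems=$5
  # K and V each hold n_kv_heads * head_dim values per layer for every token.
  local values=$(( 2 * n_layer * n_kv_heads * head_dim ))
  echo $(( values * block_bytes / block_elems ))
}

# Integer MiB, rounded down.
to_mib() { echo $(( $1 / 1048576 )); }

ctx=8192                          # --ctx-size
slots=7                           # -np
ctx_per_slot=$(( ctx / slots ))   # assumption: context is split evenly across slots

report() {
  local name=$1 per_token
  per_token=$(kv_bytes_per_token 80 8 128 "$2" "$3")
  echo "$name: $per_token B/token," \
       "$(to_mib $(( per_token * ctx ))) MiB total," \
       "$(to_mib $(( per_token * ctx_per_slot ))) MiB per slot"
}

report f16  2  1
report q4_1 20 32
```

Under these assumptions the f16 cache costs about 3.2 times as much per token as `q4_1`. The size of a saved slot file presumably depends on how many tokens the slot actually holds, so these figures are only an upper-bound style estimate, not a statement about what the server writes.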

First Bad Commit

No response

Relevant log output

No response

firelex added the bug-unconfirmed label Dec 25, 2024
