
Misc. bug: context shift results in error #10958

Open
gompa opened this issue Dec 23, 2024 · 0 comments
gompa commented Dec 23, 2024

Name and Version

build/bin/./llama-server --version
version: 4384 (14b699e)
built with cc (Debian 14.2.0-11) 14.2.0 for x86_64-linux-gnu

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-server

Problem description & steps to reproduce

When running llama-server with the following command:
./build/bin/llama-server -fa -ctk q8_0 -ctv q8_0 -m ../models/phi-4-Q6_K.gguf --host 0.0.0.0 --port 8085
The same happens with llama3.2-3b, so I don't think it's model specific.

Sending a large request with chat history (filling the full context length) crashes the server with:
llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: fatal error
New requests to the server are then ignored.
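For reference, a request roughly like the one below reproduces it once the accumulated chat history approaches the 4096-token context; the message content and max_tokens here are placeholders, not the exact request from my client:

curl http://localhost:8085/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "user", "content": "<long chat history that fills most of the context window>"}
        ],
        "max_tokens": 512
      }'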
I think it's related to the function ggml_compute_forward_dup: the dst->type and src->type mismatch (8 vs 0, i.e. Q8_0 vs F32) and there is no q* handler for that copy.
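To illustrate what I mean, here is a simplified, hypothetical sketch of the dispatch pattern (not the actual ggml-cpu.c source): a copy op that switches on the tensor type and falls through to an abort when a quantized type has no matching handler, which matches the "fatal error" I see when the context shift touches the q8_0 KV cache.

/* Hypothetical illustration only -- names and structure are simplified,
 * not copied from ggml-cpu.c. */
#include <stdio.h>
#include <stdlib.h>

enum tensor_type { TYPE_F32 = 0, TYPE_F16 = 1, TYPE_Q8_0 = 8 };

static void dup_from_f32(void) { puts("copy from f32 source"); }
static void dup_from_f16(void) { puts("copy from f16 source"); }

/* dispatch on the source type; quantized sources have no handler */
static void compute_forward_dup(enum tensor_type src, enum tensor_type dst) {
    switch (src) {
        case TYPE_F32: dup_from_f32(); break;
        case TYPE_F16: dup_from_f16(); break;
        default:
            /* no q* handler for this src/dst combination */
            fprintf(stderr, "fatal error: dup %d -> %d not implemented\n", src, dst);
            abort();
    }
}

int main(void) {
    compute_forward_dup(TYPE_F32, TYPE_F32);  /* fine */
    compute_forward_dup(TYPE_Q8_0, TYPE_F32); /* 8 vs 0: aborts, like the crash above */
    return 0;
}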

First Bad Commit

No response

Relevant log output

request: POST /v1/chat/completions 192.168.1.59 200
slot launch_slot_: id  0 | task 613 | processing task
slot update_slots: id  0 | task 613 | new prompt, n_ctx_slot = 4096, n_keep = 0, n_prompt_tokens = 3817
slot update_slots: id  0 | task 613 | kv cache rm [3520, end)
slot update_slots: id  0 | task 613 | prompt processing progress, n_past = 3817, n_tokens = 297, progress = 0.077810
slot update_slots: id  0 | task 613 | prompt done, n_past = 3817, n_tokens = 297
slot update_slots: id  0 | task 613 | slot context shift, n_keep = 0, n_left = 4095, n_discard = 2047
llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: fatal error
llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: fatal error
llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: fatal error
llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: fatal error
llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: fatal error
llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: fatal error
llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: fatal error
llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: fatal error
llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: fatal error
fatal error
fatal error
llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: fatal error
fatal error
fatal error
llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: fatal error
llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:3996: fatal error