Consider the code below, which runs a Llama-3.1 model with non-zero temperature. When I execute it multiple times, I always get the same response, even though llama.cpp uses a non-deterministic seed by default. Is this expected behavior? What approach should I use to get a different result on every run? Setting `seed=-1` in the `generation_kwargs` dictionary solves the problem. It is not clear why this is necessary, though, because `seed=-1` is already the default in llama.cpp (`#define LLAMA_DEFAULT_SEED 0xFFFFFFFF` in `spm-headers/llama.h` in the https://github.com/ggerganov/llama.cpp.git repository). This suggests there is an error somewhere.
```python
import random
import time

from haystack_integrations.components.generators.llama_cpp import LlamaCppGenerator

# Seed Python's RNG with the current time (note: this does not reach
# llama.cpp's sampler, which takes its seed from generation_kwargs)
random.seed(int(time.time()))

generator = LlamaCppGenerator(
    model="data/llm_models/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",
    n_ctx=512,
    n_batch=128,
    model_kwargs={
        "n_gpu_layers": -1,
        "verbose": False,
    },
    generation_kwargs={
        "max_tokens": 128,
        "temperature": 1.7,
        "top_k": 40,
        "top_p": 0.9,
    },
)
generator.warm_up()

simplified_schema = '{"content": "Your single sentence answer here"}'
system = "You are a helpful assistant. Respond to questions with a single sentence " \
         f"using clean JSON only following the JSON schema {simplified_schema}. " \
         "Never use markdown formatting or code block indicators."
user_query = "What is artificial intelligence?"
prompt = "<|begin_of_text|><|start_header_id|>system<|end_header_id|>" \
         f"{system}<|eot_id|>" \
         f"<|start_header_id|>user<|end_header_id|> {user_query}<|eot_id|>" \
         f"<|start_header_id|>assistant<|end_header_id|>"
print(f"{prompt=}")

result = generator.run(prompt)
print("result= ", result["replies"][0])
```