LlamaCppGenerator randomness not working as expected #8676

Open
erlebach opened this issue Dec 27, 2024 · 0 comments
erlebach commented Dec 27, 2024

Consider the code below, which runs a Llama-3.1 model with a non-zero temperature. When I execute it multiple times, I always get the same response, even though llama.cpp uses a non-deterministic seed by default. Is this expected behavior? What approach should I use to get a different result on every run?

Setting seed=-1 in the generation_kwargs dictionary solves the problem:

      generation_kwargs={
          "max_tokens": 128,
          "temperature": 0.7,
          "top_k": 40,
          "top_p": 0.9,
          "seed": -1,
      },

It is not clear why this should be necessary, though, because seed=-1 is already the default in llama.cpp (#define LLAMA_DEFAULT_SEED 0xFFFFFFFF in spm-headers/llama.h of the https://github.com/ggerganov/llama.cpp.git repository). This suggests that there is an error somewhere.
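To check whether the behavior comes from the Haystack wrapper or from llama-cpp-python itself, a minimal test against llama-cpp-python directly might look like the sketch below (assuming its Llama and create_completion APIs; the model path is the one from my script):

from llama_cpp import Llama

llm = Llama(
    model_path="data/llm_models/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",
    n_ctx=512,
    verbose=False,
)

# Two calls with identical sampling settings and no explicit seed.
# If the default seed were truly non-deterministic, the outputs should differ.
out1 = llm.create_completion("What is artificial intelligence?", max_tokens=32, temperature=0.7)
out2 = llm.create_completion("What is artificial intelligence?", max_tokens=32, temperature=0.7)
print(out1["choices"][0]["text"])
print(out2["choices"][0]["text"])

For completeness, here is the full script that shows the problem: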
import random
import time

from haystack_integrations.components.generators.llama_cpp import LlamaCppGenerator

# Set the seed to a random value based on the current time
random.seed(int(time.time()))

generator = LlamaCppGenerator(
    model="data/llm_models/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",
    n_ctx=512,
    n_batch=128,
    model_kwargs={
        "n_gpu_layers": -1,
        "verbose": False,
    },
    generation_kwargs={
        "max_tokens": 128,
        "temperature": 1.7,
        "top_k": 40,
        "top_p": 0.9,
    },
)
generator.warm_up()

simplified_schema = '{"content": "Your single sentence answer here"}'
system = "You are a helpful assistant. Respond to questions with a single sentence " \
         f"using clean JSON only, following the JSON schema {simplified_schema}. " \
         "Never use markdown formatting or code block indicators."
user_query = "What is artificial intelligence?"

prompt = "<|begin_of_text|><|start_header_id|>system<|end_header_id|>" \
         f"{system}<|eot_id|>" \
         f"<|start_header_id|>user<|end_header_id|>{user_query}<|eot_id|>" \
         "<|start_header_id|>assistant<|end_header_id|>"
print(f"{prompt=}")

result = generator.run(prompt)
print("result= ", result["replies"][0])