Homogeneize generation params #428

Merged
merged 44 commits into from
Jan 2, 2025
Changes from 21 commits
Commits
1ff5a78
adding input generation config
clefourrier Dec 9, 2024
12c6a90
added tgi model
clefourrier Dec 9, 2024
c9657d2
grammar is task dependent, removed from the config
clefourrier Dec 9, 2024
ac6565a
added openai config + moved everything to dict
clefourrier Dec 9, 2024
2628571
added generation configs to models
clefourrier Dec 9, 2024
c24bf9b
added generation configs to models
clefourrier Dec 9, 2024
0aa2e19
fix
clefourrier Dec 9, 2024
e3311bd
fix
clefourrier Dec 9, 2024
a3f535f
added doc
clefourrier Dec 9, 2024
286668f
Saved GenerationParameter class in model config classes, then saved i…
clefourrier Dec 10, 2024
0b2475a
changed model args
clefourrier Dec 10, 2024
521559f
test
clefourrier Dec 10, 2024
91363fe
Merge branch 'main' into clem_homogeneize_generation_params
clefourrier Dec 11, 2024
c088ab6
updated launchers
clefourrier Dec 11, 2024
e1bd34f
Merge branch 'main' into clem_homogeneize_generation_params
clefourrier Dec 12, 2024
3eb7d0f
rename base_model to transformers_model
clefourrier Dec 12, 2024
a585701
removed the use of a GenerationConfig object, as it's got lots of par…
clefourrier Dec 12, 2024
f9ab29b
revert
clefourrier Dec 12, 2024
4833929
fix docs
clefourrier Dec 12, 2024
30bed89
fix #16 by also allowing a generationconfig object to be passed progr…
clefourrier Dec 12, 2024
431b4f2
Merge branch 'main' into clem_homogeneize_generation_params
clefourrier Dec 12, 2024
fb4ecdc
Apply suggestions from code review
clefourrier Dec 16, 2024
be99c5e
Update src/lighteval/models/transformers/transformers_model.py
clefourrier Dec 16, 2024
e8b9057
Merge branch 'main' into clem_homogeneize_generation_params
clefourrier Dec 16, 2024
dece2f9
removed temperature from default vllm params as it should be passed v…
clefourrier Dec 17, 2024
8e3b7e2
Merge branch 'main' into clem_homogeneize_generation_params
clefourrier Dec 17, 2024
5c89fe2
Update src/lighteval/models/transformers/transformers_model.py
clefourrier Dec 18, 2024
6a18b81
logging fix
clefourrier Dec 18, 2024
83cbb10
Merge branch 'main' into clem_homogeneize_generation_params
clefourrier Dec 18, 2024
c6f42ca
Merge branch 'main' into clem_homogeneize_generation_params
clefourrier Dec 18, 2024
90593a9
added default gen params
clefourrier Dec 18, 2024
ff5026b
Apply suggestions from code review
clefourrier Dec 26, 2024
87d052c
rename file
clefourrier Dec 26, 2024
3f96b95
added from path to openai model
clefourrier Dec 26, 2024
843b572
style
clefourrier Dec 26, 2024
e233190
Update src/lighteval/models/transformers/transformers_model.py
clefourrier Dec 26, 2024
97db620
inferenceendpoint renamed to ie
clefourrier Dec 26, 2024
e2d512b
Merge branch 'main' into clem_homogeneize_generation_params
clefourrier Dec 26, 2024
e636f73
style 2
clefourrier Dec 26, 2024
ded4cf0
fix vllm
clefourrier Dec 26, 2024
fddfa6f
Merge branch 'main' into clem_homogeneize_generation_params
clefourrier Jan 2, 2025
c0566ee
Merge branch 'main' into clem_homogeneize_generation_params
NathanHB Jan 2, 2025
7a54afa
restore line
clefourrier Jan 2, 2025
4319230
Merge branch 'main' into clem_homogeneize_generation_params
clefourrier Jan 2, 2025
6 changes: 3 additions & 3 deletions docs/source/package_reference/models.mdx
@@ -6,9 +6,9 @@


 ## Accelerate and Transformers Models
-### BaseModel
-[[autodoc]] models.transformers.base_model.BaseModelConfig
-[[autodoc]] models.transformers.base_model.BaseModel
+### TransformersModel
+[[autodoc]] models.transformers.transformers_model.TransformersModelConfig
+[[autodoc]] models.transformers.transformers_model.TransformersModel

 ### AdapterModel
 [[autodoc]] models.transformers.adapter_model.AdapterModelConfig
3 changes: 2 additions & 1 deletion examples/model_configs/base_model.yaml
@@ -1,6 +1,6 @@
 model:
   base_params:
-    model_args: "pretrained=HuggingFaceH4/zephyr-7b-beta,revision=main" # pretrained=model_name,trust_remote_code=boolean,revision=revision_to_use,model_parallel=True ...
+    model_args: "pretrained=HuggingFaceTB/SmolLM-1.7B,revision=main" # pretrained=model_name,trust_remote_code=boolean,revision=revision_to_use,model_parallel=True ...
     dtype: "bfloat16"
     compile: true
   merged_weights: # Ignore this section if you are not using PEFT models
@@ -9,3 +9,4 @@ model:
     base_model: null # path to the base_model
   generation:
     multichoice_continuations_start_space: null # If true/false, will force multiple choice continuations to start/not start with a space. If none, will do nothing
+    temperature: 0.5
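
The `generation` section in the YAML above is consumed elsewhere in this PR through `GenerationParameters.from_dict(config)`, whose implementation is not part of the hunks shown here. A minimal sketch of how such a loader could behave, assuming it reads only the `generation` key and skips fields it does not model. `SketchGenerationParameters`, its three fields, and the key filtering are illustrative assumptions, not lighteval's actual class:

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SketchGenerationParameters:
    """Hypothetical stand-in for lighteval's GenerationParameters."""

    temperature: Optional[float] = None
    top_p: Optional[float] = None
    max_new_tokens: Optional[int] = None

    @classmethod
    def from_dict(cls, config: dict) -> "SketchGenerationParameters":
        # Read only the optional "generation" section; a missing or empty
        # section falls back to the dataclass defaults.
        section = config.get("generation") or {}
        known = {f.name for f in fields(cls)}
        # Assumption: keys this sketch does not model are skipped, not rejected.
        return cls(**{k: v for k, v in section.items() if k in known})


# The dict that yaml.safe_load(f)["model"] would plausibly return for the file above.
model_section = {
    "base_params": {
        "model_args": "pretrained=HuggingFaceTB/SmolLM-1.7B,revision=main",
        "dtype": "bfloat16",
        "compile": True,
    },
    "generation": {"multichoice_continuations_start_space": None, "temperature": 0.5},
}

params = SketchGenerationParameters.from_dict(model_section)
print(params)
```

Leaving unset fields as `None` lets each backend keep its own default for anything the YAML does not mention.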
11 changes: 7 additions & 4 deletions src/lighteval/main_accelerate.py
@@ -44,7 +44,7 @@ def accelerate( # noqa C901
     model_args: Annotated[
         str,
         Argument(
-            help="Model arguments in the form key1=value1,key2=value2,... or path to yaml config file (see examples/model_configs/base_model.yaml)"
+            help="Model arguments in the form key1=value1,key2=value2,... or path to yaml config file (see examples/model_configs/transformers_model.yaml)"
         ),
     ],
     tasks: Annotated[str, Argument(help="Comma-separated list of tasks to evaluate on.")],
@@ -107,9 +107,10 @@ def accelerate( # noqa C901
     from accelerate import Accelerator, InitProcessGroupKwargs

     from lighteval.logging.evaluation_tracker import EvaluationTracker
+    from lighteval.models.model_input import GenerationParameters
     from lighteval.models.transformers.adapter_model import AdapterModelConfig
-    from lighteval.models.transformers.base_model import BaseModelConfig, BitsAndBytesConfig
     from lighteval.models.transformers.delta_model import DeltaModelConfig
+    from lighteval.models.transformers.transformers_model import BitsAndBytesConfig, TransformersModelConfig
     from lighteval.pipeline import EnvConfig, ParallelismManager, Pipeline, PipelineParameters

     accelerator = Accelerator(kwargs_handlers=[InitProcessGroupKwargs(timeout=timedelta(seconds=3000))])
@@ -154,6 +155,8 @@ def accelerate( # noqa C901
         # We extract the model args
         args_dict = {k.split("=")[0]: k.split("=")[1] for k in config["base_params"]["model_args"].split(",")}

+        args_dict["generation_parameters"] = GenerationParameters.from_dict(config)
+
         # We store the relevant other args
         args_dict["base_model"] = config["merged_weights"]["base_model"]
         args_dict["compile"] = bool(config["base_params"]["compile"])
@@ -180,13 +183,13 @@ def accelerate( # noqa C901
         elif config["merged_weights"]["base_model"] not in ["", None]:
             raise ValueError("You can't specify a base model if you are not using delta/adapter weights")
         else:
-            model_config = BaseModelConfig(**args_dict)
+            model_config = TransformersModelConfig(**args_dict)
     else:
         model_args_dict: dict = {k.split("=")[0]: k.split("=")[1] if "=" in k else True for k in model_args.split(",")}
         model_args_dict["accelerator"] = accelerator
         model_args_dict["use_chat_template"] = use_chat_template
         model_args_dict["compile"] = bool(model_args_dict["compile"]) if "compile" in model_args_dict else False
-        model_config = BaseModelConfig(**model_args_dict)
+        model_config = TransformersModelConfig(**model_args_dict)

     pipeline = Pipeline(
         tasks=tasks,
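
The `key1=value1,key2=value2` form accepted by `model_args` is parsed in the hunk above with a dict comprehension. The same idea can be sketched as a standalone helper. `parse_model_args` is a hypothetical name, and unlike the comprehension in the diff this sketch splits on the first `=` only, so values that themselves contain `=` are kept whole:

```python
def parse_model_args(model_args: str) -> dict:
    """Turn 'k1=v1,k2=v2,flag' into a dict; bare keys become True."""
    parsed = {}
    for item in model_args.split(","):
        if "=" in item:
            # Split once so a value may itself contain '='.
            key, value = item.split("=", 1)
            parsed[key] = value
        else:
            parsed[item] = True
    return parsed


args = parse_model_args("pretrained=HuggingFaceTB/SmolLM-1.7B,revision=main,trust_remote_code")
print(args)

# Values stay strings, so a later bool(...) cast is True for any non-empty text.
compile_flag = bool(parse_model_args("compile=False")["compile"])
print(compile_flag)
```

Because every parsed value is a string, `compile=False` on the command line still evaluates to `True` under a plain `bool(...)` cast; an explicit comparison against accepted spellings would be needed to treat it as false.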
31 changes: 23 additions & 8 deletions src/lighteval/main_endpoint.py
@@ -23,6 +23,7 @@
 from typing import Optional

 import typer
+import yaml
 from typer import Argument, Option
 from typing_extensions import Annotated

@@ -42,8 +43,11 @@
 @app.command(rich_help_panel="Evaluation Backends")
 def openai(
     # === general ===
-    model_name: Annotated[
-        str, Argument(help="The model name to evaluate (has to be available through the openai API.")
+    model_args: Annotated[
+        str,
+        Argument(
+            help="Model name as a string (has to be available through the openai API) or path to yaml config file (see examples/model_configs/transformers_model.yaml)"
+        ),
     ],
     tasks: Annotated[str, Argument(help="Comma-separated list of tasks to evaluate on.")],
     # === Common parameters ===
@@ -93,9 +97,20 @@ def openai(
     Evaluate OPENAI models.
     """
     from lighteval.logging.evaluation_tracker import EvaluationTracker
-    from lighteval.models.model_config import OpenAIModelConfig
+
+    # from lighteval.models.model_input import GenerationParameters
+    from lighteval.models.endpoints.openai_model import OpenAIModelConfig
+    from lighteval.models.model_input import GenerationParameters
     from lighteval.pipeline import EnvConfig, ParallelismManager, Pipeline, PipelineParameters

+    if model_args.endswith(".yaml"):
+        with open(model_args, "r") as f:
+            config = yaml.safe_load(f)["model"]
+        generation_parameters = GenerationParameters.from_dict(config)
+        model_config = OpenAIModelConfig(model=config["model_name"], generation_parameters=generation_parameters)
+    else:
+        model_config = OpenAIModelConfig(model=model_args)
+
     env_config = EnvConfig(token=TOKEN, cache_dir=cache_dir)
     evaluation_tracker = EvaluationTracker(
         output_dir=output_dir,
@@ -107,7 +122,6 @@ def openai(
     )

     parallelism_manager = ParallelismManager.OPENAI
-    model_config = OpenAIModelConfig(model=model_name)

     pipeline_params = PipelineParameters(
         launcher_type=parallelism_manager,
@@ -198,7 +212,6 @@ def inference_endpoint(
     """
     Evaluate models using inference-endpoints as backend.
     """
-
     from lighteval.logging.evaluation_tracker import EvaluationTracker
     from lighteval.models.endpoints.endpoint_model import (
         InferenceEndpointModelConfig,
@@ -314,10 +327,9 @@ def tgi(
     """
     Evaluate models using TGI as backend.
     """
-    import yaml
-
     from lighteval.logging.evaluation_tracker import EvaluationTracker
-    from lighteval.models.model_config import TGIModelConfig
+    from lighteval.models.endpoints.tgi_model import TGIModelConfig
+    from lighteval.models.model_input import GenerationParameters
     from lighteval.pipeline import EnvConfig, ParallelismManager, Pipeline, PipelineParameters

     env_config = EnvConfig(token=TOKEN, cache_dir=cache_dir)
@@ -335,10 +347,13 @@ def tgi(
     with open(model_config_path, "r") as f:
         config = yaml.safe_load(f)["model"]

+    generation_parameters = GenerationParameters.from_dict(config)
+
     model_config = TGIModelConfig(
         inference_server_address=config["instance"]["inference_server_address"],
         inference_server_auth=config["instance"]["inference_server_auth"],
         model_id=config["instance"]["model_id"],
+        generation_parameters=generation_parameters,
     )

     pipeline_params = PipelineParameters(
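
The `tgi` hunk above reads three keys under `instance` and hands the whole `model` section to `GenerationParameters.from_dict`. A small sketch of the config shape this implies, with an explicit check for missing keys. The helper name `tgi_values_from_config`, the example server address, and the error message are hypothetical; the diff itself indexes the dict directly and would raise a bare `KeyError`:

```python
REQUIRED_INSTANCE_KEYS = ("inference_server_address", "inference_server_auth", "model_id")


def tgi_values_from_config(config: dict) -> dict:
    """Collect the values the tgi command reads from the 'model' section."""
    instance = config.get("instance") or {}
    missing = [key for key in REQUIRED_INSTANCE_KEYS if key not in instance]
    if missing:
        raise KeyError(f"missing keys under 'instance': {missing}")
    values = {key: instance[key] for key in REQUIRED_INSTANCE_KEYS}
    # The optional "generation" section is copied as a plain dict in this sketch.
    values["generation"] = dict(config.get("generation") or {})
    return values


# Illustrative config; the address is a placeholder, not a real endpoint.
example = {
    "instance": {
        "inference_server_address": "http://localhost:8080",
        "inference_server_auth": None,
        "model_id": None,
    },
    "generation": {"temperature": 0.5},
}
values = tgi_values_from_config(example)
print(values)
```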
21 changes: 18 additions & 3 deletions src/lighteval/main_vllm.py
@@ -37,7 +37,12 @@

 def vllm(
     # === general ===
-    model_args: Annotated[str, Argument(help="Model arguments in the form key1=value1,key2=value2,...")],
+    model_args: Annotated[
+        str,
+        Argument(
+            help="Model arguments in the form key1=value1,key2=value2,... or path to yaml config file (see examples/model_configs/transformers_model.yaml)"
+        ),
+    ],
     tasks: Annotated[str, Argument(help="Comma-separated list of tasks to evaluate on.")],
     # === Common parameters ===
     use_chat_template: Annotated[
@@ -88,7 +93,10 @@ def vllm(
     """
     Evaluate models using vllm as backend.
     """
+    import yaml
+
     from lighteval.logging.evaluation_tracker import EvaluationTracker
+    from lighteval.models.model_input import GenerationParameters
     from lighteval.models.vllm.vllm_model import VLLMModelConfig
     from lighteval.pipeline import EnvConfig, ParallelismManager, Pipeline, PipelineParameters

@@ -118,8 +126,15 @@ def vllm(
         system_prompt=system_prompt,
     )

-    model_args_dict: dict = {k.split("=")[0]: k.split("=")[1] if "=" in k else True for k in model_args.split(",")}
-    model_config = VLLMModelConfig(**model_args_dict)
+    if model_args.endswith(".yaml"):
+        with open(model_args, "r") as f:
+            config = yaml.safe_load(f)["model"]
+        generation_parameters = GenerationParameters.from_dict(config)
+        model_config = VLLMModelConfig(config, generation_parameters=generation_parameters)
+
+    else:
+        model_args_dict: dict = {k.split("=")[0]: k.split("=")[1] if "=" in k else True for k in model_args.split(",")}
+        model_config = VLLMModelConfig(**model_args_dict)

     pipeline = Pipeline(
         tasks=tasks,
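
The `vllm` and `openai` commands now accept either a YAML path or an inline string in the same `model_args` argument, switching on the `.yaml` suffix. A sketch of that dispatch, assuming both branches should end up as a keyword dict plus a separate generation section. `resolve_model_args` and the injected `load_model_section` callable are hypothetical; the loader is passed in so the sketch runs without PyYAML, whereas in the diff it corresponds to `yaml.safe_load(f)["model"]`:

```python
from typing import Callable, Tuple


def resolve_model_args(model_args: str, load_model_section: Callable[[str], dict]) -> Tuple[dict, dict]:
    """Return (model kwargs, generation section) for either input form."""
    if model_args.endswith(".yaml"):
        config = load_model_section(model_args)
        # Assumption: everything except "generation" is treated as model kwargs.
        model_kwargs = {k: v for k, v in config.items() if k != "generation"}
        return model_kwargs, dict(config.get("generation") or {})
    model_kwargs = {}
    for item in model_args.split(","):
        key, sep, value = item.partition("=")
        # A bare key with no '=' is treated as a boolean flag.
        model_kwargs[key] = value if sep else True
    return model_kwargs, {}


def fake_loader(path: str) -> dict:
    """Stand-in for reading the 'model' section of a YAML file."""
    return {"pretrained": "HuggingFaceTB/SmolLM-1.7B", "generation": {"temperature": 0.5}}


from_yaml = resolve_model_args("vllm_model.yaml", fake_loader)
from_string = resolve_model_args("pretrained=HuggingFaceTB/SmolLM-1.7B,trust_remote_code", fake_loader)
print(from_yaml)
print(from_string)
```

In the hunk above, the YAML branch passes `config` to `VLLMModelConfig` positionally while the string branch unpacks with `**`. Normalising both to keyword dicts is a choice made for this sketch, not a description of the merged code.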
42 changes: 31 additions & 11 deletions src/lighteval/models/endpoints/endpoint_model.py
@@ -24,7 +24,7 @@
 import logging
 import re
 import time
-from dataclasses import dataclass
+from dataclasses import dataclass, replace
 from typing import Coroutine, Dict, List, Optional, Union

 import requests
@@ -35,6 +35,7 @@
     InferenceEndpoint,
     InferenceEndpointError,
     InferenceEndpointTimeoutError,
+    TextGenerationInputGenerateParameters,
     TextGenerationInputGrammarType,
     TextGenerationOutput,
     create_inference_endpoint,
@@ -48,6 +49,7 @@

 from lighteval.data import GenerativeTaskDataset, LoglikelihoodDataset
 from lighteval.models.abstract_model import LightevalModel, ModelInfo
+from lighteval.models.model_input import GenerationParameters
 from lighteval.models.model_output import GenerativeResponse, LoglikelihoodResponse, LoglikelihoodSingleTokenResponse
 from lighteval.tasks.requests import (
     GreedyUntilRequest,
@@ -78,6 +80,11 @@
 class InferenceModelConfig:
     model: str
     add_special_tokens: bool = True
+    generation_parameters: GenerationParameters = None
+
+    def __post_init__(self):
+        if not self.generation_parameters:
+            self.generation_parameters = GenerationParameters()


 @dataclass
@@ -98,6 +105,7 @@ class InferenceEndpointModelConfig:
     namespace: str = None # The namespace under which to launch the endpoint. Defaults to the current user's namespace
     image_url: str = None
     env_vars: dict = None
+    generation_parameters: GenerationParameters = None

     def __post_init__(self):
         # xor operator, one is None but not the other
@@ -109,6 +117,9 @@ def __post_init__(self):
         if not (self.endpoint_name is None) ^ int(self.model_name is None):
             raise ValueError("You need to set either endpoint_name or model_name (but not both).")

+        if not self.generation_parameters:
+            self.generation_parameters = GenerationParameters()
+
     @classmethod
     def from_path(cls, path: str) -> "InferenceEndpointModelConfig":
         import yaml
@@ -290,6 +301,10 @@ def __init__( # noqa: C901
             model_dtype=config.model_dtype or "default",
             model_size=-1,
         )
+        self.generation_parameters = config.generation_parameters
+        self.generation_config = TextGenerationInputGenerateParameters(
+            **self.generation_parameters.to_tgi_inferenceendpoint_dict()
+        )

     @staticmethod
     def get_larger_hardware_suggestion(cur_instance_type: str = None, cur_instance_size: str = None):
@@ -373,16 +388,17 @@ def _async_process_request(
     ) -> Coroutine[None, list[TextGenerationOutput], str]:
         # Todo: add an option to launch with conversational instead for chat prompts
         # https://huggingface.co/docs/huggingface_hub/v0.20.3/en/package_reference/inference_client#huggingface_hub.AsyncInferenceClient.conversational
-        generated_text = self.async_client.text_generation(
-            prompt=context,
+        generation_config: TextGenerationInputGenerateParameters = replace(
+            self.generation_config,
+            stop=stop_tokens,
+            max_new_tokens=max_tokens,
             details=True,
             decoder_input_details=True,
             grammar=grammar,
-            max_new_tokens=max_tokens,
-            stop_sequences=stop_tokens,
-            # truncate=,
         )

+        generated_text = self.async_client.text_generation(prompt=context, generation_config=generation_config)
+
         return generated_text

     def _process_request(
@@ -394,14 +410,18 @@ def _process_request(
     ) -> TextGenerationOutput:
         # Todo: add an option to launch with conversational instead for chat prompts
         # https://huggingface.co/docs/huggingface_hub/v0.20.3/en/package_reference/inference_client#huggingface_hub.AsyncInferenceClient.conversational
-        generated_text = self.client.text_generation(
-            prompt=context,
+        generation_config: TextGenerationInputGenerateParameters = replace(
+            self.generation_config,
+            stop=stop_tokens,
+            max_new_tokens=max_tokens,
             details=True,
             decoder_input_details=True,
             grammar=grammar,
-            max_new_tokens=max_tokens,
-            stop_sequences=stop_tokens,
-            # truncate=,
         )

+        generated_text = self.client.text_generation(
+            prompt=context,
+            generation_config=generation_config,
+        )
+
         return generated_text
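
Both request helpers above build a per-request config with `dataclasses.replace`, starting from the model-level `self.generation_config` and overriding only the task-dependent fields. A self-contained sketch of that pattern using a stand-in dataclass. `SketchGenerateParameters` and its four fields are assumptions made for illustration, not the `huggingface_hub` class:

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass
class SketchGenerateParameters:
    """Hypothetical stand-in for TextGenerationInputGenerateParameters."""

    temperature: Optional[float] = None
    max_new_tokens: Optional[int] = None
    stop: Optional[List[str]] = None
    details: bool = False


# Model-level defaults, set once (for example from the YAML "generation" section).
base = SketchGenerateParameters(temperature=0.5)

# Per-request copy: replace() returns a new object and leaves `base` untouched.
per_request = replace(base, max_new_tokens=64, stop=["\n"], details=True)

print(base)
print(per_request)
```

Because `replace` constructs a new instance through the class constructor, a field name that the dataclass does not define raises `TypeError` instead of being silently ignored, which surfaces misspelled overrides early.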
11 changes: 10 additions & 1 deletion src/lighteval/models/endpoints/openai_model.py
@@ -32,6 +32,7 @@
 from lighteval.data import GenerativeTaskDataset, LoglikelihoodDataset
 from lighteval.models.abstract_model import LightevalModel
 from lighteval.models.endpoints.endpoint_model import ModelInfo
+from lighteval.models.model_input import GenerationParameters
 from lighteval.models.model_output import (
     GenerativeResponse,
     LoglikelihoodResponse,
@@ -62,14 +63,21 @@
 @dataclass
 class OpenAIModelConfig:
     model: str
+    generation_parameters: GenerationParameters = None
+
+    def __post_init__(self):
+        if not self.generation_parameters:
+            self.generation_parameters = GenerationParameters()


 class OpenAIClient(LightevalModel):
     _DEFAULT_MAX_LENGTH: int = 4096

-    def __init__(self, config, env_config) -> None:
+    def __init__(self, config: OpenAIModelConfig, env_config) -> None:
         api_key = os.environ["OPENAI_API_KEY"]
         self.client = OpenAI(api_key=api_key)
+        self.generation_parameters = config.generation_parameters
+        self.sampling_params = self.generation_parameters.to_vllm_openai_dict()

         self.model_info = ModelInfo(
             model_name=config.model,
@@ -96,6 +104,7 @@ def __call_api(self, prompt, return_logits, max_new_tokens, num_samples, logit_b
                     logprobs=return_logits,
                     logit_bias=logit_bias,
                     n=num_samples,
+                    **self.sampling_params,
                 )
                 return response
             except Exception as e:
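
The config classes in this PR share one pattern: `generation_parameters` defaults to `None`, `__post_init__` swaps in a fresh default object, and the client later unpacks a dict of parameters into the API call. A sketch of that flow under stated assumptions: `SketchParams`, `to_sampling_dict`, and `fake_completion` are hypothetical names, and dropping `None` values is a guess at what a method like `to_vllm_openai_dict` might do, since its body is not shown in this diff:

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SketchParams:
    temperature: Optional[float] = None
    top_p: Optional[float] = None

    def to_sampling_dict(self) -> dict:
        # Keep only explicitly set values so the backend's own defaults apply otherwise.
        return {k: v for k, v in asdict(self).items() if v is not None}


@dataclass
class SketchModelConfig:
    model: str
    generation_parameters: Optional[SketchParams] = None

    def __post_init__(self):
        # A None default means each config gets its own SketchParams instance.
        if not self.generation_parameters:
            self.generation_parameters = SketchParams()


def fake_completion(model: str, n: int = 1, **sampling) -> dict:
    """Stand-in for an API call; echoes the keyword arguments it received."""
    return {"model": model, "n": n, **sampling}


default_config = SketchModelConfig(model="some-model")
tuned_config = SketchModelConfig(model="some-model", generation_parameters=SketchParams(temperature=0.5))

default_call = fake_completion(default_config.model, n=2, **default_config.generation_parameters.to_sampling_dict())
tuned_call = fake_completion(tuned_config.model, n=2, **tuned_config.generation_parameters.to_sampling_dict())
print(default_call)
print(tuned_call)
```

`field(default_factory=SketchParams)` would also give each config its own default, but the `__post_init__` check additionally covers callers that pass `None` explicitly.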