distilabel base pipeline code fails #43

ouhenio · 2024-12-27T19:11:19Z

Hi!

I'm trying to run the pipeline.py code inside of /distilabel_pipelines/magpie_ultra_v1.

I created a conda environment with Python 3.10 and installed distilabel with the mentioned dependencies:

pip install distilabel[ray,vllm,sentence-transformers,faiss-cpu,hf-transformers]

And ran python pipeline.py.

It failed with the following error:

│ /home/ouhenio/miniconda3/envs/instructor/lib/python3.10/site-packages/distilabel/mixins/signatur │
│ e.py:74 in flatten_dump                                                                          │
│                                                                                                  │
│   71 │   │   │   │   │   │   items.append((new_key, "-".join(map(str, v))))                      │
│   72 │   │   │   │   │   else:                                                                   │
│   73 │   │   │   │   │   │   for i, x in enumerate(v):                                           │
│ ❱ 74 │   │   │   │   │   │   │   items.extend(flatten_dump(x, f"{new_key}-{i}", sep=sep))        │
│   75 │   │   │   │   elif new_key not in self.exclude_from_signature:                            │
│   76 │   │   │   │   │   items.append((new_key, v))                                              │
│   77 │   │   │   return items                                                                    │
│                                                                                                  │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │          d = {                                                                               │ │
│ │              │   'temperature': 0.8,                                                         │ │
│ │              │   'top_p': 1.0,                                                               │ │
│ │              │   'max_new_tokens': 1024,                                                     │ │
│ │              │   'stop': [                                                                   │ │
│ │              │   │   '<|eot_id|>',                                                           │ │
│ │              │   │   '<|end_of_text|>',                                                      │ │
│ │              │   │   '<|start_header_id|>',                                                  │ │
│ │              │   │   '<|end_header_id|>'                                                     │ │
│ │              │   ],                                                                          │ │
│ │              │   'stop_token_ids': [128009, 128001, 128006, 128007],                         │ │
│ │              │   'logits_processors': [                                                      │ │
│ │              │   │   <function de_md_logits_processor_for_llama3_1 at 0x7f4350587ac0>        │ │
│ │              │   ]                                                                           │ │
│ │              }                                                                               │ │
│ │          i = 0                                                                               │ │
│ │      items = [                                                                               │ │
│ │              │   ('llm_generation_kwargs_temperature', 0.8),                                 │ │
│ │              │   ('llm_generation_kwargs_top_p', 1.0),                                       │ │
│ │              │   ('llm_generation_kwargs_max_new_tokens', 1024),                             │ │
│ │              │   (                                                                           │ │
│ │              │   │   'llm_generation_kwargs_stop',                                           │ │
│ │              │   │   '<|eot_id|>-<|end_of_text|>-<|start_header_id|>-<|end_header_id|>'      │ │
│ │              │   ),                                                                          │ │
│ │              │   ('llm_generation_kwargs_stop_token_ids', '128009-128001-128006-128007')     │ │
│ │              ]                                                                               │ │
│ │          k = 'logits_processors'                                                             │ │
│ │    new_key = 'llm_generation_kwargs_logits_processors'                                       │ │
│ │ parent_key = 'llm_generation_kwargs'                                                         │ │
│ │       self = MagpieGenerator(                                                                │ │
│ │              │   exclude_from_signature={                                                    │ │
│ │              │   │   'llm_offline_batch_generation_block_until_done',                        │ │
│ │              │   │   'disable_cuda_device_placement',                                        │ │
│ │              │   │   'type_info',                                                            │ │
│ │              │   │   'llm_jobs_ids',                                                         │ │
│ │              │   │   'exclude_from_signature',                                               │ │
│ │              │   │   'gpu_memory_utilization',                                               │ │
│ │              │   │   'resources',                                                            │ │
│ │              │   │   'input_batch_size'                                                      │ │
│ │              │   },                                                                          │ │
│ │              │   llm=vLLM(                                                                   │ │
│ │              │   │   cuda_devices='auto',                                                    │ │
│ │              │   │   disable_cuda_device_placement=False,                                    │ │
│ │              │   │   use_magpie_template=True,                                               │ │
│ │              │   │                                                                           │ │
│ │              magpie_pre_query_template='<|begin_of_text|><|start_header_id|>user<|end_heade… │ │
│ │              │   │   generation_kwargs={                                                     │ │
│ │              │   │   │   'temperature': 0.8,                                                 │ │
│ │              │   │   │   'top_p': 1.0,                                                       │ │
│ │              │   │   │   'max_new_tokens': 1024,                                             │ │
│ │              │   │   │   'stop': [                                                           │ │
│ │              │   │   │   │   '<|eot_id|>',                                                   │ │
│ │              │   │   │   │   '<|end_of_text|>',                                              │ │
│ │              │   │   │   │   '<|start_header_id|>',                                          │ │
│ │              │   │   │   │   '<|end_header_id|>'                                             │ │
│ │              │   │   │   ],                                                                  │ │
│ │              │   │   │   'stop_token_ids': [128009, 128001, 128006, 128007],                 │ │
│ │              │   │   │   'logits_processors': [                                              │ │
│ │              │   │   │   │   <function de_md_logits_processor_for_llama3_1 at                │ │
│ │              0x7f4350587ac0>                                                                 │ │
│ │              │   │   │   ]                                                                   │ │
│ │              │   │   },                                                                      │ │
│ │              │   │   use_offline_batch_generation=False,                                     │ │
│ │              │   │   offline_batch_generation_block_until_done=None,                         │ │
│ │              │   │   jobs_ids=None,                                                          │ │
│ │              │   │   model='meta-llama/Meta-Llama-3.1-405B-Instruct-FP8',                    │ │
│ │              │   │   dtype='auto',                                                           │ │
│ │              │   │   trust_remote_code=False,                                                │ │
│ │              │   │   quantization=None,                                                      │ │
│ │              │   │   revision=None,                                                          │ │
│ │              │   │   tokenizer='meta-llama/Meta-Llama-3.1-405B-Instruct-FP8',                │ │
│ │              │   │   tokenizer_mode='auto',                                                  │ │
│ │              │   │   tokenizer_revision=None,                                                │ │
│ │              │   │   skip_tokenizer_init=False,                                              │ │
│ │              │   │   chat_template=None,                                                     │ │
│ │              │   │   seed=0,                                                                 │ │
│ │              │   │   extra_kwargs={                                                          │ │
│ │              │   │   │   'tensor_parallel_size': 8,                                          │ │
│ │              │   │   │   'max_model_len': 8192,                                              │ │
│ │              │   │   │   'enable_prefix_caching': True                                       │ │
│ │              │   │   },                                                                      │ │
│ │              │   │   structured_output=None                                                  │ │
│ │              │   ),                                                                          │ │
│ │              │   n_turns=3,                                                                  │ │
│ │              │   end_with_user=False,                                                        │ │
│ │              │   include_system_prompt=False,                                                │ │
│ │              │   only_instruction=False,                                                     │ │
│ │              │   system_prompt={                                                             │ │
│ │              │   │   'information-seeking': (                                                │ │
│ │              │   │   │   'You are an AI assistant designed to provide accurate and concise   │ │
│ │              information on '+898,                                                           │ │
│ │              │   │   │   0.05                                                                │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'reasoning': (                                                          │ │
│ │              │   │   │   'You are an AI assistant specialized in logical thinking and        │ │
│ │              problem-solving. The'+1021,                                                     │ │
│ │              │   │   │   0.125                                                               │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'planning': (                                                           │ │
│ │              │   │   │   'You are an AI assistant focused on helping users create effective  │ │
│ │              plans and stra'+987,                                                            │ │
│ │              │   │   │   0.05                                                                │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'editing': (                                                            │ │
│ │              │   │   │   'You are an AI assistant specialized in editing and improving       │ │
│ │              written content. Th'+924,                                                       │ │
│ │              │   │   │   0.1                                                                 │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'coding': (                                                             │ │
│ │              │   │   │   'You are an AI assistant designed to help with programming tasks.   │ │
│ │              The user will '+1002,                                                           │ │
│ │              │   │   │   0.125                                                               │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'math': (                                                               │ │
│ │              │   │   │   'You are an AI assistant specializing in mathematics, capable of    │ │
│ │              addressing quest'+2081,                                                         │ │
│ │              │   │   │   0.125                                                               │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'role-playing': (                                                       │ │
│ │              │   │   │   'You are an AI assistant capable of engaging in various             │ │
│ │              role-playing scenarios. T'+950,                                                 │ │
│ │              │   │   │   0.1                                                                 │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'data-analysis': (                                                      │ │
│ │              │   │   │   'You are an AI assistant specialized in data analysis and           │ │
│ │              interpretation.  The us'+1081,                                                  │ │
│ │              │   │   │   0.125                                                               │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'creative-writing': (                                                   │ │
│ │              │   │   │   'You are an AI assistant designed to support creative writing       │ │
│ │              endeavors.  The use'+1055,                                                      │ │
│ │              │   │   │   0.1                                                                 │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'advice-seeking': (                                                     │ │
│ │              │   │   │   'You are an AI assistant focused on providing thoughtful advice and │ │
│ │              guidance. The'+1008,                                                            │ │
│ │              │   │   │   0.05                                                                │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   ... +1                                                                  │ │
│ │              │   },                                                                          │ │
│ │              │   name='magpie_generator_0',                                                  │ │
│ │              │   resources=StepResources(                                                    │ │
│ │              │   │   replicas=1,                                                             │ │
│ │              │   │   cpus=None,                                                              │ │
│ │              │   │   gpus=8,                                                                 │ │
│ │              │   │   memory=None,                                                            │ │
│ │              │   │   resources=None                                                          │ │
│ │              │   ),                                                                          │ │
│ │              │   input_mappings={},                                                          │ │
│ │              │   output_mappings={},                                                         │ │
│ │              │   use_cache=True,                                                             │ │
│ │              │   batch_size=250,                                                             │ │
│ │              │   group_generations=False,                                                    │ │
│ │              │   add_raw_output=True,                                                        │ │
│ │              │   add_raw_input=True,                                                         │ │
│ │              │   num_generations=1,                                                          │ │
│ │              │   use_default_structured_output=False,                                        │ │
│ │              │   num_rows=1000                                                               │ │
│ │              )                                                                               │ │
│ │        sep = '_'                                                                             │ │
│ │          v = [<function de_md_logits_processor_for_llama3_1 at 0x7f4350587ac0>]              │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│                                                                                                  │
│ /home/ouhenio/miniconda3/envs/instructor/lib/python3.10/site-packages/distilabel/mixins/signatur │
│ e.py:63 in flatten_dump                                                                          │
│                                                                                                  │
│   60 │   │                                                                                       │
│   61 │   │   def flatten_dump(d: Any, parent_key: str = "", sep: str = "_") -> List:             │
│   62 │   │   │   items = []                                                                      │
│ ❱ 63 │   │   │   for k, v in d.items():                                                          │
│   64 │   │   │   │   new_key = parent_key + sep + k if parent_key else k                         │
│   65 │   │   │   │   if isinstance(v, dict):                                                     │
│   66 │   │   │   │   │   items.extend(flatten_dump(v, new_key, sep=sep))                         │
│                                                                                                  │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │      items = []                                                                              │ │
│ │ parent_key = 'llm_generation_kwargs_logits_processors-0'                                     │ │
│ │       self = MagpieGenerator(                                                                │ │
│ │              │   exclude_from_signature={                                                    │ │
│ │              │   │   'llm_offline_batch_generation_block_until_done',                        │ │
│ │              │   │   'disable_cuda_device_placement',                                        │ │
│ │              │   │   'type_info',                                                            │ │
│ │              │   │   'llm_jobs_ids',                                                         │ │
│ │              │   │   'exclude_from_signature',                                               │ │
│ │              │   │   'gpu_memory_utilization',                                               │ │
│ │              │   │   'resources',                                                            │ │
│ │              │   │   'input_batch_size'                                                      │ │
│ │              │   },                                                                          │ │
│ │              │   llm=vLLM(                                                                   │ │
│ │              │   │   cuda_devices='auto',                                                    │ │
│ │              │   │   disable_cuda_device_placement=False,                                    │ │
│ │              │   │   use_magpie_template=True,                                               │ │
│ │              │   │                                                                           │ │
│ │              magpie_pre_query_template='<|begin_of_text|><|start_header_id|>user<|end_heade… │ │
│ │              │   │   generation_kwargs={                                                     │ │
│ │              │   │   │   'temperature': 0.8,                                                 │ │
│ │              │   │   │   'top_p': 1.0,                                                       │ │
│ │              │   │   │   'max_new_tokens': 1024,                                             │ │
│ │              │   │   │   'stop': [                                                           │ │
│ │              │   │   │   │   '<|eot_id|>',                                                   │ │
│ │              │   │   │   │   '<|end_of_text|>',                                              │ │
│ │              │   │   │   │   '<|start_header_id|>',                                          │ │
│ │              │   │   │   │   '<|end_header_id|>'                                             │ │
│ │              │   │   │   ],                                                                  │ │
│ │              │   │   │   'stop_token_ids': [128009, 128001, 128006, 128007],                 │ │
│ │              │   │   │   'logits_processors': [                                              │ │
│ │              │   │   │   │   <function de_md_logits_processor_for_llama3_1 at                │ │
│ │              0x7f4350587ac0>                                                                 │ │
│ │              │   │   │   ]                                                                   │ │
│ │              │   │   },                                                                      │ │
│ │              │   │   use_offline_batch_generation=False,                                     │ │
│ │              │   │   offline_batch_generation_block_until_done=None,                         │ │
│ │              │   │   jobs_ids=None,                                                          │ │
│ │              │   │   model='meta-llama/Meta-Llama-3.1-405B-Instruct-FP8',                    │ │
│ │              │   │   dtype='auto',                                                           │ │
│ │              │   │   trust_remote_code=False,                                                │ │
│ │              │   │   quantization=None,                                                      │ │
│ │              │   │   revision=None,                                                          │ │
│ │              │   │   tokenizer='meta-llama/Meta-Llama-3.1-405B-Instruct-FP8',                │ │
│ │              │   │   tokenizer_mode='auto',                                                  │ │
│ │              │   │   tokenizer_revision=None,                                                │ │
│ │              │   │   skip_tokenizer_init=False,                                              │ │
│ │              │   │   chat_template=None,                                                     │ │
│ │              │   │   seed=0,                                                                 │ │
│ │              │   │   extra_kwargs={                                                          │ │
│ │              │   │   │   'tensor_parallel_size': 8,                                          │ │
│ │              │   │   │   'max_model_len': 8192,                                              │ │
│ │              │   │   │   'enable_prefix_caching': True                                       │ │
│ │              │   │   },                                                                      │ │
│ │              │   │   structured_output=None                                                  │ │
│ │              │   ),                                                                          │ │
│ │              │   n_turns=3,                                                                  │ │
│ │              │   end_with_user=False,                                                        │ │
│ │              │   include_system_prompt=False,                                                │ │
│ │              │   only_instruction=False,                                                     │ │
│ │              │   system_prompt={                                                             │ │
│ │              │   │   'information-seeking': (                                                │ │
│ │              │   │   │   'You are an AI assistant designed to provide accurate and concise   │ │
│ │              information on '+898,                                                           │ │
│ │              │   │   │   0.05                                                                │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'reasoning': (                                                          │ │
│ │              │   │   │   'You are an AI assistant specialized in logical thinking and        │ │
│ │              problem-solving. The'+1021,                                                     │ │
│ │              │   │   │   0.125                                                               │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'planning': (                                                           │ │
│ │              │   │   │   'You are an AI assistant focused on helping users create effective  │ │
│ │              plans and stra'+987,                                                            │ │
│ │              │   │   │   0.05                                                                │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'editing': (                                                            │ │
│ │              │   │   │   'You are an AI assistant specialized in editing and improving       │ │
│ │              written content. Th'+924,                                                       │ │
│ │              │   │   │   0.1                                                                 │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'coding': (                                                             │ │
│ │              │   │   │   'You are an AI assistant designed to help with programming tasks.   │ │
│ │              The user will '+1002,                                                           │ │
│ │              │   │   │   0.125                                                               │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'math': (                                                               │ │
│ │              │   │   │   'You are an AI assistant specializing in mathematics, capable of    │ │
│ │              addressing quest'+2081,                                                         │ │
│ │              │   │   │   0.125                                                               │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'role-playing': (                                                       │ │
│ │              │   │   │   'You are an AI assistant capable of engaging in various             │ │
│ │              role-playing scenarios. T'+950,                                                 │ │
│ │              │   │   │   0.1                                                                 │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'data-analysis': (                                                      │ │
│ │              │   │   │   'You are an AI assistant specialized in data analysis and           │ │
│ │              interpretation.  The us'+1081,                                                  │ │
│ │              │   │   │   0.125                                                               │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'creative-writing': (                                                   │ │
│ │              │   │   │   'You are an AI assistant designed to support creative writing       │ │
│ │              endeavors.  The use'+1055,                                                      │ │
│ │              │   │   │   0.1                                                                 │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   'advice-seeking': (                                                     │ │
│ │              │   │   │   'You are an AI assistant focused on providing thoughtful advice and │ │
│ │              guidance. The'+1008,                                                            │ │
│ │              │   │   │   0.05                                                                │ │
│ │              │   │   ),                                                                      │ │
│ │              │   │   ... +1                                                                  │ │
│ │              │   },                                                                          │ │
│ │              │   name='magpie_generator_0',                                                  │ │
│ │              │   resources=StepResources(                                                    │ │
│ │              │   │   replicas=1,                                                             │ │
│ │              │   │   cpus=None,                                                              │ │
│ │              │   │   gpus=8,                                                                 │ │
│ │              │   │   memory=None,                                                            │ │
│ │              │   │   resources=None                                                          │ │
│ │              │   ),                                                                          │ │
│ │              │   input_mappings={},                                                          │ │
│ │              │   output_mappings={},                                                         │ │
│ │              │   use_cache=True,                                                             │ │
│ │              │   batch_size=250,                                                             │ │
│ │              │   group_generations=False,                                                    │ │
│ │              │   add_raw_output=True,                                                        │ │
│ │              │   add_raw_input=True,                                                         │ │
│ │              │   num_generations=1,                                                          │ │
│ │              │   use_default_structured_output=False,                                        │ │
│ │              │   num_rows=1000                                                               │ │
│ │              )                                                                               │ │
│ │        sep = '_'                                                                             │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: 'function' object has no attribute 'items'

Any idea what could be going on? I'm using a node with 8 A100s.


ouhenio · 2024-12-27T19:45:17Z

OK, after looking at the current state of magpie-align, I deleted de_md_logits_processor_for_llama3_1 and the function call that follows it.

At least now it runs, although I don't know if the results will be affected by this.
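
For anyone hitting the same error, here is a minimal sketch of what the traceback seems to show. It is a simplified stand-in modeled on the lines of flatten_dump visible above, not distilabel's actual code: the list-handling condition is an assumption (those source lines are not shown in the traceback), the exclude_from_signature check is left out, and the `tolerant` flag and the name-based fallback are hypothetical additions for illustration only.

```python
from typing import Any, List, Tuple


def flatten_dump(
    d: Any, parent_key: str = "", sep: str = "_", tolerant: bool = False
) -> List[Tuple[str, Any]]:
    """Simplified stand-in for the helper shown in the traceback."""
    if tolerant and not isinstance(d, dict):
        # Hypothetical fallback: record a non-dict leaf (e.g. a function) by name.
        return [(parent_key, getattr(d, "__name__", repr(d)))]
    items: List[Tuple[str, Any]] = []
    for k, v in d.items():  # fails here when d is a function
        new_key = parent_key + sep + k if parent_key else k
        if isinstance(v, dict):
            items.extend(flatten_dump(v, new_key, sep=sep, tolerant=tolerant))
        elif isinstance(v, list):
            # Assumption: lists of plain values are joined, anything else is recursed into.
            if all(isinstance(x, (str, int, float, bool)) for x in v):
                items.append((new_key, "-".join(map(str, v))))
            else:
                for i, x in enumerate(v):
                    items.extend(
                        flatten_dump(x, f"{new_key}-{i}", sep=sep, tolerant=tolerant)
                    )
        else:
            items.append((new_key, v))
    return items


def de_md_logits_processor(token_ids, logits):
    """Placeholder callable standing in for the real logits processor."""
    return logits


generation_kwargs = {
    "temperature": 0.8,
    "stop_token_ids": [128009, 128001],
    "logits_processors": [de_md_logits_processor],
}

# Strict mode: the function inside the list is treated as a dict and .items() fails.
error_message = None
try:
    flatten_dump({"generation_kwargs": generation_kwargs}, "llm")
except AttributeError as exc:
    error_message = str(exc)

# Tolerant mode: the function is recorded by name instead of being recursed into.
flat = flatten_dump({"generation_kwargs": generation_kwargs}, "llm", tolerant=True)
print(error_message)
print(flat)
```

If this reading is right, deleting the logits processor makes the error go away simply because the list of functions is no longer in generation_kwargs. Whether that changes the generated data is a separate question that this sketch does not answer.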
