DeepSpeed ZeRO-3: how to save a custom model? #3309

NLPJCL · 2024-12-21T17:01:17Z

DeepSpeedEngine(
  (module): LLMDecoder(
    (model): Qwen2ForSequenceClassification(
      (model): Qwen2Model(
        (embed_tokens): Embedding(151936, 1536)
        (layers): ModuleList(
          (0-27): 28 x Qwen2DecoderLayer(
            (self_attn): Qwen2SdpaAttention(
              (q_proj): Linear(in_features=1536, out_features=1536, bias=True)
              (k_proj): Linear(in_features=1536, out_features=256, bias=True)
              (v_proj): Linear(in_features=1536, out_features=256, bias=True)
              (o_proj): Linear(in_features=1536, out_features=1536, bias=False)
              (rotary_emb): Qwen2RotaryEmbedding()
            )
            (mlp): Qwen2MLP(
              (gate_proj): Linear(in_features=1536, out_features=8960, bias=False)
              (up_proj): Linear(in_features=1536, out_features=8960, bias=False)
              (down_proj): Linear(in_features=8960, out_features=1536, bias=False)
              (act_fn): SiLU()
            )
            (input_layernorm): Qwen2RMSNorm((0,), eps=1e-06)
            (post_attention_layernorm): Qwen2RMSNorm((0,), eps=1e-06)
          )
        )
        (norm): Qwen2RMSNorm((0,), eps=1e-06)
        (rotary_emb): Qwen2RotaryEmbedding()
      )
      (score): Linear(in_features=1536, out_features=1, bias=False)
    )
  )
)
Hello, the above is my model structure. In short, I use a custom LLMDecoder module whose model attribute is a Qwen2ForSequenceClassification object.
How should I save this model when training with DeepSpeed ZeRO-3?

The following code does not work for my model structure. How should I modify it?

unwrapped_model = accelerator.unwrap_model(model)
unwrapped_model.save_pretrained(
    args.output_dir,
    is_main_process=accelerator.is_main_process,
    save_function=accelerator.save,
    state_dict=accelerator.get_state_dict(model),
)
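
One possible approach is sketched below. It rests on assumptions this issue does not confirm: that LLMDecoder is a plain torch.nn.Module without its own save_pretrained, and that the state dict returned by accelerator.get_state_dict(model) uses keys prefixed with the attribute name (model.). The helper names strip_prefix and save_inner_model are hypothetical. The idea is to gather the full ZeRO-3 state dict through the wrapper, drop the leading model. prefix, and call save_pretrained on the inner Qwen2ForSequenceClassification rather than on the wrapper.

```python
from collections import OrderedDict


def strip_prefix(state_dict, prefix="model."):
    """Keep only keys under `prefix` and remove that prefix from them.

    Returns None unchanged, since the gathered state dict may be None
    on ranks that do not hold the consolidated weights.
    """
    if state_dict is None:
        return None
    return OrderedDict(
        (key[len(prefix):], value)
        for key, value in state_dict.items()
        if key.startswith(prefix)
    )


def save_inner_model(accelerator, model, output_dir, inner_attr="model"):
    """Hypothetical helper: save the Hugging Face model held by a custom wrapper.

    Assumes `model` was returned by accelerator.prepare(...) and that the
    wrapper exposes the Hugging Face model as the attribute `inner_attr`.
    """
    # Gather the full (unsharded) weights; call this on every process.
    full_state_dict = accelerator.get_state_dict(model)

    # Unwrap DeepSpeedEngine -> LLMDecoder, then pick the inner HF model.
    wrapper = accelerator.unwrap_model(model)
    inner = getattr(wrapper, inner_attr)

    inner.save_pretrained(
        output_dir,
        is_main_process=accelerator.is_main_process,
        save_function=accelerator.save,
        # Keys such as "model.score.weight" become "score.weight".
        state_dict=strip_prefix(full_state_dict, prefix=inner_attr + "."),
    )
```

If this fits the setup, save_inner_model(accelerator, model, args.output_dir) would be called on all processes, not only the main one, because gathering ZeRO-3 shards is a collective operation. With Accelerate, get_state_dict under ZeRO-3 also expects stage3_gather_16bit_weights_on_model_save to be enabled in the DeepSpeed config.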
