
Is it possible to support Transformer Engine when using LoRA in Megatron? #2260

Open
liulong11 opened this issue Dec 5, 2024 · 2 comments

Comments

@liulong11

Feature request

I am currently using the Megatron framework and want to train with LoRA. I saw that Megatron layers are supported: in https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/tp_layer.py, RowParallelLinear and ColumnParallelLinear are adapted. But if I use Transformer Engine, the corresponding TELayerNormColumnParallelLinear and TERowParallelLinear are not adapted.
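
To make concrete what I mean by "adapted", here is a rough sketch of the kind of wrapper I have in mind. This is plain PyTorch and not the PEFT API: the LoRA branch uses ordinary nn.Linear instead of the sharded lora_A / lora_B that tp_layer.py creates, and it assumes the TE layer's forward returns an (output, bias) tuple as Megatron-Core's TERowParallelLinear / TELayerNormColumnParallelLinear do.

```python
import math

import torch.nn as nn


class LoRATEWrapper(nn.Module):
    """Hypothetical LoRA wrapper around a TE parallel linear layer.

    Assumes the base layer's forward returns an (output, bias) tuple.
    The LoRA branch is deliberately unsharded to keep the sketch short.
    """

    def __init__(self, base_layer, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base_layer = base_layer
        for p in self.base_layer.parameters():
            p.requires_grad = False          # freeze the original TE weights
        self.lora_A = nn.Linear(in_features, r, bias=False)
        self.lora_B = nn.Linear(r, out_features, bias=False)
        self.scaling = alpha / r
        nn.init.kaiming_uniform_(self.lora_A.weight, a=math.sqrt(5))
        nn.init.zeros_(self.lora_B.weight)   # LoRA update starts as a no-op

    def forward(self, x, *args, **kwargs):
        out, bias = self.base_layer(x, *args, **kwargs)
        out = out + self.lora_B(self.lora_A(x)) * self.scaling
        return out, bias
```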

Motivation

This would improve support for LoRA training with the Megatron framework.

Your contribution

I don't have a PR.

@BenjaminBossan
Member

Unfortunately, I don't have any experience with this so I can't really give any tips. Do you have a reproducer for this issue? Also a gentle ping @zhangsheng377 just in case.

@zhangsheng377
Contributor

Ha, actually I have never used TE directly. But we reserved a backend parameter when writing the distributed LoRA support, so you could check whether it can be pointed at your TE layers. If that turns out to be inconvenient, you could consider writing an adaptation layer class to modify the model.
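
For reference, the relevant knobs live on LoraConfig. A minimal sketch, assuming a Megatron GPT model: the target module names and the transformer_config / megatron_model objects below are placeholders, and whether this backend path can be redirected to TE layers is exactly the open question in this issue.

```python
from peft import LoraConfig, get_peft_model

# Sketch only: LoraConfig's megatron_config / megatron_core fields let
# tp_layer.py build lora_A / lora_B with Megatron's tensor-parallel backend.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["linear_qkv", "linear_proj"],  # hypothetical TE module names
    megatron_config=transformer_config,  # placeholder: your Megatron TransformerConfig
    megatron_core="megatron.core",       # module PEFT imports as the parallel backend
)
peft_model = get_peft_model(megatron_model, lora_config)  # placeholder model
```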
