
Is it possible to support Transformer Engine when using LoRA in Megatron? #2260

Open
liulong11 opened this issue Dec 5, 2024 · 2 comments

Comments

@liulong11

Feature request

I am currently using the Megatron framework and want to train with LoRA. I saw that Megatron layers are supported: in https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/tp_layer.py, RowParallelLinear and ColumnParallelLinear are adapted. But if I use Transformer Engine, the corresponding TELayerNormColumnParallelLinear and TERowParallelLinear are not adapted.
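
To make concrete what I mean by "adapted", here is a rough sketch of the kind of wrapper I have in mind. This is plain PyTorch and not the PEFT API: the LoRA branch uses ordinary nn.Linear instead of the sharded lora_A / lora_B that tp_layer.py creates, and it assumes the TE layer's forward returns an (output, bias) tuple as Megatron-Core's TERowParallelLinear / TELayerNormColumnParallelLinear do.

```python
import math

import torch.nn as nn


class LoRATEWrapper(nn.Module):
    """Hypothetical LoRA wrapper around a TE parallel linear layer.

    Assumes the base layer's forward returns an (output, bias) tuple.
    The LoRA branch is deliberately unsharded to keep the sketch short.
    """

    def __init__(self, base_layer, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base_layer = base_layer
        for p in self.base_layer.parameters():
            p.requires_grad = False          # freeze the original TE weights
        self.lora_A = nn.Linear(in_features, r, bias=False)
        self.lora_B = nn.Linear(r, out_features, bias=False)
        self.scaling = alpha / r
        nn.init.kaiming_uniform_(self.lora_A.weight, a=math.sqrt(5))
        nn.init.zeros_(self.lora_B.weight)   # LoRA update starts as a no-op

    def forward(self, x, *args, **kwargs):
        out, bias = self.base_layer(x, *args, **kwargs)
        out = out + self.lora_B(self.lora_A(x)) * self.scaling
        return out, bias
```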

Motivation

This would improve support for LoRA training with the Megatron framework.

Your contribution

I don't have a PR.

@BenjaminBossan
Member

Unfortunately, I don't have any experience with this so I can't really give any tips. Do you have a reproducer for this issue? Also a gentle ping @zhangsheng377 just in case.

@zhangsheng377
Contributor

Ha, actually I have never used TE directly. But we reserved a backend parameter when writing the distributed LoRA support, so you could check whether it can be pointed at your TE layers. If that turns out to be inconvenient, you could consider writing an adaptation layer class to modify the model.
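
For reference, the relevant knobs live on LoraConfig. A minimal sketch, assuming a Megatron GPT model: the target module names and the transformer_config / megatron_model objects below are placeholders, and whether this backend path can be redirected to TE layers is exactly the open question in this issue.

```python
from peft import LoraConfig, get_peft_model

# Sketch only: LoraConfig's megatron_config / megatron_core fields let
# tp_layer.py build lora_A / lora_B with Megatron's tensor-parallel backend.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["linear_qkv", "linear_proj"],  # hypothetical TE module names
    megatron_config=transformer_config,  # placeholder: your Megatron TransformerConfig
    megatron_core="megatron.core",       # module PEFT imports as the parallel backend
)
peft_model = get_peft_model(megatron_model, lora_config)  # placeholder model
```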
