How do we fine-tune the expanded blocks? #3
Comments
@hills-code How do we train the expanded blocks?
Thanks for your interest! I have uploaded the training code under this repo; you can also take a look at https://github.com/hills-code/open-instruct/tree/llama-pro
This repo covers the SFT stage, where all parameters are trained together, exactly as in ordinary SFT. Parameters are only frozen during the pretraining stage; the specific operation is here: https://github.com/hills-code/open-instruct/blob/7c2b14d3d319028c68657946ca2c16b248f866e8/open_instruct/customized_trainer.py#L53
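For reference, the freezing step amounts to something like the following minimal sketch (not the repo's actual code): load the expanded checkpoint and set `requires_grad` only on the newly inserted decoder layers. The model path and the layer indices in `new_layer_ids` are hypothetical placeholders; substitute the indices where your checkpoint's new blocks were actually inserted.

```python
from transformers import AutoModelForCausalLM

# Hypothetical path to a block-expanded checkpoint.
model = AutoModelForCausalLM.from_pretrained("path/to/expanded-model")

# Hypothetical indices of the newly inserted decoder layers,
# e.g. 32 original layers with 4 new blocks interleaved.
new_layer_ids = {8, 17, 26, 35}

for name, param in model.named_parameters():
    # LLaMA parameters are named "model.layers.<idx>.<...>".
    parts = name.split(".")
    is_new = (
        len(parts) > 2
        and parts[1] == "layers"
        and parts[2].isdigit()
        and int(parts[2]) in new_layer_ids
    )
    # Train only the new blocks; everything else (original blocks,
    # embeddings, lm_head) stays frozen during continued pretraining.
    param.requires_grad = is_new

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable:,}")
```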
I read customized_trainer.py, but that file is only used to expand the model, i.e. to turn a 7B model into an 8B model; the added blocks are still in their initial state and have not been trained. Is there a demo of running a pretraining round on these new blocks only (i.e., with all the original blocks frozen)?
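For context, the expansion step itself can be sketched as below. This assumes the LLaMA-Pro recipe as described in the paper: each new block is a copy of the preceding block with its attention output projection and MLP down projection zeroed, so it starts as an identity map; the new blocks then need the frozen-original pretraining round discussed above. The model name, `interval`, and output path are placeholders, and this may differ from what customized_trainer.py actually does.

```python
import copy
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
layers = model.model.layers
interval = 8  # one new block after every 8 original blocks (32 -> 36 layers)

expanded = torch.nn.ModuleList()
for i, layer in enumerate(layers):
    expanded.append(layer)
    if (i + 1) % interval == 0:
        new_layer = copy.deepcopy(layer)
        # Zero the residual-branch output projections so the new block
        # is an identity map at initialization.
        new_layer.self_attn.o_proj.weight.data.zero_()
        new_layer.mlp.down_proj.weight.data.zero_()
        expanded.append(new_layer)

model.model.layers = expanded
model.config.num_hidden_layers = len(expanded)
# Note: recent transformers versions store a layer_idx on each attention
# module for KV caching; re-number those after reordering if needed.
model.save_pretrained("path/to/expanded-model")
```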