Add initial support for RWKV v6 #174
Conversation
Sequence inference doesn't work correctly yet. Signed-off-by: Molly Sophia <[email protected]>
Thanks to @cryscan Signed-off-by: Molly Sophia <[email protected]>
Signed-off-by: Molly Sophia <[email protected]>
Hi!
Testing
No one can expect an engineer to test every single combination after each commit. I definitely did not expect it from myself, so that's why these tests were added. I would strongly recommend training a tiny test model for v6 as well. Unfortunately, the code that I've used to train the existing tiny models …
Hi,
The tiny model can output correct words, so I guess this is enough?
Update 1: I've corrected some buggy code, and the diff sums of the sanity checks don't look insane now :D
The expected_difference_sum values need to be determined after making sure the FP32 logits are nearly the same. Signed-off-by: Molly Sophia <[email protected]>
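For context, a minimal sketch of what such a diff-sum sanity check could look like, assuming a hypothetical diff_sum helper and a pre-measured expected_difference_sum per quantization format. This is an illustration of the idea discussed above, not the actual test code added in this PR.

```c
// Hypothetical sketch (not the PR's actual test code): sum the absolute differences
// between FP32 reference logits and the logits of a quantized model, then compare
// the result against a pre-measured expected_difference_sum.
#include <math.h>
#include <stddef.h>
#include <stdio.h>

static float diff_sum(const float * reference, const float * actual, size_t n_vocab) {
    float sum = 0.0f;
    for (size_t i = 0; i < n_vocab; i++) {
        sum += fabsf(reference[i] - actual[i]);
    }
    return sum;
}

int main(void) {
    // Dummy values just to show the calculation; real tests would use model logits.
    const float reference[] = {0.10f, -1.25f, 3.40f};
    const float actual[]    = {0.11f, -1.20f, 3.38f};
    printf("diff sum: %f\n", diff_sum(reference, actual, 3));

    // Usage idea: after evaluating the same prompt with the FP32 and quantized models,
    // assert that the measured sum stays close to the expected value, e.g.:
    // assert(fabsf(diff_sum(fp32_logits, q5_1_logits, n_vocab) - expected_difference_sum) < tolerance);
    return 0;
}
```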
e7a1f2a to d1f95d0
Signed-off-by: Molly Sophia <[email protected]>
I suggest some minor code style improvements.
Great work! There are a couple of things remaining:
I will be able to approve the PR, but I'm not sure I can merge it. Four months ago I handed the repository over to the RWKV Foundation and stepped down as a maintainer. Last time I checked, @LaylBongers was the new maintainer. In any case, I think the PR should be reviewed and merged by a new maintainer, whoever they are.
Thanks! I'll make the changes later today.
Signed-off-by: Molly Sophia <[email protected]>
@saharNooby Hi! I've applied the changes mentioned above.
Regarding this, the assertion happens at:
Using lldb, I can see that it happens in the FFN's mul_mat(vw, k), where vw has the shape [n_embed, dim_ffn = int((n_embd * 3.5) // 32 * 32)]. I wonder what's the best solution here? Re-train a tiny-rwkv v6 with n_embed = 128, or fix ggml?
There is a concern that this assertion also fails for larger RWKV v6 models. If so, it would mean macOS inference is broken. Is it possible for you to verify it? (Probably just need to know the relevant dims of the models.) If larger models are OK, then I think retraining the tiny model is the way to go.
Although, if this is a ggml limitation, maybe it's already fixed in upstream ggml?
It should be okay if the dims of these matmul operands are all even multiples of 32. For the ChannelMixing weights, there won't be any problem as long as n_embed >= 128. dim_att is equal to n_embed by default, which seems to be okay for all the existing models for now (n_embed = [512, 1024, 2048, 4096]).
Unfortunately, it isn't fixed upstream either :P
I just tested RWKV v6 1.6B/3B/7B with Q5_0/Q5_1 on my M1 MacBook, and they all generate completions normally without failing the assertion. So I guess I can just re-train a tiny-rwkv with n_embed=128 tomorrow :P Maybe fix ggml someday when an irregular rwkv6 is released.
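As a small sketch of the dim check being discussed, the snippet below evaluates the dim_ffn formula quoted earlier and tests the "even multiple of 32" (i.e. multiple of 64) condition. The n_embd values are the ones mentioned in this thread plus 64 as a hypothetical tiny test size; this is an assumption-based illustration, not code from the PR.

```c
// Sketch (assumption-based): compute RWKV v6 dim_ffn from n_embd using the formula
// quoted above, and check whether it is an even multiple of 32 (a multiple of 64),
// which is the condition discussed for the Metal matmul path.
#include <stdio.h>

static int rwkv6_dim_ffn(int n_embd) {
    // C equivalent of the Python expression int((n_embd * 3.5) // 32 * 32).
    return (int)(n_embd * 3.5) / 32 * 32;
}

int main(void) {
    // 64 is a hypothetical tiny test size; the rest are the sizes mentioned above.
    const int sizes[] = {64, 128, 512, 1024, 2048, 4096};

    for (int i = 0; i < (int)(sizeof(sizes) / sizeof(sizes[0])); i++) {
        const int n_embd  = sizes[i];
        const int dim_ffn = rwkv6_dim_ffn(n_embd);
        printf("n_embd=%4d  dim_ffn=%5d  multiple of 64: %s\n",
               n_embd, dim_ffn, dim_ffn % 64 == 0 ? "yes" : "no");
    }

    return 0;
}
```

Under these assumptions, n_embd = 64 gives dim_ffn = 224 (not a multiple of 64), while n_embd = 128 and the larger sizes listed above all pass the check, which is consistent with re-training the tiny model at n_embed = 128.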
Signed-off-by: Molly Sophia <[email protected]>
Signed-off-by: Molly Sophia <[email protected]>
I've looked over the code and run a smoke test. Everything looks good to me. I see you've already included a tiny-rwkv for the tests by now. If there's nothing else left, I'll have it merged in.
I guess there's nothing else left for now?
Add initial support for RWKV v6.
Tested sequence prefill and normal inference with RWKV v6 1.6B; it produces exactly the same text as rwkv-pip-package when generating with FP32, top_k=0.
Precise testing of logit differences is not done yet.
Regarding the tests, I wonder whether training a tiny-rwkv-v6 is needed (since rwkv-x060-173m-pile is still a bit large for testing use)?
Edit: I'm new to ggml :P Feel free to point out anything optimizable.
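For illustration, here is a rough sketch of how the sequence-prefill vs. token-by-token comparison described above could be checked against the rwkv.cpp C API. The function names and signatures (rwkv_init_from_file, rwkv_eval, rwkv_eval_sequence, rwkv_get_state_len, rwkv_get_logits_len, rwkv_free) reflect my reading of rwkv.h rather than anything taken from this PR, and the model path and token IDs are placeholders; treat this as a sketch under those assumptions.

```c
// Hedged sketch: prefill a prompt in one rwkv_eval_sequence() call, then feed the
// same prompt token by token with rwkv_eval(), and compare the resulting logits.
// API names/signatures are assumed from rwkv.h; verify against the actual header.
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#include "rwkv.h"

int main(void) {
    const uint32_t prompt[] = {510, 3158, 2385, 2};   // placeholder token IDs
    const size_t   n_tokens = sizeof(prompt) / sizeof(prompt[0]);

    // Placeholder model path; 4 CPU threads.
    struct rwkv_context * ctx = rwkv_init_from_file("rwkv-v6-1b6.bin", 4);
    if (!ctx) { return 1; }

    const size_t state_len  = rwkv_get_state_len(ctx);
    const size_t logits_len = rwkv_get_logits_len(ctx);

    float * state_seq  = calloc(state_len,  sizeof(float));
    float * state_tok  = calloc(state_len,  sizeof(float));
    float * logits_seq = calloc(logits_len, sizeof(float));
    float * logits_tok = calloc(logits_len, sizeof(float));

    // Path 1: sequence prefill in a single call (NULL state_in = fresh state).
    rwkv_eval_sequence(ctx, prompt, n_tokens, NULL, state_seq, logits_seq);

    // Path 2: the same prompt fed one token at a time, reusing the state buffer in place.
    rwkv_eval(ctx, prompt[0], NULL, state_tok, logits_tok);
    for (size_t i = 1; i < n_tokens; i++) {
        rwkv_eval(ctx, prompt[i], state_tok, state_tok, logits_tok);
    }

    // The two paths should produce (nearly) identical final logits.
    float diff_sum = 0.0f;
    for (size_t i = 0; i < logits_len; i++) {
        diff_sum += fabsf(logits_seq[i] - logits_tok[i]);
    }
    printf("logits diff sum: %f\n", diff_sum);

    free(state_seq); free(state_tok); free(logits_seq); free(logits_tok);
    rwkv_free(ctx);
    return 0;
}
```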