paper for GLU Mult Bias? #275

Open
TimS-ml opened this issue Oct 4, 2024 · 3 comments

TimS-ml commented Oct 4, 2024

Hi:

Does GLU's mult_bias originally come from this paper? https://arxiv.org/pdf/2202.08906 It mentions Add Bias and Mult Bias on page 32. I could not find this in the README, so I am not sure.

Thanks! 😀

@lucidrains
Owner

@TimS-ml hey Tim, yes, that paper corroborates the technique, but i think it originated from earlier work within Google Brain (could be wrong)
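
For readers unfamiliar with the term, here is a minimal sketch of what a multiplicative bias on a GLU can look like. It assumes mult_bias is a learned per-channel scale, initialized to ones, that multiplies the gated output. The function names (`gelu`, `linear`, `glu`) and the weight layout are illustrative assumptions, and plain Python lists are used instead of tensors so the snippet runs without dependencies. It is not the library's implementation.

```python
import math


def gelu(v):
    # Exact GELU using the error function.
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def linear(x, weight, bias):
    # weight holds one row per output feature.
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def glu(x, weight, bias, mult_bias=None, activation=gelu):
    # Project to 2 * dim_out features, then split into value and gate halves.
    dim_out = len(weight) // 2
    projected = linear(x, weight, bias)
    value, gate = projected[:dim_out], projected[dim_out:]
    # Assumption: mult_bias is a per-channel scale; ones means "no effect".
    if mult_bias is None:
        mult_bias = [1.0] * dim_out
    return [v * activation(g) * m for v, g, m in zip(value, gate, mult_bias)]


x = [1.0, -2.0]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.0]]
bias = [0.0, 0.0, 0.0, 0.0]

plain = glu(x, weight, bias)
scaled = glu(x, weight, bias, mult_bias=[2.0, 0.5])
```

With the scale at its initial value of ones, the layer behaves like an ordinary GLU; training can then move each channel's scale away from one.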

TimS-ml commented Oct 30, 2024

@lucidrains Thanks! Should we add the original paper to the README? Let me see if I can find it.
BTW, I have another small question: right now AttentionLayer takes more than 55 input parameters, and I'm guessing that number will only go up as we implement more papers. Are there any software design patterns we could use to make the codebase easier to maintain? For example, grouping the input params further?
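
One common way to group parameters like this is nested config objects. The sketch below uses Python dataclasses. Every class and field name here (`AttentionConfig`, `FeedForwardConfig`, `LayerConfig`, `ToyAttentionLayer`) is hypothetical and chosen only for illustration; none of it is the library's actual API.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class AttentionConfig:
    # Hypothetical attention-related options.
    heads: int = 8
    dim_head: int = 64
    dropout: float = 0.0


@dataclass(frozen=True)
class FeedForwardConfig:
    # Hypothetical feedforward-related options.
    mult: int = 4
    glu: bool = False
    glu_mult_bias: bool = False


@dataclass(frozen=True)
class LayerConfig:
    dim: int
    depth: int
    attn: AttentionConfig = field(default_factory=AttentionConfig)
    ff: FeedForwardConfig = field(default_factory=FeedForwardConfig)


class ToyAttentionLayer:
    # Stand-in class: takes one config object instead of dozens of kwargs.
    def __init__(self, config: LayerConfig):
        self.config = config


base = LayerConfig(dim=512, depth=6)
# Derive a variant without mutating the base config.
with_glu = replace(base, ff=FeedForwardConfig(glu=True, glu_mult_bias=True))
layer = ToyAttentionLayer(with_glu)
```

The trade-off is that callers need to build a config object first, which is more verbose than flat keyword arguments for simple cases.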

@lucidrains
Owner

@TimS-ml it actually won't go up as fast as you think

very few techniques made it, and if anything, i will probably start removing certain ideas in coming releases

re: citation, yes, let us cite that if you can find it
