paper for GLU Mult Bias? #275

TimS-ml · 2024-10-04T07:25:44Z

Hi:

Is GLU's mult_bias originally from this paper? https://arxiv.org/pdf/2202.08906 It mentions Add Bias and Mult Bias on page 32. I could not find this in the README, so I am not sure.

Thanks! 😀

lucidrains · 2024-10-28T14:24:04Z

@TimS-ml hey Tim, yes that paper corroborates that technique, but i think it originated from another earlier work within google brain (could be wrong)
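
For readers unfamiliar with the term, here is a minimal sketch of what a multiplicative bias on a GLU could look like. It assumes the bias is a per-channel scale (defaulting to ones) applied to the gated output. The function name, argument layout, and the pure-Python list representation are illustrative assumptions, not this library's actual code.

```python
import math

def gelu(v):
    # Exact GELU via the error function.
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))

def glu_with_mult_bias(x, w, b, mult_bias=None, act=gelu):
    """Hypothetical GLU forward pass for a single input vector.

    x: list of dim_in floats
    w: 2 * dim_out rows, each a list of dim_in floats (value rows, then gate rows)
    b: list of 2 * dim_out floats (the ordinary additive bias of the projection)
    mult_bias: optional list of dim_out floats; assumed to default to all ones
    """
    dim_out = len(w) // 2
    # One linear projection producing both the value half and the gate half.
    proj = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
    value, gate = proj[:dim_out], proj[dim_out:]
    if mult_bias is None:
        mult_bias = [1.0] * dim_out  # with ones, this reduces to a plain GLU
    # Gated output, then the elementwise multiplicative bias.
    return [v * act(g) * m for v, g, m in zip(value, gate, mult_bias)]
```

With `mult_bias` left at its default of ones the output matches an ordinary GLU, so under this assumption the extra parameter only changes behavior once it is set or learned away from one.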

TimS-ml · 2024-10-30T09:00:20Z

@lucidrains Thanks! Should we add the original paper to the README? Let me see if I can find that paper.
BTW, I have another small question: right now the AttentionLayer takes more than 55 input parameters, and I'm guessing this number is only going to go up as we implement more papers. Are there any software design patterns we could use to make the codebase easier to maintain? Maybe grouping the input params further?
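
To make the grouping idea concrete, one option might be nested config dataclasses. This is only a sketch: every class and field name below (`AttentionConfig`, `FeedForwardConfig`, `LayerConfig`, `glu_mult_bias`, and so on) is hypothetical and does not reflect the current constructor signatures.

```python
from dataclasses import dataclass, field, replace

@dataclass(frozen=True)
class AttentionConfig:
    # Hypothetical attention-related options, grouped in one place.
    heads: int = 8
    dim_head: int = 64
    dropout: float = 0.0

@dataclass(frozen=True)
class FeedForwardConfig:
    # Hypothetical feedforward-related options.
    glu: bool = False
    glu_mult_bias: bool = False
    dropout: float = 0.0

@dataclass(frozen=True)
class LayerConfig:
    # Top-level config: only a few required fields, the rest nested.
    dim: int
    depth: int
    attn: AttentionConfig = field(default_factory=AttentionConfig)
    ff: FeedForwardConfig = field(default_factory=FeedForwardConfig)

# Only the options that differ from the defaults need to be spelled out.
cfg = LayerConfig(
    dim=512,
    depth=6,
    ff=FeedForwardConfig(glu=True, glu_mult_bias=True),
)

# Frozen configs are changed by copying, which keeps experiment variants explicit.
wide_cfg = replace(cfg, attn=replace(cfg.attn, heads=16))
```

The trade-off is an extra level of nesting at the call site in exchange for a shorter top-level signature.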

lucidrains · 2024-10-30T13:44:13Z

@TimS-ml it actually won't go up as fast as you think

very few techniques made it, and if anything, i will probably start removing certain ideas in coming releases

re: citation, yes, let us cite that if you can find it
