Add a docstring for VecUnroll #72
Please feel free to edit at will!
Codecov Report
@@            Coverage Diff             @@
##           master      #72      +/-   ##
==========================================
- Coverage   56.33%   56.30%   -0.04%
==========================================
  Files          33       33
  Lines        5742     5742
==========================================
- Hits         3235     3233       -2
- Misses       2507     2509       +2
The other advantage is that it allows interleaving instructions, which can take better advantage of a CPU's out-of-order (OOO) execution:
julia> using VectorizationBase, SLEEFPirates, BenchmarkTools
julia> vx = Vec(ntuple(_ -> randn(), pick_vector_width(Float64))...)
Vec{8, Float64}<-0.41777997881994483, -0.2778225158869686, 1.0569948436617753, 0.5176464757147176, 0.2323833794303069, -0.7993648787803678, -0.9685718051710357, 0.12727761225970358>
julia> vxu = VecUnroll((vx,vx,vx,vx));
julia> @btime exp2($(Ref(vx))[])
4.348 ns (0 allocations: 0 bytes)
Vec{8, Float64}<0.7485756477631421, 0.8248350157271293, 2.0805930967801856, 1.431617888324136, 1.174774112297017, 0.5746020803268009, 0.5110116881622035, 1.0922306992794324>
julia> @btime exp2($(Ref(vxu))[])
9.461 ns (0 allocations: 0 bytes)
4 x Vec{8, Float64}
Vec{8, Float64}<0.7485756477631421, 0.8248350157271293, 2.0805930967801856, 1.431617888324136, 1.174774112297017, 0.5746020803268009, 0.5110116881622035, 1.0922306992794324>
Vec{8, Float64}<0.7485756477631421, 0.8248350157271293, 2.0805930967801856, 1.431617888324136, 1.174774112297017, 0.5746020803268009, 0.5110116881622035, 1.0922306992794324>
Vec{8, Float64}<0.7485756477631421, 0.8248350157271293, 2.0805930967801856, 1.431617888324136, 1.174774112297017, 0.5746020803268009, 0.5110116881622035, 1.0922306992794324>
Vec{8, Float64}<0.7485756477631421, 0.8248350157271293, 2.0805930967801856, 1.431617888324136, 1.174774112297017, 0.5746020803268009, 0.5110116881622035, 1.0922306992794324>
julia> @btime sin($(Ref(vx))[])
8.536 ns (0 allocations: 0 bytes)
Vec{8, Float64}<-0.40573237311580634, -0.2742623121079716, 0.8708824084181056, 0.49483633008673605, 0.2302974902834125, -0.7169134530311316, -0.8240775144694545, 0.1269342496262836>
julia> @btime sin($(Ref(vxu))[])
21.154 ns (0 allocations: 0 bytes)
4 x Vec{8, Float64}
Vec{8, Float64}<-0.40573237311580634, -0.2742623121079716, 0.8708824084181056, 0.49483633008673605, 0.2302974902834125, -0.7169134530311316, -0.8240775144694545, 0.1269342496262836>
Vec{8, Float64}<-0.40573237311580634, -0.2742623121079716, 0.8708824084181056, 0.49483633008673605, 0.2302974902834125, -0.7169134530311316, -0.8240775144694545, 0.1269342496262836>
Vec{8, Float64}<-0.40573237311580634, -0.2742623121079716, 0.8708824084181056, 0.49483633008673605, 0.2302974902834125, -0.7169134530311316, -0.8240775144694545, 0.1269342496262836>
Vec{8, Float64}<-0.40573237311580634, -0.2742623121079716, 0.8708824084181056, 0.49483633008673605, 0.2302974902834125, -0.7169134530311316, -0.8240775144694545, 0.1269342496262836>
In these examples, 4x the work takes less than 2.5x the time.
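As a quick check of that claim, the ratios can be worked out directly from the `@btime` figures reported above. These are single timings from one machine, so the exact values are only indicative:

```latex
\begin{align*}
\texttt{exp2}:\quad & \frac{9.461\ \text{ns}}{4.348\ \text{ns}} \approx 2.18,
  & \frac{9.461\ \text{ns}}{4} &\approx 2.37\ \text{ns per vector} \\
\texttt{sin}:\quad & \frac{21.154\ \text{ns}}{8.536\ \text{ns}} \approx 2.48,
  & \frac{21.154\ \text{ns}}{4} &\approx 5.29\ \text{ns per vector}
\end{align*}
```

So the per-vector cost drops from 4.348 ns to about 2.37 ns for `exp2`, and from 8.536 ns to about 5.29 ns for `sin`, when four vectors are processed together in a `VecUnroll`.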
Do you want me to change the docstring in some way? I do use the word
The test that failed is
I agree, that'd be long.
It should be usable wherever other
Closes #71