Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for CUDA (New Subpackage) #156

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ChantalJuntao
Copy link

While working on improving the speed of my image registration analysis, I've noticed that ImageTransformations doesn't currently support compatibility for GPU computing using CuArrays. Therefore, I've made a first attempt at a GPU version of ImageTransformations.warp that supports CuArrays that can be wrapped in Offset Arrays and defaults to linear b-spline interpolation.

I'm still working on trying to get this to use the actual Interpolations package to support more functionality, and my test suite is still quite incomplete. I'm also not sure if this will work on grayscale or colored images yet. Any help or feedback would be extremely appreciated.

@codecov
Copy link

codecov bot commented Jun 7, 2022

Codecov Report

Merging #156 (c565dfe) into master (631d63b) will decrease coverage by 0.18%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #156      +/-   ##
==========================================
- Coverage   90.00%   89.81%   -0.19%     
==========================================
  Files           8        8              
  Lines         220      216       -4     
==========================================
- Hits          198      194       -4     
  Misses         22       22              
Impacted Files Coverage Δ
src/autorange.jl 93.18% <0.00%> (-0.16%) ⬇️
src/warp.jl 96.29% <0.00%> (-0.14%) ⬇️
src/resizing.jl 100.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 631d63b...c565dfe. Read the comment docs.

Copy link
Member

@johnnychen94 johnnychen94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's better to leave this to @timholy (I guess you're in the same lab?) I'll just leave some general comments here.

On the code design. Because it's very likely that we wouldn't have CUDA support for all methods for a long period (will we?), I think a better way to design
the code is to opt-in the CUDA support when we know the interpolation method has CUDA support and otherwise fallback to the CPU. We can have a private cuwarp! function and then call cuwarp! from warp! when certain condition meets (i.e., when we do support CUDA).

On performance, I'd be very interested if this matches or beats the NNLibCUDA performance https://github.com/FluxML/NNlibCUDA.jl/blob/da73f077371a5f85660f6e7620099d74ddd11b6e/src/upsample.jl#L152-L194

Comment on lines +18 to +21
img_inds = map(out->out.I, CartesianIndices(axes(out.parent)))
tform_offset = offset_calc(out, in_offsets, tform)

tformindex = CuArray(tform_offset.(SVector.(img_inds)))
Copy link
Member

@johnnychen94 johnnychen94 Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This calculates all image indexes in the single thread CPU and converts them to GPU. I believe this is unnecessarily inefficient -- I think we need to figure out how to directly generate the target indexes on GPU. Maybe manually write a small CUDA kernel for this.

@timholy
Copy link
Member

timholy commented Jun 16, 2022

FWIW we agree this needs additional work but is not high priority, as it turns out that warping is not a real bottleneck in the analysis we were planning to optimize. To be continued later. In the meantime, let's leave this open in case it inspires someone else.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants