-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add im_from_matlab
and im_to_matlab
#179
base: master
Are you sure you want to change the base?
Conversation
src/matlab.jl
Outdated
function _im_from_matlab(::Type{CT}, X::AbstractArray{T}) where {CT<:Colorant, T<:Real} | ||
_CT = isconcretetype(CT) ? CT : base_colorant_type(CT){T} | ||
# FIXME(johnnychen94): not type inferrable here | ||
return StructArray{_CT}(X; dims=3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of colorview(RGB, PermutedDimsArray(X, (3, 1, 2)))
which losses the information that memory layout is struct of array, this PR introduces StructArrays
so that other functions have the potential to provide an optimized implementation for this memory layout.
Codecov Report
@@ Coverage Diff @@
## master #179 +/- ##
==========================================
- Coverage 62.60% 61.39% -1.21%
==========================================
Files 10 11 +1
Lines 583 658 +75
==========================================
+ Hits 365 404 +39
- Misses 218 254 +36
Continue to review full report at Codecov.
|
638a296
to
68e46bc
Compare
TODO: support Matlab indexed image format
|
TODO:
|
@timholy I notice you've self-assigned a review. This PR is basically ready for review, except that I still need to 1) revisit and improve the docstring, and 2) rebase the commits. In a summary, this PR:
The loading time, however, increases from 0.58 seconds to 0.98 seconds even in the very decent i9-12900K CPU, among the time This PR originates from the need to provide API compatibility to MATLAB's toolbox. When converting from MATLAB layout, because we don't have colorspace information, we have to guess from the input:
when converting to MATLAB layout:
As an example, here's the benchmark between various versions of the trivial
Conclusions:
Size (256, 256)
Size (512, 512)
Size (1024, 1024)
Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: 12th Gen Intel(R) Core(TM) i9-12900K
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-12.0.1 (ORCJIT, goldmont) scriptsusing ImageCore, MappedArrays, BenchmarkTools
rgb2gray(I) = im_to_matlab(of_eltype(Gray, im_from_matlab(I)))
function rgb2gray_direct(I)
@. @views 0.2989 * I[:, :, 1] + 0.5870 * I[:, :, 2] + 0.1140 * I[:, :, 3]
end
# size 256×256
# MATLAB baseline: 23.170 μs
I = rand(256, 256, 3)
img = collect(im_from_matlab(I))
@btime Gray.($img); # 32.042 μs (2 allocations: 512.05 KiB)
@btime collect(rgb2gray($I)); # 38.963 μs (13 allocations: 512.83 KiB)
@btime rgb2gray_direct($I); # 33.386 μs (2 allocations: 512.05 KiB)
# size 512×512
# MATLAB baseline: 546.576 μs
I = rand(512, 512, 3)
img = collect(im_from_matlab(I))
@btime Gray.($img); # 143.831 μs (2 allocations: 2.00 MiB)
@btime collect(rgb2gray($I)); # 155.726 μs (13 allocations: 2.00 MiB)
@btime rgb2gray_direct($I); # 151.395 μs (2 allocations: 2.00 MiB)
# size 1024×1024
# MATLAB baseline: 1320 μs
I = rand(1024, 1024, 3)
img = collect(im_from_matlab(I))
@btime Gray.($img); # 982.269 μs (2 allocations: 8.00 MiB)
@btime collect(rgb2gray($I)); # 921.979 μs (13 allocations: 8.00 MiB)
@btime rgb2gray_direct($I); # 776.658 μs (2 allocations: 8.00 MiB) x = rand(256, 256, 3);
f = @() loop100(x);
timeit(f) * 1e4
x = rand(512, 512, 3);
f = @() loop100(x);
timeit(f) * 1e4
x = rand(1024, 1024, 3);
f = @() loop100(x);
timeit(f) * 1e4
function loop100(x)
for i = 1:100
rgb2gray(x);
end
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is an important addition and I'm very glad you've done it!
I think my main comment is that there may be many sources of "raw data" images that are not necessarily from Matlab. Consequently I wonder if we should (1) keep matlab out of the name, and (2) do we need to think about more general arrays? For instance, what about a 4D array that is an RGB movie? Is that (h, w, 3, t)
in size, or is it (h, w, t, 3)
in size? We may want to support both.
The other delicate issue is reinterpretation of UInt8 -> N0f8
. I agree this is what most people will want, but see below for some nuances.
# - `colorview(RGB, PermutedDimsArray(img, (3, 1, 2)))` | ||
# - `StructArray{RGB{eltype(img)}}(img; dims=3)` | ||
# Using `StructArray` preserves the information that original data is stored as SOA | ||
# layout, while `ReinterpretArray` cannot. For newer Julia versions, we interpret it as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# layout, while `ReinterpretArray` cannot. For newer Julia versions, we interpret it as | |
# layout, while `ReinterpretArray` (as created by `colorview`) cannot. For newer Julia versions, we interpret it as |
# Using `StructArray` preserves the information that original data is stored as SOA | ||
# layout, while `ReinterpretArray` cannot. For newer Julia versions, we interpret it as | ||
# `StructArray` and thus provides room for operator optimization, e.g., `imfilter` on | ||
# SOA layout can be implemented much easier and efficiently. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# SOA layout can be implemented much easier and efficiently. | |
# SOA layout can be implemented more easily and efficiently. |
throw(ArgumentError(msg)) | ||
end | ||
end | ||
im_from_matlab(::Type{CT}, X::AbstractArray{UInt8}) where {CT} = _im_from_matlab(CT, reinterpret(N0f8, X)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we omit matlab
from the name of this function and call it something like colorview(RGB, YXC(X))
, should we still require that calling normview(X)
be a separate manual step? I agree that most users will want & expect UInt8
to become N0f8
, but I imagine a small number of users will actually want/mean Bool
(for binary images, i.e., all pixel intensities are 0 or 1 but stored with UInt8 storage).
Maybe one option is to use colorview(RGB{N0f8}, YXC(X))
to specify the re-interpretation, and make that the default for colorview(RGB, YXC(X))
, allowing users to specify colorview(RGB{Bool}, YXC(X))
when needed.
function im_from_matlab(::Type{CT}, X::AbstractArray{Int16}) where {CT} | ||
# MALTAB compat | ||
_im2double(x) = (Float64(x) + Float64(32768)) / Float64(65535) | ||
return _im_from_matlab(CT, mappedarray(_im2double, X)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will make the image read-only, unless you also supply the reverse transformation.
function _matlab_type_hint(@nospecialize X) | ||
mn, mx = extrema(X) | ||
if mn >= typemin(UInt8) && mx <= typemax(UInt8) | ||
return "For instance: `UInt8.(X)` or `X./$(typemax(UInt8))`" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should also check isinteger(mn) && isinteger(mx)
or even all(isinteger, X)
before recommending UInt8.(X)
? (isinteger
checks the value, not the type, so it may be appropriate for an image that happens to be encoded with Float64
but for which all values are actually integers.)
I'm willing to pay this price. This seems like functionality that's too important to omit. We may find some load-time optimizations later, but the bottom line is that I would only balk if this made it something like 5s of load time rather than a "mere" doubling. |
This PR needs a rework in a new PR, but here are a few more edge cases that needs to update:
|
I have read through the StructArrays docs, PR changes and review comments and have a basic idea of what we are generally trying to do, should I continue on this here only or go further in another PR? |
Naming: my preferences are to call this something like |
There's also "N", as in:
I guess N could also be known as T (for time)? |
In relation to the discussions regarding ImageAnnotations on Slack etc., - and handling the memory format of an input image, in relation to the expected memory format of the input for some neural network: I was thinking along the lines of using AxisArrays, or just tuples of symbols, to communicate the meaning of the axes. Could this be used here instead of having separate HWC methods etc.? I guess the reason for separate methods is being easier on the compiler (not using a lot of Context (copied from stemann/ONNXRuntimeImageAnnotators.jl#1):
|
It's nice that this sort of naming makes the expected memory format explicit - so a user doesn't try to call a magic interpret function that ends up making the wrong call regarding the dimension order in an ambiguous case. But I'm also slightly worried about ambiguities... I guess "D"/depth was meant as the depth component of e.g. an RGB-D camera? Types of raw images I can think of - with a focus on raw camera output (using column-major notation, and calling the first spatial dimension W):
3D:
4D:
|
There are two separate issues:
The second issue is of huge importance: it's what has allowed us to use generic algorithms to provide wide coverage for so many operations for gray/color and multidimensional images. However, it does not necessarily follow that all code has to use this abstraction: it's perfectly fine for some methods to split out into color channels and do specialized things. In other words, what this PR is about is bridging these two worlds: providing a view consistent with no color channel while preserving memory layout favorable to algorithms that can be more efficiently written where color is a separate (and not the fastest) dimension of the array.
I was thinking of biomedical imaging techniques which are often natively 3D. I would distinguish an extra value D associated with a single X,Y position from an array where there is a D (or Z) axis to the array and intensity information available at every point. But I think N makes lots of sense too as a generic marker. For example in a convolutional network, the N outputs of a single layer. |
Another way to think about this core abstraction is to remember that, mathematically, an image is just a discretized version of a function mapping ℝᵐ to ℝⁿ. The In JuliaImages, the default representation is via an So the core abstraction in JuliaImages is to make sure that the default representation doesn't mix these things up. But that's independent of memory layout. |
Right! I also realised - post-commenting - that the the raw images (with channels), Bayer/bilinear/trilinear colour, and hyperspectral, images was off point (and the bilinear stuff was also wrong, I believe: One of the colour lines is inter-leaved red/blue while the other is all green - normally).
In this regard you are saying...?
Ah yes. I guess it would work just fine as well with an RGB-D camera image like from a Kinect - once the typical
I was more thinking of "N" as the number of images in a batch (or in a sequence/video) fed as input to a convolutional network... (as when the PyTorch world talks "NCHW"/"NHWC") 🤔 |
Good point. That's a case that's not quite as nicely solved by JuliaImages' abstractions. Or rather, I guess it is, but IMO for the clearest case But again, this representation is independent of memory layout. Suppose you have a NCHW array; this representation could be constructed something like (note: untested) using StructArrays, ArraysOfArrays
bigarray = rand(T, n, c, h, w)
intermediatearray = StructArray(RGB{T}; dims=2)
nicearray = nestedview(intermediatearray, 1) and all |
In abstract type AbstractStorageOrder{<:Colorant} end
abstract type HWC{T} <: AbstractStorageOrder{T} end with a generic function (with some* name):
*: I am still quite fond of IIUC, the |
I'm not even sure we need |
True, it might be an idea to not include "N" in that case - for this PR. But perhaps consider "T" and "D" instead - common combinations of HWC etc. with T and D.
Right! Nice example! :-)
Right. This echoes my thoughts on the "input image in one format" (raw array format) and "input for neural network in another format" (raw array format). We'll see if it will be a stumbling block sooner or later... in relation to stemann/ONNXRuntimeImageAnnotators.jl#1 But with something like the example above we can probably get far (enough). |
Yes, in that case I think |
Something like interpret/view would be nice. Layout is a good word, yes. Maybe |
Right. One associated thought - though I know a lot of algorithms in JuliaImages would not apply: If I just have an ... but one could also have used the notation L (for lambda) instead of C in this case - to not encourage trying to interpret C > 3 (or much higher than 3) as the normal |
That's a great example! You're right that wavelength-based representations muddy the waters; even RGB wanders into this territory more than, say, |
I mean, the WCH-image (or WLH with L for lambda) is an intensity-image, with values between 0 and 1 - and for each In the use case I have encountered L was 256 - but it could be any value >= 1 (just wouldn't be as hyper ... spectral). Might be related to what you hinted at in #170 (comment), though I can't quite grasp what you meant there... ("the |
I just meant that you don't have to stick with the types already defined in ColorTypes, it was designed to make it fairly easy to add new types. |
Some wrapper types would help, but I think what is actually needed here is an interface or protocol that can be used to describe the axes of "image" data. Note that this is a distinct but related problem to that being addressed by AxisArrays.jl and related packages. Rather than describing intrinsic properties of array data, here we need to describe an extrinsic property of how we want to interpret those axes. It is related to the AxisArrays.jl issue in that there may some natural default mappings between intrinsic axes to the image space. What I'm hearing here are two or three classes of axes.
|
I think most suites bundle alpha and color into a single axis, e.g., RGBA. We do that too, as an RGBA type. |
If we are talking about 2D, I agree. In 3D for volumetric or voxel rendering the situation may get more complicated. |
Sounds right - at least as some follow-up from this PR. To me this reads along the lines of what @timholy hinted at with “We probably need tools…” in #179 (comment) ?
Can’t quite follow the reasoning here. How is intrinsic different from extrinsic in this regard? (and then I didn’t get the rest either :-) ) |
Perhaps the rows and columns of a Now we want to display this as an image. So we map temperature to "X" or the horizontal axis and pressure to "Y" to the vertical axes. "X" and "Y" are extrinsic properties that we impose on the array based on how we want to interpret the data as an image. |
This provides a more convenient MATLAB interface compared to the composition of
PermutedDimsArray
,colorview
, etc (CRef #178)Although our naming guidelines recommend not adding
im_
prefix, I feelfrom_matlab
is somehow too generic a name, especially when the input is a raw numerical array. For this reason I use the long verbose nameim_from_matlab
, users could add their own alias if they want, e.g.,const im2jl = im_from_matlab
closed #170