Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add im_from_matlab and im_to_matlab #179

Open
wants to merge 12 commits into
base: master
Choose a base branch
from
Open

add im_from_matlab and im_to_matlab #179

wants to merge 12 commits into from

Conversation

johnnychen94
Copy link
Member

@johnnychen94 johnnychen94 commented Apr 11, 2022

This provides a more convenient MATLAB interface compared to the composition of PermutedDimsArray, colorview, etc (CRef #178)

Although our naming guidelines recommend not adding im_ prefix, I feel from_matlab is somehow too generic a name, especially when the input is a raw numerical array. For this reason I use the long verbose name im_from_matlab, users could add their own alias if they want, e.g., const im2jl = im_from_matlab

closed #170

src/matlab.jl Outdated
function _im_from_matlab(::Type{CT}, X::AbstractArray{T}) where {CT<:Colorant, T<:Real}
_CT = isconcretetype(CT) ? CT : base_colorant_type(CT){T}
# FIXME(johnnychen94): not type inferrable here
return StructArray{_CT}(X; dims=3)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of colorview(RGB, PermutedDimsArray(X, (3, 1, 2))) which losses the information that memory layout is struct of array, this PR introduces StructArrays so that other functions have the potential to provide an optimized implementation for this memory layout.

@codecov
Copy link

codecov bot commented Apr 11, 2022

Codecov Report

Merging #179 (251fb2d) into master (9e64fec) will decrease coverage by 1.20%.
The diff coverage is 85.33%.

@@            Coverage Diff             @@
##           master     #179      +/-   ##
==========================================
- Coverage   62.60%   61.39%   -1.21%     
==========================================
  Files          10       11       +1     
  Lines         583      658      +75     
==========================================
+ Hits          365      404      +39     
- Misses        218      254      +36     
Impacted Files Coverage Δ
src/ImageCore.jl 72.41% <ø> (ø)
src/matlab.jl 85.33% <85.33%> (ø)
src/convert_reinterpret.jl 26.66% <0.00%> (-33.34%) ⬇️
src/show.jl 50.00% <0.00%> (-20.84%) ⬇️
src/colorchannels.jl 47.18% <0.00%> (-3.53%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9e64fec...251fb2d. Read the comment docs.

@johnnychen94 johnnychen94 force-pushed the jc/matlab branch 2 times, most recently from 638a296 to 68e46bc Compare April 11, 2022 10:38
@johnnychen94 johnnychen94 marked this pull request as ready for review April 11, 2022 10:38
@johnnychen94
Copy link
Member Author

johnnychen94 commented Apr 14, 2022

TODO: support Matlab indexed image format

  • im_from_matlab(index, values)
  • im_to_matlab(::IndirectArray)

@johnnychen94
Copy link
Member Author

johnnychen94 commented Apr 15, 2022

TODO:

  • im_to_matlab(UInt8, n0f8_data)
  • im_to_matlab(img) with offseted image.

@timholy timholy self-requested a review April 16, 2022 10:24
@johnnychen94
Copy link
Member Author

johnnychen94 commented Apr 17, 2022

@timholy I notice you've self-assigned a review. This PR is basically ready for review, except that I still need to 1) revisit and improve the docstring, and 2) rebase the commits.

In a summary, this PR:

  • provides im_to_matlab and im_from_matlab to support lazy conversion from/to matlab's layout
  • introduce StructArray to wrap the struct of array layout
  • introduce IndirectArray to represent the indexed image

The loading time, however, increases from 0.58 seconds to 0.98 seconds even in the very decent i9-12900K CPU, among the time StructArray contributes 0.32 seconds. This is not ideal, what's your thought?


This PR originates from the need to provide API compatibility to MATLAB's toolbox. im_to_matlab and im_from_matlab are something so basic that I think worth living in ImageCore.
As far as I know, forking the API and layout is no longer an issue given that Google has won the java API lawsuit. Thus I mimic the MATLAB's image layout convention:

When converting from MATLAB layout, because we don't have colorspace information, we have to guess from the input:

  • RGB image is a 3-dimensional numerical array with the color channel in the last dimension
  • Gray image is a 2-dimensional numerical array
  • other colorspace needs explicit conversion of type annotation

when converting to MATLAB layout:

  • similar to the above except that it supports arbitrary dimension...
  • another colorspace will be converted to RGB first
  • forcing 1-based indexing. JuliaImages supports arbitrary indexing offsets but MATLAB layout typically uses 1-based indexing. Again, this is because I want to make this MATLAB-tailored tool as convenient as possible.

As an example, here's the benchmark between various versions of the trivial rgb2gray implementation on different sizes:

  • direct (AoS): Gray.(img) where img = rand(RGB{Float64}, sz)
  • direct (SoA): @. @views 0.2989*I[:,:,1]+0.5870*I[:,:,2]+0.1140*I[:,:,3]
  • indirect (SoA): im_to_matlab(of_eltype(Gray, im_from_matlab(I)))
  • MATLAB (SoA): rgb2gray

Conclusions:

  • It's interesting to see that "indirect (SoA)" method is sometimes even faster than the "direct (AoS)" method.
  • The performance overhead of im_to_matlab/im_from_matlab is relatively small.
  • Except for the small image case, we're better than MATLAB.

Size (256, 256)

version time
direct (AoS) 32.042 μs
direct (SoA) 33.386 μs
indirect (SoA) 38.963 μs
MATLAB (SoA) 23.170 μs

Size (512, 512)

version time
direct (AoS) 143.831 μs
direct (SoA) 151.395 μs
indirect (SoA) 155.726 μs
MATLAB (SoA) 546.576 μs

Size (1024, 1024)

version time
direct (AoS) 982.269 μs
direct (SoA) 776.658 μs
indirect (SoA) 921.979 μs
MATLAB (SoA) 1320 μs
Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: 12th Gen Intel(R) Core(TM) i9-12900K
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, goldmont)
scripts
using ImageCore, MappedArrays, BenchmarkTools

rgb2gray(I) = im_to_matlab(of_eltype(Gray, im_from_matlab(I)))
function rgb2gray_direct(I)
    @. @views 0.2989 * I[:, :, 1] + 0.5870 * I[:, :, 2] + 0.1140 * I[:, :, 3]
end

# size 256×256
# MATLAB baseline: 23.170 μs
I = rand(256, 256, 3)
img = collect(im_from_matlab(I))
@btime Gray.($img); # 32.042 μs (2 allocations: 512.05 KiB)
@btime collect(rgb2gray($I)); # 38.963 μs (13 allocations: 512.83 KiB)
@btime rgb2gray_direct($I); # 33.386 μs (2 allocations: 512.05 KiB)

# size 512×512
# MATLAB baseline: 546.576 μs
I = rand(512, 512, 3)
img = collect(im_from_matlab(I))
@btime Gray.($img); # 143.831 μs (2 allocations: 2.00 MiB)
@btime collect(rgb2gray($I)); # 155.726 μs (13 allocations: 2.00 MiB)
@btime rgb2gray_direct($I); # 151.395 μs (2 allocations: 2.00 MiB)

# size 1024×1024
# MATLAB baseline: 1320 μs
I = rand(1024, 1024, 3)
img = collect(im_from_matlab(I))
@btime Gray.($img); # 982.269 μs (2 allocations: 8.00 MiB)
@btime collect(rgb2gray($I)); # 921.979 μs (13 allocations: 8.00 MiB)
@btime rgb2gray_direct($I); # 776.658 μs (2 allocations: 8.00 MiB)
x = rand(256, 256, 3);
f = @() loop100(x);
timeit(f) * 1e4

x = rand(512, 512, 3);
f = @() loop100(x);
timeit(f) * 1e4

x = rand(1024, 1024, 3);
f = @() loop100(x);
timeit(f) * 1e4


function loop100(x)
    for i = 1:100
        rgb2gray(x);
    end
end

Copy link
Member

@timholy timholy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is an important addition and I'm very glad you've done it!

I think my main comment is that there may be many sources of "raw data" images that are not necessarily from Matlab. Consequently I wonder if we should (1) keep matlab out of the name, and (2) do we need to think about more general arrays? For instance, what about a 4D array that is an RGB movie? Is that (h, w, 3, t) in size, or is it (h, w, t, 3) in size? We may want to support both.

The other delicate issue is reinterpretation of UInt8 -> N0f8. I agree this is what most people will want, but see below for some nuances.

# - `colorview(RGB, PermutedDimsArray(img, (3, 1, 2)))`
# - `StructArray{RGB{eltype(img)}}(img; dims=3)`
# Using `StructArray` preserves the information that original data is stored as SOA
# layout, while `ReinterpretArray` cannot. For newer Julia versions, we interpret it as
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# layout, while `ReinterpretArray` cannot. For newer Julia versions, we interpret it as
# layout, while `ReinterpretArray` (as created by `colorview`) cannot. For newer Julia versions, we interpret it as

# Using `StructArray` preserves the information that original data is stored as SOA
# layout, while `ReinterpretArray` cannot. For newer Julia versions, we interpret it as
# `StructArray` and thus provides room for operator optimization, e.g., `imfilter` on
# SOA layout can be implemented much easier and efficiently.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# SOA layout can be implemented much easier and efficiently.
# SOA layout can be implemented more easily and efficiently.

throw(ArgumentError(msg))
end
end
im_from_matlab(::Type{CT}, X::AbstractArray{UInt8}) where {CT} = _im_from_matlab(CT, reinterpret(N0f8, X))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we omit matlab from the name of this function and call it something like colorview(RGB, YXC(X)), should we still require that calling normview(X) be a separate manual step? I agree that most users will want & expect UInt8 to become N0f8, but I imagine a small number of users will actually want/mean Bool (for binary images, i.e., all pixel intensities are 0 or 1 but stored with UInt8 storage).

Maybe one option is to use colorview(RGB{N0f8}, YXC(X)) to specify the re-interpretation, and make that the default for colorview(RGB, YXC(X)), allowing users to specify colorview(RGB{Bool}, YXC(X)) when needed.

function im_from_matlab(::Type{CT}, X::AbstractArray{Int16}) where {CT}
# MALTAB compat
_im2double(x) = (Float64(x) + Float64(32768)) / Float64(65535)
return _im_from_matlab(CT, mappedarray(_im2double, X))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will make the image read-only, unless you also supply the reverse transformation.

function _matlab_type_hint(@nospecialize X)
mn, mx = extrema(X)
if mn >= typemin(UInt8) && mx <= typemax(UInt8)
return "For instance: `UInt8.(X)` or `X./$(typemax(UInt8))`"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should also check isinteger(mn) && isinteger(mx) or even all(isinteger, X) before recommending UInt8.(X)? (isinteger checks the value, not the type, so it may be appropriate for an image that happens to be encoded with Float64 but for which all values are actually integers.)

@timholy
Copy link
Member

timholy commented Apr 22, 2022

The loading time, however, increases from 0.58 seconds to 0.98 seconds even in the very decent i9-12900K CPU, among the time StructArray contributes 0.32 seconds. This is not ideal, what's your thought?

I'm willing to pay this price. This seems like functionality that's too important to omit. We may find some load-time optimizations later, but the bottom line is that I would only balk if this made it something like 5s of load time rather than a "mere" doubling.

@johnnychen94
Copy link
Member Author

johnnychen94 commented Apr 26, 2022

This PR needs a rework in a new PR, but here are a few more edge cases that needs to update:

  • im_to_matlab(img_n0f8) should output array of eltype UInt8: currently it outputs N0f8 but it would make it harder to keep consistant with the MATLAB's convention
  • im_to_matlab(im_to_matlab(img)) currently doesn't work
  • indexed image: workaround the 1-based index or 0-based index difference
  • even though there is no grayscale indexed image, it's convenient to support im_from_matlab(index, values::AbstractVector) case.

@ashwanirathee
Copy link
Member

I have read through the StructArrays docs, PR changes and review comments and have a basic idea of what we are generally trying to do, should I continue on this here only or go further in another PR?

@timholy
Copy link
Member

timholy commented Aug 10, 2023

Naming: my preferences are to call this something like HWC{RGB}(rawimg) (for "height-width-color"). We could also have WHC, CHW, CWH, HWDC (D = "depth"), etc.

@stemann
Copy link

stemann commented Aug 13, 2023

Naming: my preferences are to call this something like HWC{RGB}(rawimg) (for "height-width-color"). We could also have WHC, CHW, CWH, HWDC (D = "depth"), etc.

There's also "N", as in:

I guess N could also be known as T (for time)?

@stemann
Copy link

stemann commented Aug 13, 2023

In relation to the discussions regarding ImageAnnotations on Slack etc., - and handling the memory format of an input image, in relation to the expected memory format of the input for some neural network: I was thinking along the lines of using AxisArrays, or just tuples of symbols, to communicate the meaning of the axes.

Could this be used here instead of having separate HWC methods etc.? I guess the reason for separate methods is being easier on the compiler (not using a lot of Val{Symbol}'s)?

Context (copied from stemann/ONNXRuntimeImageAnnotators.jl#1):

I had the following on my mind (just for the CWH case - disregarding CWHN for now):

annotate(image::AbstractArray{<:Colorant,2}, annotator)

could assume that the standard memory format of image is used and therefore dispatch to

annotate(image::AxisArray{<:AbstractArray{<:Colorant,2}, :w, :h}, annotator)

and then among the model metadata (in annotator) there could be the information that the neural network expects e.g. (:c, :w, :h, :n). Using strictly column-major notation. In this specific case the input, with format (:c, :w, :h) (after channelview), would not require conversion to match (:c, :w, :h, :n).

Examples of expected input for ONNX models for vision

  • ResNet: NCHW / WHCN
  • YOLOv3 and YOLOv4: 1x3x416x416 - so I guess that would be NCHW / WHCN (channels first/inter-leaved)
  • YOLOv7: I believe it is using NHWC / CWHN

@stemann
Copy link

stemann commented Aug 13, 2023

Naming: my preferences are to call this something like HWC{RGB}(rawimg) (for "height-width-color"). We could also have WHC, CHW, CWH, HWDC (D = "depth"), etc.

It's nice that this sort of naming makes the expected memory format explicit - so a user doesn't try to call a magic interpret function that ends up making the wrong call regarding the dimension order in an ambiguous case.

But I'm also slightly worried about ambiguities... I guess "D"/depth was meant as the depth component of e.g. an RGB-D camera?

Types of raw images I can think of - with a focus on raw camera output (using column-major notation, and calling the first spatial dimension W):
2D:

3D:

  • Sequence/batch of monochrome (gray) images or Bayer-mask colour images: WHN
  • RGB image: CWH
  • Multiple lines from a colour line scan camera (prior to conversion to RGB): WCH (with C=2 or C=3)
  • Multiple lines from a hyperspectral line scan camera: WCH

4D:

  • Sequence/batch of RGB images: CWHN
  • Sequence/batch from colour/hyperspectral line scan camera: WCHN

@timholy
Copy link
Member

timholy commented Aug 13, 2023

There are two separate issues:

  • disambiguating the meaning of axes: ImageAxes/AxisArrays (or whatever its successor turns out to be) is not only viable, but irreplaceable if you need to give meaning to certain axis types (like distinguishing spatial vs temporal axes).
  • constructing "views" that support one of JuliaImages' core abstractions: each array index corresponds to the full spatiotemporal information for a single pixel in a single image. That means there isn't a color channel in that view, because color channels break that abstraction.

The second issue is of huge importance: it's what has allowed us to use generic algorithms to provide wide coverage for so many operations for gray/color and multidimensional images. However, it does not necessarily follow that all code has to use this abstraction: it's perfectly fine for some methods to split out into color channels and do specialized things.

In other words, what this PR is about is bridging these two worlds: providing a view consistent with no color channel while preserving memory layout favorable to algorithms that can be more efficiently written where color is a separate (and not the fastest) dimension of the array.

I guess "D"/depth was meant as the depth component of e.g. an RGB-D camera?

I was thinking of biomedical imaging techniques which are often natively 3D. I would distinguish an extra value D associated with a single X,Y position from an array where there is a D (or Z) axis to the array and intensity information available at every point.

But I think N makes lots of sense too as a generic marker. For example in a convolutional network, the N outputs of a single layer.

@timholy
Copy link
Member

timholy commented Aug 13, 2023

Another way to think about this core abstraction is to remember that, mathematically, an image is just a discretized version of a function mapping ℝᵐ to ℝⁿ. The m corresponds to all the spatio-temporal axes, i.e., the domain over which you have data. The n corresponds to the amount of data you get at each one of these locations. So a RGBD camera maps ℝ² to ℝ⁴, because you get 4 pieces of information at each 2d point. Conversely, a movie as you might watch on TV or in a theater maps ℝ³ to ℝ³, because HWT gets mapped to RGB.

In JuliaImages, the default representation is via an m-dimensional array of n-component values. Most other image processing suites represent this instead as an m+n dimensional array, i.e., over the product space ℝᵐ⊗ℝⁿ. But this adds a lot of complexity to algorithms: for example, a mean-shift algorithm like in ImageSegmentation requires that you compute a total distance between two pixels, where totaldistance = spatialdistance + colordistance. spatialdistance gets computed from the difference in indices for the two spatial locations, whereas colordistance gets computed from the difference in values assigned to the two locations. This is trivial to implement if you represent the image as an m-dimensional array of n-component values, and becomes a lot harder if you have a single m+n-dimensional array which may have different meanings to the different axes. There are many other examples, including iterating over neighboring pixels, etc.

So the core abstraction in JuliaImages is to make sure that the default representation doesn't mix these things up. But that's independent of memory layout.

@stemann
Copy link

stemann commented Aug 13, 2023

There are two separate issues:

  1. constructing "views" that support one of JuliaImages' core abstractions: each array index corresponds to the full spatiotemporal information for a single pixel in a single image. That means there isn't a color channel in that view, because color channels break that abstraction.

The second issue is of huge importance: it's what has allowed us to use generic algorithms to provide wide coverage for so many operations for gray/color and multidimensional images. However, it does not necessarily follow that all code has to use this abstraction: it's perfectly fine for some methods to split out into color channels and do specialized things.

In other words, what this PR is about is bridging these two worlds: providing a view consistent with no color channel while preserving memory layout favorable to algorithms that can be more efficiently written where color is a separate (and not the fastest) dimension of the array.

Right! I also realised - post-commenting - that the the raw images (with channels), Bayer/bilinear/trilinear colour, and hyperspectral, images was off point (and the bilinear stuff was also wrong, I believe: One of the colour lines is inter-leaved red/blue while the other is all green - normally).

  1. disambiguating the meaning of axes: ImageAxes/AxisArrays (or whatever its successor turns out to be) is not only viable, but irreplaceable if you need to give meaning to certain axis types (like distinguishing spatial vs temporal axes).

In this regard you are saying...?

I guess "D"/depth was meant as the depth component of e.g. an RGB-D camera?

I was thinking of biomedical imaging techniques which are often natively 3D. I would distinguish an extra value D associated with a single X,Y position from an array where there is a D (or Z) axis to the array and intensity information available at every point.

Ah yes.

I guess it would work just fine as well with an RGB-D camera image like from a Kinect - once the typical (colour image, point cloud) output has been aligned/registered to e.g. enrich the colour image with depth data.

But I think N makes lots of sense too as a generic marker. For example in a convolutional network, the N outputs of a single layer.

I was more thinking of "N" as the number of images in a batch (or in a sequence/video) fed as input to a convolutional network... (as when the PyTorch world talks "NCHW"/"NHWC") 🤔

@timholy
Copy link
Member

timholy commented Aug 13, 2023

I was more thinking of "N" as the number of images in a batch (or in a sequence/video) fed as input to a convolutional network... (as when the PyTorch world talks "NCHW"/"NHWC")

Good point. That's a case that's not quite as nicely solved by JuliaImages' abstractions. Or rather, I guess it is, but IMO for the clearest case N would be an "outer" index and HW an "inner" index. I.e., images[k][i, j] would return the RGB value associated with spatial location i,j in the kth image. Having it be doubly-indexed clarifies the concept that the different images are not necessarily related to one another in the same manner as, say, a time series or XYZ image, for which the concept of "adjacent" in T or Z is genuinely meaningful.

But again, this representation is independent of memory layout. Suppose you have a NCHW array; this representation could be constructed something like (note: untested)

using StructArrays, ArraysOfArrays
bigarray = rand(T, n, c, h, w)
intermediatearray = StructArray(RGB{T}; dims=2)
nicearray = nestedview(intermediatearray, 1)

and all nicearray does is wrap bigarray with no additional memory allocation. But you might need special dispatch if you want to take advantage of this memory layout for efficient algorithms. We probably need tools that take possibly-nested array wrappers and return a fully-unwrapped raw array together with axis data indicating the meaning of each.

@stemann
Copy link

stemann commented Aug 13, 2023

In HWC{RGB} etc., HWC looks a bit like a type, e.g. a suggestion could be (maybe this was also your intent?):

abstract type AbstractStorageOrder{<:Colorant} end

abstract type HWC{T} <: AbstractStorageOrder{T} end

with a generic function (with some* name):

interpret_raw(T::Type{<:AbstractStorageOrder{<:Colorant}}, raw_img)

*: I am still quite fond of channelview(C, normedview(T, raw_img)), so it would be nice if these convenience versions for "complicated layouts" were not too different from channelview: IIUC, the StructArray-constructor would still yield a view of the raw data - in that case it should be apparent from the function name that no conversion/copying is done.

IIUC, the channelview and normedview would remain unchanged, for when a StructArray representation is not necessary.

@timholy
Copy link
Member

timholy commented Aug 13, 2023

I'm not even sure we need interpret_raw, although I guess you're right that my syntax HWC{RGB}(rawimg) seems to imply that there's an HWC type and one can construct HWC instances (rather than, e.g., creating a StructArray). So maybe interpret_raw (or interpret_layout?) is better.

@stemann
Copy link

stemann commented Aug 13, 2023

I was more thinking of "N" as the number of images in a batch (or in a sequence/video) fed as input to a convolutional network... (as when the PyTorch world talks "NCHW"/"NHWC")

Good point. That's a case that's not quite as nicely solved by JuliaImages' abstractions. Or rather, I guess it is, but IMO for the clearest case N would be an "outer" index and HW an "inner" index. I.e., images[k][i, j] would return the RGB value associated with spatial location i,j in the kth image. Having it be doubly-indexed clarifies the concept that the different images are not necessarily related to one another in the same manner that, say, a time series or XYZ image is.

True, it might be an idea to not include "N" in that case - for this PR.

But perhaps consider "T" and "D" instead - common combinations of HWC etc. with T and D.

But again, this representation is independent of memory layout. Suppose you have a NCHW array; this representation could be constructed something like (note: untested)

using StructArrays, ArraysOfArrays
bigarray = rand(T, n, c, h, w)
intermediatearray = StructArray(RGB{T}; dims=2)
nicearray = nestedview(intermediatearray, 1)

and all nicearray does is wrap bigarray with no additional memory allocation.

Right! Nice example! :-)

But you might need special dispatch if you want to take advantage of this memory layout for efficient algorithms. We probably need tools that take possibly-nested array wrappers and return a fully-unwrapped raw array together with axis data indicating the meaning of each.

Right. This echoes my thoughts on the "input image in one format" (raw array format) and "input for neural network in another format" (raw array format). We'll see if it will be a stumbling block sooner or later... in relation to stemann/ONNXRuntimeImageAnnotators.jl#1

But with something like the example above we can probably get far (enough).

@timholy
Copy link
Member

timholy commented Aug 13, 2023

But perhaps consider "T" and "D" instead - common combinations of HWC etc. with T and D.

Yes, in that case I think img[i, j, k] is strongly preferred. The "discrete representation of a function" core concept helps make such decisions: when that 3rd axis is Z or T, you can imagine that it would have been possible to sample more densely and acquire values at intermediate positions/times in Z or T. But with N, it's just a "bag of images" and there is no sense in which you could interpolate between them.

@stemann
Copy link

stemann commented Aug 13, 2023

I'm not even sure we need interpret_raw, although I guess you're right that my syntax HWC{RGB}(rawimg) seems to imply that there's an HWC type and one can construct HWC instances (rather than, e.g., creating a StructArray). So maybe interpret_raw (or interpret_layout?) is better.

Something like interpret/view would be nice. Layout is a good word, yes. Maybe layoutview? (with the top abstract type named something with Abstract and Layout - and maybe something with Image).

@stemann
Copy link

stemann commented Aug 13, 2023

But perhaps consider "T" and "D" instead - common combinations of HWC etc. with T and D.

Yes, in that case I think img[i, j, k] is strongly preferred. The "discrete representation of a function" core concept helps make such decisions: when that 3rd axis is Z or T, you can imagine that it would have been possible to sample more densely and acquire values at intermediate positions/times in Z or T. But with N, it's just a "bag of images" and there is no sense in which you could interpolate between them.

Right.

One associated thought - though I know a lot of algorithms in JuliaImages would not apply: If I just have an a = Array{UInt16, 3} from a 10-bit hyperspectral camera, then normedview(N4f10, a) will give me an image-like view with dimensions WCH where C is also a discretisation of a continuous space (wavelength/frequency)... (kind of getting "half of JuliaImages"... - the core abstraction(s) :-)

... but one could also have used the notation L (for lambda) instead of C in this case - to not encourage trying to interpret C > 3 (or much higher than 3) as the normal Colorants...

@timholy
Copy link
Member

timholy commented Aug 13, 2023

That's a great example! You're right that wavelength-based representations muddy the waters; even RGB wanders into this territory more than, say, LUV or similar where the different channels in no way represent a discretization.

@stemann
Copy link

stemann commented Aug 13, 2023

I mean, the WCH-image (or WLH with L for lambda) is an intensity-image, with values between 0 and 1 - and for each c in C, you have a Monochrome-image - like a stack of Gray-images - continuous over L.

In the use case I have encountered L was 256 - but it could be any value >= 1 (just wouldn't be as hyper ... spectral).

Might be related to what you hinted at in #170 (comment), though I can't quite grasp what you meant there... ("the @makealpha macro was missing from the example...?)

@timholy
Copy link
Member

timholy commented Aug 13, 2023

I just meant that you don't have to stick with the types already defined in ColorTypes, it was designed to make it fairly easy to add new types.

@mkitti
Copy link
Member

mkitti commented Aug 14, 2023

Some wrapper types would help, but I think what is actually needed here is an interface or protocol that can be used to describe the axes of "image" data.

Note that this is a distinct but related problem to that being addressed by AxisArrays.jl and related packages. Rather than describing intrinsic properties of array data, here we need to describe an extrinsic property of how we want to interpret those axes.

It is related to the AxisArrays.jl issue in that there may some natural default mappings between intrinsic axes to the image space.

What I'm hearing here are two or three classes of axes.

  1. Image space (not necessarily the physical spatial dimensions)
  2. Colorant
  3. Composition / Rendering (e.g. alpha transparency,

@timholy
Copy link
Member

timholy commented Aug 14, 2023

I think most suites bundle alpha and color into a single axis, e.g., RGBA. We do that too, as an RGBA type.

@mkitti
Copy link
Member

mkitti commented Aug 14, 2023

If we are talking about 2D, I agree. In 3D for volumetric or voxel rendering the situation may get more complicated.

@stemann
Copy link

stemann commented Aug 14, 2023

Some wrapper types would help, but I think what is actually needed here is an interface or protocol that can be used to describe the axes of "image" data.

Sounds right - at least as some follow-up from this PR. To me this reads along the lines of what @timholy hinted at with “We probably need tools…” in #179 (comment) ?

Note that this is a distinct but related problem to that being addressed by AxisArrays.jl and related packages. Rather than describing intrinsic properties of array data, here we need to describe an extrinsic property of how we want to interpret those axes.

Can’t quite follow the reasoning here. How is intrinsic different from extrinsic in this regard?

(and then I didn’t get the rest either :-) )

@mkitti
Copy link
Member

mkitti commented Aug 15, 2023

Can’t quite follow the reasoning here. How is intrinsic different from extrinsic in this regard?

Perhaps the rows and columns of a Matrix correspond to distinct temperatures and pressures. These are intrinsic to the array data.

Now we want to display this as an image. So we map temperature to "X" or the horizontal axis and pressure to "Y" to the vertical axes. "X" and "Y" are extrinsic properties that we impose on the array based on how we want to interpret the data as an image.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

introduce StructArrays
5 participants