downsample the image before clustering #71

johnnychen94 · 2022-04-28T07:34:39Z

For any real-world image, clustering the image pixel is a computational intensive big-data problem. For our very purpose to estimate the colormap, a downsampled
the version already works well.

using DitherPunk, TestImages, Clustering

img_gray = testimage("cameraman")
img_color = testimage("lighthouse")

@time dither(img_gray, FloydSteinberg(), 8)
# PR: 0.477710 seconds (30.36 M allocations: 559.455 MiB, 9.76% gc time)
# master: 0.693541 seconds (31.94 M allocations: 610.524 MiB, 7.57% gc time)

@time dither(img_color, FloydSteinberg(), 24)
# PR: 2.002484 seconds (120.32 M allocations: 1.937 GiB, 7.40% gc time)
# master: 8.085305 seconds (124.17 M allocations: 2.187 GiB, 2.26% gc time, 0.32% compilation time)

The results are not identical but the visual quality looks quite similar. I think this is a good improvement overall (not a breaking change).

codecov · 2022-04-28T07:39:38Z

Codecov Report

Merging #71 (c123651) into master (7969627) will increase coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #71      +/-   ##
==========================================
+ Coverage   98.44%   98.49%   +0.04%     
==========================================
  Files          14       14              
  Lines         257      265       +8     
==========================================
+ Hits          253      261       +8     
  Misses          4        4

Impacted Files	Coverage Δ
src/DitherPunk.jl	`100.00% <ø> (ø)`
src/clustering.jl	`100.00% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7969627...c123651. Read the comment docs.

This commit also adds ImageBase dependency; a drop-in replacement for ImageCore with a few handy operators.

Project.toml

adrhill · 2022-04-28T20:02:38Z

src/clustering.jl

+    # Clustering on the downsampled image already generates good enough colormap estimation
+    # This significantly reduces the algorithmic complexity.
+    img = _restrict_to(img, ncolors*100)
+    data = convert(Array{Float32}, reshape(channelview(Lab.(img)), 3, :))


Is there a reason for the added conversion?

For many big-data problems, it's often a good strategy to eagerly collect data into a contiguous memory layout so that future slicing on data[inds...] can be faster.
I added this before trying the downsample trick only to verify if this helps reduce the allocations. Here I don't observe any performance difference, I guess it's because Clustering.jl is doing a similar thing on their side.

I've reverted this in the new commit.

adrhill

Looks great, good idea!

johnnychen94 changed the base branch from master to jc/ct April 28, 2022 07:47

johnnychen94 force-pushed the jc/opt branch from 8171f68 to 14da941 Compare April 28, 2022 07:47

johnnychen94 changed the title ~~a few improvements to error diffusion~~ downsample the image before clustering Apr 28, 2022

johnnychen94 changed the base branch from jc/ct to master April 28, 2022 08:14

johnnychen94 closed this Apr 28, 2022

johnnychen94 reopened this Apr 28, 2022

downsample the image before clustering

2436a72

This commit also adds ImageBase dependency; a drop-in replacement for ImageCore with a few handy operators.

johnnychen94 force-pushed the jc/opt branch from 14da941 to 2436a72 Compare April 28, 2022 08:50

johnnychen94 commented Apr 28, 2022

View reviewed changes

Project.toml Show resolved Hide resolved

adrhill reviewed Apr 28, 2022

View reviewed changes

adrhill approved these changes Apr 28, 2022

View reviewed changes

revert unnecessary changes

c123651

johnnychen94 merged commit f81ee0a into master Apr 29, 2022

johnnychen94 deleted the jc/opt branch April 29, 2022 03:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

downsample the image before clustering #71

downsample the image before clustering #71

johnnychen94 commented Apr 28, 2022 •

edited

Loading

codecov bot commented Apr 28, 2022 •

edited

Loading

adrhill Apr 28, 2022

johnnychen94 Apr 29, 2022

adrhill left a comment

downsample the image before clustering #71

downsample the image before clustering #71

Conversation

johnnychen94 commented Apr 28, 2022 • edited Loading

codecov bot commented Apr 28, 2022 • edited Loading

Codecov Report

adrhill Apr 28, 2022

Choose a reason for hiding this comment

johnnychen94 Apr 29, 2022

Choose a reason for hiding this comment

adrhill left a comment

Choose a reason for hiding this comment

johnnychen94 commented Apr 28, 2022 •

edited

Loading

codecov bot commented Apr 28, 2022 •

edited

Loading