Consider rewriting pieces of the Crimisini algorithm for greater performance #15

juliohm · 2020-07-02T16:52:02Z

Some parts of the algorithm could possibly be rewritten to improve performance. Specifically, I think that these two lines where we are finding the frontier of the mask could be replaced by a simple graph representation as opposed to performing dilation + subtraction:

ImageInpainting.jl/src/criminisi.jl

Line 56 in d776c4b

δΩ = findall(dilate(ϕ) .& .!ϕ)

ImageInpainting.jl/src/criminisi.jl

Line 94 in d776c4b

δΩ = findall(dilate(ϕ) .& .!ϕ)

The dilation + subtraction are great to make the code very clean, but I guess too expensive compared to just doing a graph search on the pixels. We would have to benchmark. Just saving this idea here as an issue in case someone wants to give a try.

ashwanirathee · 2021-04-01T11:30:11Z

@juliohm I would like to do give this a try,tried this on example provided by @mgautam98

using ImageInpainting, TestImages, ImageCore

img  = Float64.(Gray.(testimage("lighthouse")));

Gray.(img)

# Create a mask of same size as of image
mask = falses(size(img));

# pixels where mask is set to true will be inpainted
mask[50:350,300:400] .= true;
Gray.(mask)

# inpaint take the parametes image, mask and algorithm.  
# We will be using Criminisi [1] algorithm
out = inpaint(img, mask, Criminisi(30,30));

In line 94,
δΩ = findall(dilate(ϕ) .& .!ϕ)
In this case,its run like 151 times..consuming 1.41sec in total and 0.009363451 on average
For others,we can see below for 1 round

while !isempty(δΩ)
    # update confidence values in frontier
    @time for p in δΩ
      c = selectpatch(C, patchsize, p)
      b = selectpatch(ϕ, patchsize, p)
      C[p] = sum(c[b]) / prod(patchsize)
    end  #0.009453 seconds (800 allocations: 1.513 MiB)

    # isophote map
    @time ∇I = pointgradients(I, δΩ) * R     # 0.000932 seconds (3.25 k allocations: 3.642 MiB)
    @time nϕ = pointgradients(ϕ, δΩ) #  0.001477 seconds (3.25 k allocations: 380.266 KiB)
    @time D  = vec(abs.(sum(∇I .* nϕ, dims=2)))
    @time D /= maximum(D)

    # select patch in frontier
    @time p  = δΩ[argmax(C[δΩ].*D)] # 0.000013 seconds (2 allocations: 9.625 KiB)
    @time ψₚ = selectpatch(I, patchsize, p)
    @time bₚ = selectpatch(ϕ, patchsize, p)
   
    # compute distance to all other patches
    @time Δ = convdist(I, ψₚ, bₚ) #  0.086936 seconds (357 allocations: 81.306 MiB, 13.39% gc time)


    # only consider patches in filled region
    @time Δ[mask] .= Inf

  0.000078 seconds (4 allocations: 237.719 KiB)

    # select best candidate
    @time q = argmin(Δ)  # 0.002975 seconds
    @time ψᵦ = selectpatch(I, patchsize, q)
    @time bᵦ = selectpatch(ϕ, patchsize, q)

    # paste patch and mark pixels as painted
    @time b = bᵦ .& .!bₚ
    @time ψₚ[b] .= ψᵦ[b]
    @time bₚ[b] .= true

    # update frontier
    @time δΩ = findall(dilate(ϕ) .& .!ϕ) #0.010699 seconds (15 allocations: 118.562 KiB)
    count =count+1
    break;
  end

I want to improve this to complete this:JuliaImages/juliaimages.github.io#138

juliohm · 2021-04-01T11:50:49Z

Thank you @ashwani-rathee , that would be great. Can you provide the time output with https://github.com/KristofferC/TimerOutputs.jl? That would give us a better and cleaner perspective on what needs to be improved the most.

babaq · 2021-09-03T18:31:14Z

it would be great if it runs faster. I tried with a 2080*2080 1 channel image, and it seems run forever, until I stopped after waiting around 10min. If I reduce the size to 500*500, it needs around 10min to finish. So for my case, it's not practically useful. I don't know how much it can improve by better implementations, or how good the Crimisini scales?

juliohm · 2021-09-03T19:11:26Z

Yes, if someone can prepare a PR with performance improvements I will be happy to review it. Currently too busy with other projects to work on it.

evetion · 2022-06-14T09:13:20Z

It seems that most time is spent in calculating the distance function for finding a relevant patch. If there are gains, it is in writing non-allocating versions, but even that won't speed up the already quick FFT by much I fear.

julia> @time out = inpaint(img, mask, Criminisi(30, 30));
 ────────────────────────────────────────────────────────────────────
                            Time                    Allocations
                   ───────────────────────   ────────────────────────
 Tot / % measured:      13.1s /  97.1%           12.6GiB /  99.5%

 Section   ncalls     time    %tot     avg     alloc    %tot      avg
 ────────────────────────────────────────────────────────────────────
 Δ            151    10.9s   85.9%  72.4ms   11.5GiB   92.0%  78.3MiB
 δΩ           152    799ms    6.3%  5.26ms   7.92MiB    0.1%  53.4KiB
 p            151    560ms    4.4%  3.71ms    433MiB    3.4%  2.87MiB
 nϕ           151    281ms    2.2%  1.86ms   48.4MiB    0.4%   328KiB
 ∇I           151    152ms    1.2%  1.00ms    542MiB    4.2%  3.59MiB
 ψᵦ           151    297μs    0.0%  1.97μs     0.00B    0.0%    0.00B
 bᵦ           151    264μs    0.0%  1.75μs     0.00B    0.0%    0.00B
 ────────────────────────────────────────────────────────────────────

juliohm · 2022-06-14T09:52:40Z

Yes, probably the best way to speedup this code is add a CUDA implementation of the distance function. Both ImageQuilting.jl and ImageInpainting.jl need this. Em ter., 14 de jun. de 2022 06:13, Maarten Pronk ***@***.***> escreveu:

…

It seems that most time is spent in calculating the distance function for finding a relevant patch. If there are gains, it is in writing non-allocating versions, but even that won't speed up the already quick FFT by much I fear. julia> @time out = inpaint(img, mask, Criminisi(30, 30)); ──────────────────────────────────────────────────────────────────── Time Allocations ─────────────────────── ──────────────────────── Tot / % measured: 13.1s / 97.1% 12.6GiB / 99.5% Section ncalls time %tot avg alloc %tot avg ──────────────────────────────────────────────────────────────────── Δ 151 10.9s 85.9% 72.4ms 11.5GiB 92.0% 78.3MiB δΩ 152 799ms 6.3% 5.26ms 7.92MiB 0.1% 53.4KiB p 151 560ms 4.4% 3.71ms 433MiB 3.4% 2.87MiB nϕ 151 281ms 2.2% 1.86ms 48.4MiB 0.4% 328KiB ∇I 151 152ms 1.2% 1.00ms 542MiB 4.2% 3.59MiB ψᵦ 151 297μs 0.0% 1.97μs 0.00B 0.0% 0.00B bᵦ 151 264μs 0.0% 1.75μs 0.00B 0.0% 0.00B ──────────────────────────────────────────────────────────────────── — Reply to this email directly, view it on GitHub <#15 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAZQW3I6PISD54XLJBWRJH3VPBELXANCNFSM4OPCFFNA> . You are receiving this because you were mentioned.Message ID: ***@***.***>

juliohm added enhancement New feature or request good first issue Good for newcomers labels Jul 2, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider rewriting pieces of the Crimisini algorithm for greater performance #15

Consider rewriting pieces of the Crimisini algorithm for greater performance #15

juliohm commented Jul 2, 2020

ashwanirathee commented Apr 1, 2021

juliohm commented Apr 1, 2021

babaq commented Sep 3, 2021 •

edited

Loading

juliohm commented Sep 3, 2021

evetion commented Jun 14, 2022

juliohm commented Jun 14, 2022 via email

Consider rewriting pieces of the Crimisini algorithm for greater performance #15

Consider rewriting pieces of the Crimisini algorithm for greater performance #15

Comments

juliohm commented Jul 2, 2020

ashwanirathee commented Apr 1, 2021

juliohm commented Apr 1, 2021

babaq commented Sep 3, 2021 • edited Loading

juliohm commented Sep 3, 2021

evetion commented Jun 14, 2022

juliohm commented Jun 14, 2022 via email

babaq commented Sep 3, 2021 •

edited

Loading