GPU mode not working #75

Open
jzrolling opened this issue Sep 5, 2024 · 9 comments
Labels
win11 Windows 11 specific


@jzrolling

Hey Erik,

I was using the pre-compiled Windows version (v0.4.3), and it works well, albeit slowly, with just the CPU. When I switched to GPU (I have an RTX4060 and drivers installed), it would wait for about 2 seconds before saving a log file to dst and finishing without exporting a deconvolved image. Here's a copy of the log; I couldn't spot an error in it. I also didn't notice any significant change in GPU usage. Am I missing anything here? Any input will be greatly appreciated, thanks!!

Best,

Junhao

P.S. Super nice work!!

2024-09-05T14:55:07

Settings:
image: C:\Users\Zlab\Desktop\test3.tif
psf: C:\Users\Zlab\Desktop\PSF.tiff
output: C:\Users\Zlab\Desktop\dwgpu_test3.tif
log file: C:\Users\Zlab\Desktop\dwgpu_test3.tif.log.txt
nIter: 5
nThreads for FFT: 14
nThreads for OMP: 14
verbosity: 1
background level: auto
method: Scaled Heavy Ball + OpenCL (SHBCL2)
metric: Idiv
Stopping after 5 iterations
overwrite: NO
tiling: OFF
XY crop factor: 0.001000
Offset: 5.000000
Output Format: 16 bit integer
Scaling: Automatic
Border Quality: 2 Minimal boundary artifacts
FFT lookahead: 0
FFTW3 plan: FFTW_MEASURE
Initial guess: Flat average
deconwolf: '0.4.3'
PWD: C:\Program Files (x86)\deconwolf
CMD: C:\Program Files (x86)\deconwolf\dw.exe --iter 5 --gpu --prefix dwgpu C:\Users\Zlab\Desktop\test3.tif C:\Users\Zlab\Desktop\PSF.tiff
BUILD_DATE: 'Jun 22 2024'
TIFF Backend: 'LIBTIFF, Version 4.6.0
Copyright (c) 1988-1996 Sam Leffler
Copyright (c) 1991-1996 Silicon Graphics, Inc.'
OpenMP: YES
OpenCL: YES
VkFFT: YES

@elgw
Owner

elgw commented Sep 6, 2024

Hello Junhao,

Thank you for reporting!

The log file looks normal, so this smells like a bug. Sorry about that.

It would be really useful to me if you could run with --verbose 2 and report back what you see in the terminal before it crashes. Please also check that dw does not use up all the RAM of the computer; the GPU version is more memory hungry than the CPU-only version.

I must warn you that I don't have access to any Windows machine at the moment (and can't test the GPU code path in a virtual machine), so it might take some time before I have a chance to solve this.

Cheers,
Erik
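
For a rough sense of the memory involved, the sketch below computes the size of a single buffer covering the padded "job" volume that dw prints at --verbose 2. The two job sizes are taken from logs posted later in this thread. It assumes 4 bytes per voxel (the logs report sizeof(float) = 4); how many such buffers dw actually allocates on the GPU is not stated here, so treat the result as a per-buffer lower bound, not a total.

```python
from math import prod


def buffer_mib(shape, bytes_per_voxel=4):
    """Size in MiB of one dense buffer with the given voxel dimensions."""
    return prod(shape) * bytes_per_voxel / 2**20


# "job" sizes as printed by dw at --verbose 2 in this thread.
# bytes_per_voxel=4 is an assumption (single-precision float).
jobs = {
    "2048x2048x12 image, untiled": (2228, 2228, 34),
    "one tile with --tilesize 1024": (1224, 1224, 124),
}

for label, shape in jobs.items():
    print(f"{label}: {buffer_mib(shape):.0f} MiB per buffer")
```

Multiplying the per-buffer figure by a guessed number of working buffers and comparing it with the memory of the graphics card gives a quick plausibility check before a long run.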

@yifanChengUSTC

I had the same problem. The terminal returned a negative value, and I don't know what went wrong.

@elgw
Owner

elgw commented Oct 11, 2024

Hello!

I need a little more information to figure out what is going on. Please add the command line argument --verbose 2 when you run dw and then paste the output here.

Cheers,
Erik
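
When dw is launched from a script rather than typed into a terminal, its output and exit code are easy to lose. The following is a minimal sketch of one way to keep both. The helper names are hypothetical, and the command uses only flags that appear in this thread (--iter, --gpu, --verbose); it assumes dw is on the PATH.

```python
import subprocess


def run_and_capture(cmd):
    """Run cmd and return (exit_code, combined stdout and stderr text)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error messages in order with normal output
        text=True,
    )
    return proc.returncode, proc.stdout


def dw_gpu_command(image, psf, n_iter=5):
    """Build a dw command line; image and psf are paths chosen by the caller."""
    return ["dw", "--iter", str(n_iter), "--gpu", "--verbose", "2", image, psf]
```

Calling run_and_capture(dw_gpu_command("test3.tif", "PSF.tiff")) with your own file names returns the exit code together with everything dw printed, which is the text to paste into the issue.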

@elgw elgw changed the title GPU mode not working GPU mode not working [Windows] Oct 11, 2024
@elgw elgw changed the title GPU mode not working [Windows] GPU mode not working Oct 11, 2024
@elgw elgw added the win11 Windows 11 specific label Oct 11, 2024
@yifanChengUSTC

yifanChengUSTC commented Oct 15, 2024

This is my output

Settings: 
image:  D:\16\16.tif 
psf:    D:\16\16psf\GFP-bandpass.tif 
output: D:\16\16dw\16_30.tif 
log file: D:\16\16dw\16_30.tif.log.txt 
nIter:  30 
nThreads for FFT: 4 
nThreads for OMP: 4 
verbosity: 2 
background level: auto 
method: Scaled Heavy Ball + OpenCL (SHBCL2) 
metric: MSE 
Stopping after 30 iterations 
overwrite: NO 
tiling, maxSize: 3000 
tiling, padding: 20 
XY crop factor: 0.001000 
Offset: 5.000000 
Output Format: 16 bit integer 
Scaling: Automatic 
Border Quality: 2 Minimal boundary artifacts 
FFT lookahead: 0 
FFTW3 plan: FFTW_MEASURE 
Initial guess: Flat average 
 
deconwolf: '0.4.3' 
BUILD_DATE: 'Jun 22 2024' 
TIFF Backend: 'LIBTIFF, Version 4.6.0 
Copyright (c) 1988-1996 Sam Leffler 
Copyright (c) 1991-1996 Silicon Graphics, Inc.' 
OpenMP: YES 
OpenCL: YES 
VkFFT: YES 
sizeof(int) = 4 
sizeof(float) = 4 
sizeof(double) = 8 
sizeof(size_t) = 8 
 
Image dimensions: 2048 x 2048 x 12 
Reading D:\16\16.tif 
Reading D:\16\16psf\GFP-bandpass.tif 
PSF Z-crop [181 x 181 x 39] -> [181 x 181 x 23] 
PSF X-crop: Not cropping 
Output: D:\16\16dw\16_30.tif(.log.txt) 
Deconvolving using shbcl2 (using inplace) 
Setting the background level to 105.000000 
image: [2048x2048x12], psf: [181x181x23], job: [2228x2228x34] 
Found 1 CL platforms 
Found 1 CL devices 
Will use device 0 (first = 0) 

ans =

  -1.0737e+09

@elgw
Owner

elgw commented Oct 15, 2024

Thanks, that will give me some hint on where to look. Normally it would spit out some info about the graphics card like this:

image: [101x201x40], psf: [179x179x79], job: [279x379x118]
Found 1 CL platforms
Found 1 CL devices
Will use device 0 (first = 0)
CL device #0
CL_DEVICE_TYPE=CL_DEVICE_TYPE_GPU
CL_DEVICE_GLOBAL_MEM_SIZE = 12868124672 (12868 MiB)
CL_DEVICE_NAME = gfx1031
CL_DEVICE_VENDOR = Advanced Micro Devices, Inc.
CL_DRIVER_VERSION = 3614.0 (HSA1.1,LC)
CL_DEVICE_EXTENSIONS = cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program 

What graphics card do you use and are you able to use other programs relying on OpenCL?

Where does the last part come from, i.e.:

ans =

  -1.0737e+09

Are you running the program from MATLAB? If that is the case, see if it works better when run directly from PowerShell.
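
A note on the `ans = -1.0737e+09` value: MATLAB's system() returns the exit status of the command, and MATLAB displays it when the call is not terminated with a semicolon. On Windows, a large negative exit status is often an NTSTATUS crash code read as a signed 32-bit integer. Whether that is what happened here is an assumption, and the five digits shown are not enough to tell which code it was. The sketch below shows that several well-known codes all print as -1.0737e+09; the table is illustrative, not exhaustive.

```python
# A few well-known NTSTATUS values (illustrative selection only).
KNOWN = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC0000135: "STATUS_DLL_NOT_FOUND",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def as_ntstatus(exit_code):
    """Reinterpret a signed 32-bit exit code as an unsigned NTSTATUS value."""
    return exit_code & 0xFFFFFFFF


def as_signed(ntstatus):
    """Inverse mapping: the signed value a caller such as MATLAB would report."""
    return ntstatus - 2**32 if ntstatus >= 2**31 else ntstatus


for status, name in KNOWN.items():
    signed = as_signed(status)
    # "%.4e" mimics a display with five significant digits
    print(f"0x{status:08X} {name}: {signed} -> {signed:.4e}")
```

To tell these apart, capture the status in a variable and print it as a full integer, for example with fprintf('%d\n', status) in MATLAB, then pass it through as_ntstatus and look up the hexadecimal value.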

@yifanChengUSTC

My graphics card is a GTX1650Ti, and I run the command line from MATLAB, for example:

system(['dw_bw --lambda 750.000000 --NA 0.950 --ni 1.000 --resxy 161.0 --resz 1500.0 "' file_path 'psf\Cy7-bandpass.tif" '] )
system(['dw --verbose 2 --mse --gpu --tilesize 3000 --iter 35 --out ' temp_out_name ' "' file '" "' file_path 'psf\Cy5-bandpass.tif"' ])

At first I guessed that my graphics card could not run it, but another user in the comments had the same problem, so maybe it is a bug.

@JB-Git-15

JB-Git-15 commented Dec 17, 2024

Hi, I had managed to run it successfully on the GPU, but today it stopped working unexpectedly. Could you help me?
Best regards
Jacques

Settings:
image: /home/jacques/Documents/FISH/Data_analysis/pipeline_smfish/2024-10-18_hek-xpo1/Analysis/temp/2020_11_12_HEK293nT_XPO1_RIGI_IFIH-1_18well_ibidi.lif-A1_XPO1_treated_1_DAPI.tiff
psf: /home/jacques/Documents/FISH/Data_analysis/pipeline_smfish/2024-10-18_hek-xpo1/Analysis/temp_psf/Psf_DAPI_460_z300_xy107_na14_ni15.tif
output: ../Analysis/temp/dw_2020_11_12_HEK293nT_XPO1_RIGI_IFIH-1_18well_ibidi.lif-A1_XPO1_treated_1_DAPI.tiff
log file: ../Analysis/temp/dw_2020_11_12_HEK293nT_XPO1_RIGI_IFIH-1_18well_ibidi.lif-A1_XPO1_treated_1_DAPI.tiff.log.txt
nIter: 25
nThreads for FFT: 24
nThreads for OMP: 24
verbosity: 2
background level: auto
method: Scaled Heavy Ball + OpenCL (SHBCL2)
metric: Idiv
Stopping after 25 iterations
overwrite: YES
tiling, maxSize: 1024
tiling, padding: 20
XY crop factor: 0.001000
Offset: 5.000000
Output Format: 16 bit integer
Scaling: Automatic
Border Quality: 2 Minimal boundary artifacts
FFT lookahead: 0
FFTW3 plan: FFTW_MEASURE
Initial guess: Flat average

deconwolf: '0.4.3'
BUILD_DATE: 'Nov 26 2024'
FFT Backend: 'fftw-3.3.10-sse2-avx'
TIFF Backend: 'LIBTIFF, Version 4.7.0
Copyright (c) 1988-1996 Sam Leffler
Copyright (c) 1991-1996 Silicon Graphics, Inc.'
OpenMP: YES
OpenCL: YES
VkFFT: YES
sizeof(int) = 4
sizeof(float) = 4
sizeof(double) = 8
sizeof(size_t) = 8

Image dimensions: 2048 x 2048 x 42
Reading /home/jacques/Documents/FISH/Data_analysis/pipeline_smfish/2024-10-18_hek-xpo1/Analysis/temp_psf/Psf_DAPI_460_z300_xy107_na14_ni15.tif
PSF Z-crop [181 x 181 x 183] -> [181 x 181 x 83]
PSF X-crop: Not cropping
Output: ../Analysis/temp/dw_2020_11_12_HEK293nT_XPO1_RIGI_IFIH-1_18well_ibidi.lif-A1_XPO1_treated_1_DAPI.tiff(.log.txt)
using fftw-3.3.10-sse2-avx with 24 threads
-> Divided the [2048 x 2048 x 42] image into 4 tiles
Initializing ../Analysis/temp/dw_2020_11_12_HEK293nT_XPO1_RIGI_IFIH-1_18well_ibidi.lif-A1_XPO1_treated_1_DAPI.tiff.raw to 0
Dumping /home/jacques/Documents/FISH/Data_analysis/pipeline_smfish/2024-10-18_hek-xpo1/Analysis/temp/2020_11_12_HEK293nT_XPO1_RIGI_IFIH-1_18well_ibidi.lif-A1_XPO1_treated_1_DAPI.tiff to /home/jacques/Documents/FISH/Data_analysis/pipeline_smfish/2024-10-18_hek-xpo1/Analysis/temp/2020_11_12_HEK293nT_XPO1_RIGI_IFIH-1_18well_ibidi.lif-A1_XPO1_treated_1_DAPI.tiff.raw (for quicker io)

-> Processing tile 1 / 4
PSF X-crop: Not cropping
Deconvolving using shbcl2 (using inplace)
Setting the background level to 278.000000
image: [1044x1044x42], psf: [181x181x83], job: [1224x1224x124]
OpenCl error=CL_PLATFORM_NOT_FOUND_KHR
Error running command: Command '['dw', '--iter', '25', '../Analysis/temp/2020_11_12_HEK293nT_XPO1_RIGI_IFIH-1_18well_ibidi.lif-A1_XPO1_treated_1_DAPI.tiff', '../Analysis/temp_psf/Psf_DAPI_460_z300_xy107_na14_ni15.tif', '--overwrite', '--verbose', '2', '--gpu', '--tilesize', '1024', '--out', '../Analysis/temp/dw_2020_11_12_HEK293nT_XPO1_RIGI_IFIH-1_18well_ibidi.lif-A1_XPO1_treated_1_DAPI.tiff']' returned non-zero exit status 1.
Result of deconvolution: file ../Analysis/temp/dw_2020_11_12_HEK293nT_XPO1_RIGI_IFIH-1_18well_ibidi.lif-A1_XPO1_treated_1_DAPI.tiff created. Execution time: 1.61 seconds.

Sorry! There was an unrecoverable error!
File: /home/jacques/Documents/Programmes/Deconwolf/deconwolf/src/cl_util.c
Function: clu_new at line 1275

If you are sure that OpenCL works on this machine
and that it is a problem only related to deconwolf,
check open issues or create a new one at
https://github.com/elgw/deconwolf/issues

@elgw
Owner

elgw commented Dec 18, 2024

Hi Jacques,

The error message from OpenCL is CL_PLATFORM_NOT_FOUND_KHR, which suggests that there is a problem with the GPU driver (did you update something?). Please check what you get from the command clinfo.

By default deconwolf will use the first device from the first platform. That is probably fine for most simple setups, like my computer, where clinfo reports:

$ clinfo
Number of platforms                               1
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.1 AMD-APP (3614.0)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback 
  Platform Extensions function suffix             AMD
  Platform Host timer resolution                  1ns

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 1
  Device Name                                     gfx1031
  Device Vendor                                   Advanced Micro Devices, Inc.
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 2.0 
  Driver Version                                  3614.0 (HSA1.1,LC)
  Device OpenCL C Version                         OpenCL C 2.0 
  Device Type                                     GPU
  Device Board Name (AMD)                         AMD Radeon RX 6700 XT
...

If you have more than one device to select from you can specify which to use with the command line argument --cldevice.

Cheers,
Erik
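
The clinfo check can also be scripted, so that a pipeline only passes --gpu when an OpenCL platform is actually visible. This is a minimal sketch with hypothetical helper names. It assumes clinfo prints a "Number of platforms" line in the format shown above, and that the caller prefers to fall back to the CPU path by omitting --gpu.

```python
import re
import shutil
import subprocess


def count_platforms(clinfo_text):
    """Return the platform count from clinfo output, or None if the line is absent."""
    match = re.search(r"^Number of platforms\s+(\d+)", clinfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def opencl_platforms():
    """Run clinfo if it is installed; None means the check could not be done."""
    if shutil.which("clinfo") is None:
        return None
    result = subprocess.run(["clinfo"], capture_output=True, text=True)
    return count_platforms(result.stdout)


def gpu_flags(n_platforms):
    """Pass --gpu only when at least one platform was found."""
    return ["--gpu"] if n_platforms else []


# Excerpt of the output quoted above, used here as sample input.
sample = (
    "Number of platforms                               1\n"
    "  Platform Name                                   AMD Accelerated Parallel Processing\n"
)
print(count_platforms(sample), gpu_flags(count_platforms(sample)))
```

A result of 0 or None from opencl_platforms() points at the driver or OpenCL installation and not at deconwolf, which matches the CL_PLATFORM_NOT_FOUND_KHR error above.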

@JB-Git-15

Hi Erik, thanks a lot!
