Hi,

Here are my numbers. The first thing to say is that it «feels» faster.
System is CPU: i7-4790K, GPU: GeForce GTX 960 (2GB RAM)

With HEAD:
==================
With OpenCL:

1 pass, no color smoothing
[dev] took 0.000 secs (0.000 CPU) to load the image.
[export] creating pixelpipe took 0.009 secs (0.017 CPU)
[pixelpipe_process] [export] using device 0
[dev_pixelpipe] took 0.000 secs (-0.000 CPU) initing base buffer [export]
[dev_pixelpipe] took 0.011 secs (0.027 CPU) processed `raw black/white point' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0.028 secs (0.110 CPU) processed `white balance' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.005 secs (0.037 CPU) processed `highlight reconstruction' on CPU, blended on CPU [export]
[default_process_tiling_cl_ptp] use tiling on module 'demosaic' for image with full size 4936 x 3296
[default_process_tiling_cl_ptp] (3 x 1) tiles with max dimensions 2184 x 3296 and overlap 12
[default_process_tiling_cl_ptp] tile (0, 0) with 2184 x 3296 at origin [0, 0]
[default_process_tiling_cl_ptp] tile (1, 0) with 2184 x 3296 at origin [2160, 0]
[default_process_tiling_cl_ptp] tile (2, 0) with 616 x 3296 at origin [4320, 0]
[dev_pixelpipe] took 0.421 secs (0.773 CPU) processed `demosaic' on GPU with tiling, blended on CPU [export]
[default_process_tiling_cl_roi] use tiling on module 'lens' for image with full input size 4936 x 3296
[default_process_tiling_cl_roi] (2 x 1) tiles with max input dimensions 3085 x 3296
[default_process_tiling_cl_roi] tile (0, 0) with 2476 x 3296 at origin [0, 0]
[default_process_tiling_cl_roi] tile (1, 0) with 2476 x 3296 at origin [2460, 0]
[dev_pixelpipe] took 0.087 secs (0.193 CPU) processed `lens correction' on GPU with tiling, blended on CPU [export]
[dev_pixelpipe] took 0.046 secs (0.107 CPU) processed `base curve' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0.009 secs (0.027 CPU) processed `input color profile' on GPU, blended on GPU [export]
[dev_pixelpipe] took 1.161 secs (8.483 CPU) processed `output color profile' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.022 secs (0.113 CPU) processed `gamma' on CPU, blended on CPU [export]
[opencl_profiling] spent 0.0863 seconds in [Write Image (from host to device)]
[opencl_profiling] spent 0.0017 seconds in rawprepare_1f
[opencl_profiling] spent 0.1325 seconds in [Read Image (from device to host)]
[opencl_profiling] spent 0.0067 seconds in markesteijn_initial_copy
[opencl_profiling] spent 0.0229 seconds in [Copy Buffer to Buffer (on device)]
[opencl_profiling] spent 0.0146 seconds in markesteijn_green_minmax
[opencl_profiling] spent 0.0351 seconds in markesteijn_interpolate_green
[opencl_profiling] spent 0.0539 seconds in markesteijn_solitary_green
[opencl_profiling] spent 0.0347 seconds in markesteijn_red_and_blue
[opencl_profiling] spent 0.0152 seconds in markesteijn_interpolate_twoxtwo
[opencl_profiling] spent 0.0285 seconds in markesteijn_convert_yuv
[opencl_profiling] spent 0.0266 seconds in markesteijn_differentiate
[opencl_profiling] spent 0.0084 seconds in markesteijn_homo_threshold
[opencl_profiling] spent 0.0183 seconds in markesteijn_homo_set
[opencl_profiling] spent 0.0176 seconds in markesteijn_homo_sum
[opencl_profiling] spent 0.0040 seconds in markesteijn_homo_max
[opencl_profiling] spent 0.0008 seconds in markesteijn_homo_max_corr
[opencl_profiling] spent 0.0032 seconds in markesteijn_zero
[opencl_profiling] spent 0.0454 seconds in markesteijn_accu
[opencl_profiling] spent 0.0071 seconds in markesteijn_final
[opencl_profiling] spent 0.0072 seconds in [Copy Image (on device)]
[opencl_profiling] spent 0.0013 seconds in vng_lin_interpolate
[opencl_profiling] spent 0.0053 seconds in vng_interpolate
[opencl_profiling] spent 0.0008 seconds in vng_border_interpolate
[opencl_profiling] spent 0.0108 seconds in basecurve
[opencl_profiling] spent 0.0071 seconds in colorin_unbound
[opencl_profiling] spent 0.5960 seconds totally in command queue (with 0 events missing)
[dev_process_export] pixel pipeline processing took 1.790 secs (9.870 CPU)
[export_job] exported to `/home/marc/Downloads/darktable_exported/2015_12/2015-12-31_20_04_13_XE025304-1_01.jpg'

3 pass, 5 times color smoothing
[dev] took 0.000 secs (-0.000 CPU) to load the image.
[export] creating pixelpipe took 0.010 secs (0.020 CPU)
[pixelpipe_process] [export] using device 0
[dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
[dev_pixelpipe] took 0.007 secs (0.017 CPU) processed `raw black/white point' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0.026 secs (0.120 CPU) processed `white balance' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.005 secs (0.027 CPU) processed `highlight reconstruction' on CPU, blended on CPU [export]
[default_process_tiling_cl_ptp] use tiling on module 'demosaic' for image with full size 4936 x 3296
[default_process_tiling_cl_ptp] (2 x 2) tiles with max dimensions 2616 x 1746 and overlap 12
[default_process_tiling_cl_ptp] tile (0, 0) with 2616 x 1746 at origin [0, 0]
[default_process_tiling_cl_ptp] tile (0, 1) with 2616 x 1574 at origin [0, 1722]
[default_process_tiling_cl_ptp] tile (1, 0) with 2344 x 1746 at origin [2592, 0]
[default_process_tiling_cl_ptp] tile (1, 1) with 2344 x 1574 at origin [2592, 1722]
[dev_pixelpipe] took 0.954 secs (1.370 CPU) processed `demosaic' on GPU with tiling, blended on CPU [export]
[default_process_tiling_cl_roi] use tiling on module 'lens' for image with full input size 4936 x 3296
[default_process_tiling_cl_roi] (2 x 1) tiles with max input dimensions 3085 x 3296
[default_process_tiling_cl_roi] tile (0, 0) with 2476 x 3296 at origin [0, 0]
[default_process_tiling_cl_roi] tile (1, 0) with 2476 x 3296 at origin [2460, 0]
[dev_pixelpipe] took 0.081 secs (0.080 CPU) processed `lens correction' on GPU with tiling, blended on CPU [export]
[dev_pixelpipe] took 0.038 secs (0.043 CPU) processed `base curve' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0.009 secs (0.010 CPU) processed `input color profile' on GPU, blended on GPU [export]
[dev_pixelpipe] took 1.116 secs (8.237 CPU) processed `output color profile' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.019 secs (0.117 CPU) processed `gamma' on CPU, blended on CPU [export]
[opencl_profiling] spent 0.0761 seconds in [Write Image (from host to device)]
[opencl_profiling] spent 0.0015 seconds in rawprepare_1f
[opencl_profiling] spent 0.1176 seconds in [Read Image (from device to host)]
[opencl_profiling] spent 0.0058 seconds in markesteijn_initial_copy
[opencl_profiling] spent 0.0466 seconds in [Copy Buffer to Buffer (on device)]
[opencl_profiling] spent 0.0117 seconds in markesteijn_green_minmax
[opencl_profiling] spent 0.0426 seconds in markesteijn_interpolate_green
[opencl_profiling] spent 0.1651 seconds in markesteijn_solitary_green
[opencl_profiling] spent 0.0893 seconds in markesteijn_red_and_blue
[opencl_profiling] spent 0.0943 seconds in markesteijn_interpolate_twoxtwo
[opencl_profiling] spent 0.0639 seconds in markesteijn_recalculate_green
[opencl_profiling] spent 0.0534 seconds in markesteijn_convert_yuv
[opencl_profiling] spent 0.0542 seconds in markesteijn_differentiate
[opencl_profiling] spent 0.0193 seconds in markesteijn_homo_threshold
[opencl_profiling] spent 0.0333 seconds in markesteijn_homo_set
[opencl_profiling] spent 0.0381 seconds in markesteijn_homo_sum
[opencl_profiling] spent 0.0076 seconds in markesteijn_homo_max
[opencl_profiling] spent 0.0008 seconds in markesteijn_homo_max_corr
[opencl_profiling] spent 0.0046 seconds in markesteijn_homo_quench
[opencl_profiling] spent 0.0031 seconds in markesteijn_zero
[opencl_profiling] spent 0.0842 seconds in markesteijn_accu
[opencl_profiling] spent 0.0064 seconds in markesteijn_final
[opencl_profiling] spent 0.0143 seconds in [Copy Image (on device)]
[opencl_profiling] spent 0.0014 seconds in vng_lin_interpolate
[opencl_profiling] spent 0.0054 seconds in vng_interpolate
[opencl_profiling] spent 0.0008 seconds in vng_border_interpolate
[opencl_profiling] spent 0.0414 seconds in color_smoothing
[opencl_profiling] spent 0.0073 seconds in basecurve
[opencl_profiling] spent 0.0072 seconds in colorin_unbound
[opencl_profiling] spent 1.0972 seconds totally in command queue (with 0 events missing)
[dev_process_export] pixel pipeline processing took 2.256 secs (10.020 CPU)
[export_job] exported to `/home/marc/Downloads/darktable_exported/2015_12/2015-12-31_20_04_13_XE025304-1.jpg'

==================
Without OpenCL:

One pass:
[dev] took 0.000 secs (0.000 CPU) to load the image.
[export] creating pixelpipe took 0.009 secs (0.013 CPU)
[pixelpipe_process] [export] using device -1
[dev_pixelpipe] took 0.000 secs (-0.000 CPU) initing base buffer [export]
[dev_pixelpipe] took 0.006 secs (0.037 CPU) processed `raw black/white point' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.013 secs (0.083 CPU) processed `white balance' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.006 secs (0.033 CPU) processed `highlight reconstruction' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.425 secs (3.253 CPU) processed `demosaic' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.035 secs (0.073 CPU) processed `lens correction' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.038 secs (0.217 CPU) processed `base curve' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.037 secs (0.247 CPU) processed `input color profile' on CPU, blended on CPU [export]
[dev_pixelpipe] took 1.185 secs (8.370 CPU) processed `output color profile' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.018 secs (0.127 CPU) processed `gamma' on CPU, blended on CPU [export]
[dev_process_export] pixel pipeline processing took 1.763 secs (12.440 CPU)
[export_job] exported to `/home/marc/Downloads/darktable_exported/2015_12/2015-12-31_20_04_13_XE025304-1_02.jpg'

3 pass:
[export] creating pixelpipe took 0.009 secs (0.007 CPU)
[pixelpipe_process] [export] using device -1
[dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
[dev_pixelpipe] took 0.006 secs (0.023 CPU) processed `raw black/white point' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.017 secs (0.080 CPU) processed `white balance' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.006 secs (0.033 CPU) processed `highlight reconstruction' on CPU, blended on CPU [export]
[dev_pixelpipe] took 2.028 secs (13.267 CPU) processed `demosaic' on CPU with tiling, blended on CPU [export]
[dev_pixelpipe] took 0.031 secs (0.023 CPU) processed `lens correction' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.033 secs (0.217 CPU) processed `base curve' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.033 secs (0.213 CPU) processed `input color profile' on CPU, blended on CPU [export]
[dev_pixelpipe] took 1.102 secs (8.213 CPU) processed `output color profile' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.022 secs (0.120 CPU) processed `gamma' on CPU, blended on CPU [export]
[dev_process_export] pixel pipeline processing took 3.277 secs (22.190 CPU)
[export_job] exported to `/home/marc/Downloads/darktable_exported/2015_12/2015-12-31_20_04_13_XE025304-1.jpg'

==================
For reference, with OpenCL, with darktable 2.0, 3 pass, 5 times color smoothing:
[export] creating pixelpipe took 0.009 secs (0.017 CPU)
[pixelpipe_process] [export] using device 0
[dev_pixelpipe] took 0.000 secs (-0.000 CPU) initing base buffer [export]
[dev_pixelpipe] took 0.007 secs (0.010 CPU) processed `raw black/white point' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0.028 secs (0.113 CPU) processed `white balance' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.005 secs (0.030 CPU) processed `highlight reconstruction' on CPU, blended on CPU [export]
[dev_pixelpipe] took 2.082 secs (13.793 CPU) processed `demosaic' on CPU, blended on CPU [export]
[default_process_tiling_cl_roi] use tiling on module 'lens' for image with full input size 4936 x 3296
[default_process_tiling_cl_roi] (2 x 1) tiles with max input dimensions 3085 x 3296
[default_process_tiling_cl_roi] tile (0, 0) with 2476 x 3296 at origin [0, 0]
[default_process_tiling_cl_roi] tile (1, 0) with 2476 x 3296 at origin [2460, 0]
[dev_pixelpipe] took 0.080 secs (0.090 CPU) processed `lens correction' on GPU with tiling, blended on CPU [export]
[dev_pixelpipe] took 0.039 secs (0.050 CPU) processed `base curve' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0.010 secs (-0.000 CPU) processed `input color profile' on GPU, blended on GPU [export]
[dev_pixelpipe] took 1.152 secs (8.230 CPU) processed `output color profile' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.023 secs (0.117 CPU) processed `gamma' on CPU, blended on CPU [export]
[opencl_profiling] spent 0.0684 seconds in [Write Image (from host to device)]
[opencl_profiling] spent 0.0017 seconds in rawprepare_1f
[opencl_profiling] spent 0.0786 seconds in [Read Image (from device to host)]
[opencl_profiling] spent 0.0065 seconds in [Copy Image (on device)]
[opencl_profiling] spent 0.0074 seconds in basecurve
[opencl_profiling] spent 0.0079 seconds in colorin_unbound
[opencl_profiling] spent 0.1705 seconds totally in command queue (with 0 events missing)
[dev_process_export] pixel pipeline processing took 3.427 secs (22.433 CPU)
[export_job] exported to `/home/marc/Downloads/darktable_exported/2015_12/2015-12-31_20_04_13_XE025304-1.jpg'

The GUI also feels more responsive in darkroom… around half a second to render with a few more elements in the stack.
[pixelpipe_process] [full] using device 0
[dev_pixelpipe] took 0.001 secs (0.000 CPU) initing base buffer [full]
[dev_pixelpipe] took 0.002 secs (0.003 CPU) processed `raw black/white point' on GPU, blended on GPU [full]
[dev_pixelpipe] took 0.005 secs (0.017 CPU) processed `white balance' on CPU, blended on CPU [full]
[dev_pixelpipe] took 0.136 secs (0.103 CPU) processed `demosaic' on GPU, blended on GPU [full]
[dev_pixelpipe] took 0.039 secs (0.030 CPU) processed `denoise (profiled)' on GPU, blended on GPU [full]
[dev_pixelpipe] took 0.002 secs (0.000 CPU) processed `exposure' on GPU, blended on GPU [full]
[dev_pixelpipe] took 0.002 secs (0.000 CPU) processed `lens correction' on GPU, blended on GPU [full]
[dev_pixelpipe] took 0.002 secs (0.003 CPU) processed `base curve' on GPU, blended on GPU [full]
[dev_pixelpipe] took 0.003 secs (-0.000 CPU) processed `input color profile' on GPU, blended on GPU [full]
[dev_pixelpipe] took 0.187 secs (1.113 CPU) processed `defringe' on CPU, blended on CPU [full]
[dev_pixelpipe] took 0.005 secs (0.003 CPU) processed `lowlight vision' on GPU, blended on GPU [full]
[dev_pixelpipe] took 0.003 secs (0.003 CPU) processed `tone curve' on GPU, blended on GPU [full]
[dev_pixelpipe] took 0.005 secs (0.003 CPU) processed `sharpen' on GPU, blended on GPU [full]
[dev_pixelpipe] took 0.110 secs (0.680 CPU) processed `output color profile' on CPU, blended on CPU [full]
[dev_pixelpipe] took 0.004 secs (0.013 CPU) processed `gamma' on CPU, blended on CPU [full]
[opencl_profiling] spent 0.0058 seconds in [Write Image (from host to device)]
[opencl_profiling] spent 0.0002 seconds in rawprepare_1f
[opencl_profiling] spent 0.0086 seconds in [Read Image (from device to host)]
[opencl_profiling] spent 0.0008 seconds in markesteijn_initial_copy
[opencl_profiling] spent 0.0086 seconds in [Copy Buffer to Buffer (on device)]
[opencl_profiling] spent 0.0017 seconds in markesteijn_green_minmax
[opencl_profiling] spent 0.0057 seconds in markesteijn_interpolate_green
[opencl_profiling] spent 0.0238 seconds in markesteijn_solitary_green
[opencl_profiling] spent 0.0118 seconds in markesteijn_red_and_blue
[opencl_profiling] spent 0.0128 seconds in markesteijn_interpolate_twoxtwo
[opencl_profiling] spent 0.0086 seconds in markesteijn_recalculate_green
[opencl_profiling] spent 0.0079 seconds in markesteijn_convert_yuv
[opencl_profiling] spent 0.0076 seconds in markesteijn_differentiate
[opencl_profiling] spent 0.0025 seconds in markesteijn_homo_threshold
[opencl_profiling] spent 0.0044 seconds in markesteijn_homo_set
[opencl_profiling] spent 0.0046 seconds in markesteijn_homo_sum
[opencl_profiling] spent 0.0009 seconds in markesteijn_homo_max
[opencl_profiling] spent 0.0001 seconds in markesteijn_homo_max_corr
[opencl_profiling] spent 0.0005 seconds in markesteijn_homo_quench
[opencl_profiling] spent 0.0004 seconds in markesteijn_zero
[opencl_profiling] spent 0.0114 seconds in markesteijn_accu
[opencl_profiling] spent 0.0009 seconds in markesteijn_final
[opencl_profiling] spent 0.0018 seconds in [Copy Image (on device)]
[opencl_profiling] spent 0.0003 seconds in vng_lin_interpolate
[opencl_profiling] spent 0.0010 seconds in vng_interpolate
[opencl_profiling] spent 0.0002 seconds in vng_border_interpolate
[opencl_profiling] spent 0.0049 seconds in color_smoothing
[opencl_profiling] spent 0.0069 seconds in interpolation_resample
[opencl_profiling] spent 0.0008 seconds in denoiseprofile_precondition
[opencl_profiling] spent 0.0241 seconds in denoiseprofile_decompose
[opencl_profiling] spent 0.0026 seconds in denoiseprofile_reduce_first
[opencl_profiling] spent 0.0000 seconds in denoiseprofile_reduce_second
[opencl_profiling] spent 0.0000 seconds in [Read Buffer (from device to host)]
[opencl_profiling] spent 0.0070 seconds in denoiseprofile_synthesize
[opencl_profiling] spent 0.0008 seconds in denoiseprofile_backtransform
[opencl_profiling] spent 0.0001 seconds in blendop_set_mask
[opencl_profiling] spent 0.0014 seconds in blendop_rgb
[opencl_profiling] spent 0.0008 seconds in exposure
[opencl_profiling] spent 0.0008 seconds in basecurve
[opencl_profiling] spent 0.0009 seconds in colorin_unbound
[opencl_profiling] spent 0.0009 seconds in lowlight
[opencl_profiling] spent 0.0013 seconds in tonecurve
[opencl_profiling] spent 0.0010 seconds in sharpen_hblur
[opencl_profiling] spent 0.0011 seconds in sharpen_vblur
[opencl_profiling] spent 0.0013 seconds in sharpen_mix
[opencl_profiling] spent 0.1896 seconds totally in command queue (with 0 events missing)
[dev_process_image] pixel pipeline processing took 0.506 secs (1.973 CPU)

Regards
___________________________________________________________________________
darktable developer mailing list
to unsubscribe send a mail to darktable-dev+unsubscr...@lists.darktable.org
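
For anyone who wants to compare these runs without reading the logs by eye, here is a minimal sketch that pulls the per-module wall-clock and CPU times out of the `[dev_pixelpipe]` lines and computes a speedup ratio. The names `PIPE_RE`, `module_timings` and `speedup` are hypothetical helpers, not part of darktable, and the regular expression only assumes the line format seen in the logs above. The two sample strings are the 3-pass `demosaic` lines copied from the HEAD run with OpenCL and from the darktable 2.0 reference run.

```python
import re

# Pattern for lines such as:
# [dev_pixelpipe] took 0.954 secs (1.370 CPU) processed `demosaic' on GPU with tiling, ...
# It does not depend on line breaks, so it also works on a flattened log.
PIPE_RE = re.compile(
    r"\[dev_pixelpipe\] took (-?\d+\.\d+) secs \((-?\d+\.\d+) CPU\) "
    r"processed `([^']+)' on (GPU|CPU)"
)


def module_timings(log_text):
    """Map module name -> (device, wall_secs, cpu_secs) for one pipeline run."""
    return {
        m.group(3): (m.group(4), float(m.group(1)), float(m.group(2)))
        for m in PIPE_RE.finditer(log_text)
    }


def speedup(baseline_secs, new_secs):
    """How many times faster the new run is than the baseline."""
    return baseline_secs / new_secs


# 3-pass demosaic lines copied from the logs in this mail.
SAMPLE_HEAD = ("[dev_pixelpipe] took 0.954 secs (1.370 CPU) processed "
               "`demosaic' on GPU with tiling, blended on CPU [export]")
SAMPLE_20 = ("[dev_pixelpipe] took 2.082 secs (13.793 CPU) processed "
             "`demosaic' on CPU, blended on CPU [export]")

if __name__ == "__main__":
    head = module_timings(SAMPLE_HEAD)["demosaic"]
    ref = module_timings(SAMPLE_20)["demosaic"]
    # Demosaic alone: 2.082 s on CPU (2.0) vs 0.954 s on GPU (HEAD).
    print(f"demosaic speedup: {speedup(ref[1], head[1]):.2f}x")  # 2.18x
    # Whole export pipeline: 3.427 s (2.0) vs 2.256 s (HEAD), both 3-pass with OpenCL.
    print(f"pipeline speedup: {speedup(3.427, 2.256):.2f}x")     # 1.52x
```

Passing a whole run's log text to `module_timings` gives one entry per module, which makes it easy to see that `output color profile` (about 1.1 s on CPU in every export above) is the largest remaining cost once demosaic runs on the GPU.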