Hi,

Here are my numbers. The first thing to say is that it «feels» faster.


System: CPU i7-4790K, GPU GeForce GTX 960 (2 GB RAM)


With HEAD:

==================
With OpenCL:

1 pass, no color smoothing

[dev] took 0.000 secs (0.000 CPU) to load the image.
[export] creating pixelpipe took 0.009 secs (0.017 CPU)
[pixelpipe_process] [export] using device 0
[dev_pixelpipe] took 0.000 secs (-0.000 CPU) initing base buffer [export]
[dev_pixelpipe] took 0.011 secs (0.027 CPU) processed `raw black/white
point' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0.028 secs (0.110 CPU) processed `white balance' on
CPU, blended on CPU [export]
[dev_pixelpipe] took 0.005 secs (0.037 CPU) processed `highlight
reconstruction' on CPU, blended on CPU [export]
[default_process_tiling_cl_ptp] use tiling on module 'demosaic' for
image with full size 4936 x 3296
[default_process_tiling_cl_ptp] (3 x 1) tiles with max dimensions 2184 x
3296 and overlap 12
[default_process_tiling_cl_ptp] tile (0, 0) with 2184 x 3296 at origin
[0, 0]
[default_process_tiling_cl_ptp] tile (1, 0) with 2184 x 3296 at origin
[2160, 0]
[default_process_tiling_cl_ptp] tile (2, 0) with 616 x 3296 at origin
[4320, 0]
[dev_pixelpipe] took 0.421 secs (0.773 CPU) processed `demosaic' on GPU
with tiling, blended on CPU [export]
[default_process_tiling_cl_roi] use tiling on module 'lens' for image
with full input size 4936 x 3296
[default_process_tiling_cl_roi] (2 x 1) tiles with max input dimensions
3085 x 3296
[default_process_tiling_cl_roi] tile (0, 0) with 2476 x 3296 at origin
[0, 0]
[default_process_tiling_cl_roi] tile (1, 0) with 2476 x 3296 at origin
[2460, 0]
[dev_pixelpipe] took 0.087 secs (0.193 CPU) processed `lens correction'
on GPU with tiling, blended on CPU [export]
[dev_pixelpipe] took 0.046 secs (0.107 CPU) processed `base curve' on
GPU, blended on GPU [export]
[dev_pixelpipe] took 0.009 secs (0.027 CPU) processed `input color
profile' on GPU, blended on GPU [export]
[dev_pixelpipe] took 1.161 secs (8.483 CPU) processed `output color
profile' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.022 secs (0.113 CPU) processed `gamma' on CPU,
blended on CPU [export]
[opencl_profiling] spent  0.0863 seconds in [Write Image (from host to
device)]
[opencl_profiling] spent  0.0017 seconds in rawprepare_1f
[opencl_profiling] spent  0.1325 seconds in [Read Image (from device to
host)]
[opencl_profiling] spent  0.0067 seconds in markesteijn_initial_copy
[opencl_profiling] spent  0.0229 seconds in [Copy Buffer to Buffer (on
device)]
[opencl_profiling] spent  0.0146 seconds in markesteijn_green_minmax
[opencl_profiling] spent  0.0351 seconds in markesteijn_interpolate_green
[opencl_profiling] spent  0.0539 seconds in markesteijn_solitary_green
[opencl_profiling] spent  0.0347 seconds in markesteijn_red_and_blue
[opencl_profiling] spent  0.0152 seconds in markesteijn_interpolate_twoxtwo
[opencl_profiling] spent  0.0285 seconds in markesteijn_convert_yuv
[opencl_profiling] spent  0.0266 seconds in markesteijn_differentiate
[opencl_profiling] spent  0.0084 seconds in markesteijn_homo_threshold
[opencl_profiling] spent  0.0183 seconds in markesteijn_homo_set
[opencl_profiling] spent  0.0176 seconds in markesteijn_homo_sum
[opencl_profiling] spent  0.0040 seconds in markesteijn_homo_max
[opencl_profiling] spent  0.0008 seconds in markesteijn_homo_max_corr
[opencl_profiling] spent  0.0032 seconds in markesteijn_zero
[opencl_profiling] spent  0.0454 seconds in markesteijn_accu
[opencl_profiling] spent  0.0071 seconds in markesteijn_final
[opencl_profiling] spent  0.0072 seconds in [Copy Image (on device)]
[opencl_profiling] spent  0.0013 seconds in vng_lin_interpolate
[opencl_profiling] spent  0.0053 seconds in vng_interpolate
[opencl_profiling] spent  0.0008 seconds in vng_border_interpolate
[opencl_profiling] spent  0.0108 seconds in basecurve
[opencl_profiling] spent  0.0071 seconds in colorin_unbound
[opencl_profiling] spent  0.5960 seconds totally in command queue (with
0 events missing)
[dev_process_export] pixel pipeline processing took 1.790 secs (9.870 CPU)
[export_job] exported to
`/home/marc/Downloads/darktable_exported/2015_12/2015-12-31_20_04_13_XE025304-1_01.jpg'

3 pass, 5 times color smoothing

[dev] took 0.000 secs (-0.000 CPU) to load the image.
[export] creating pixelpipe took 0.010 secs (0.020 CPU)
[pixelpipe_process] [export] using device 0
[dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
[dev_pixelpipe] took 0.007 secs (0.017 CPU) processed `raw black/white
point' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0.026 secs (0.120 CPU) processed `white balance' on
CPU, blended on CPU [export]
[dev_pixelpipe] took 0.005 secs (0.027 CPU) processed `highlight
reconstruction' on CPU, blended on CPU [export]
[default_process_tiling_cl_ptp] use tiling on module 'demosaic' for
image with full size 4936 x 3296
[default_process_tiling_cl_ptp] (2 x 2) tiles with max dimensions 2616 x
1746 and overlap 12
[default_process_tiling_cl_ptp] tile (0, 0) with 2616 x 1746 at origin
[0, 0]
[default_process_tiling_cl_ptp] tile (0, 1) with 2616 x 1574 at origin
[0, 1722]
[default_process_tiling_cl_ptp] tile (1, 0) with 2344 x 1746 at origin
[2592, 0]
[default_process_tiling_cl_ptp] tile (1, 1) with 2344 x 1574 at origin
[2592, 1722]
[dev_pixelpipe] took 0.954 secs (1.370 CPU) processed `demosaic' on GPU
with tiling, blended on CPU [export]
[default_process_tiling_cl_roi] use tiling on module 'lens' for image
with full input size 4936 x 3296
[default_process_tiling_cl_roi] (2 x 1) tiles with max input dimensions
3085 x 3296
[default_process_tiling_cl_roi] tile (0, 0) with 2476 x 3296 at origin
[0, 0]
[default_process_tiling_cl_roi] tile (1, 0) with 2476 x 3296 at origin
[2460, 0]
[dev_pixelpipe] took 0.081 secs (0.080 CPU) processed `lens correction'
on GPU with tiling, blended on CPU [export]
[dev_pixelpipe] took 0.038 secs (0.043 CPU) processed `base curve' on
GPU, blended on GPU [export]
[dev_pixelpipe] took 0.009 secs (0.010 CPU) processed `input color
profile' on GPU, blended on GPU [export]
[dev_pixelpipe] took 1.116 secs (8.237 CPU) processed `output color
profile' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.019 secs (0.117 CPU) processed `gamma' on CPU,
blended on CPU [export]
[opencl_profiling] spent  0.0761 seconds in [Write Image (from host to
device)]
[opencl_profiling] spent  0.0015 seconds in rawprepare_1f
[opencl_profiling] spent  0.1176 seconds in [Read Image (from device to
host)]
[opencl_profiling] spent  0.0058 seconds in markesteijn_initial_copy
[opencl_profiling] spent  0.0466 seconds in [Copy Buffer to Buffer (on
device)]
[opencl_profiling] spent  0.0117 seconds in markesteijn_green_minmax
[opencl_profiling] spent  0.0426 seconds in markesteijn_interpolate_green
[opencl_profiling] spent  0.1651 seconds in markesteijn_solitary_green
[opencl_profiling] spent  0.0893 seconds in markesteijn_red_and_blue
[opencl_profiling] spent  0.0943 seconds in markesteijn_interpolate_twoxtwo
[opencl_profiling] spent  0.0639 seconds in markesteijn_recalculate_green
[opencl_profiling] spent  0.0534 seconds in markesteijn_convert_yuv
[opencl_profiling] spent  0.0542 seconds in markesteijn_differentiate
[opencl_profiling] spent  0.0193 seconds in markesteijn_homo_threshold
[opencl_profiling] spent  0.0333 seconds in markesteijn_homo_set
[opencl_profiling] spent  0.0381 seconds in markesteijn_homo_sum
[opencl_profiling] spent  0.0076 seconds in markesteijn_homo_max
[opencl_profiling] spent  0.0008 seconds in markesteijn_homo_max_corr
[opencl_profiling] spent  0.0046 seconds in markesteijn_homo_quench
[opencl_profiling] spent  0.0031 seconds in markesteijn_zero
[opencl_profiling] spent  0.0842 seconds in markesteijn_accu
[opencl_profiling] spent  0.0064 seconds in markesteijn_final
[opencl_profiling] spent  0.0143 seconds in [Copy Image (on device)]
[opencl_profiling] spent  0.0014 seconds in vng_lin_interpolate
[opencl_profiling] spent  0.0054 seconds in vng_interpolate
[opencl_profiling] spent  0.0008 seconds in vng_border_interpolate
[opencl_profiling] spent  0.0414 seconds in color_smoothing
[opencl_profiling] spent  0.0073 seconds in basecurve
[opencl_profiling] spent  0.0072 seconds in colorin_unbound
[opencl_profiling] spent  1.0972 seconds totally in command queue (with
0 events missing)
[dev_process_export] pixel pipeline processing took 2.256 secs (10.020 CPU)
[export_job] exported to
`/home/marc/Downloads/darktable_exported/2015_12/2015-12-31_20_04_13_XE025304-1.jpg'
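
The tile origins and sizes in the `default_process_tiling_cl_ptp` lines above follow a simple pattern: each tile starts one "tile width minus twice the overlap" after the previous one, and the last tile takes whatever is left. The sketch below reproduces the logged numbers under that assumption. It is inferred from the log output only, not taken from darktable's tiling code, and the function name `tiles_along_axis` is made up for illustration. It does not cover the `default_process_tiling_cl_roi` (lens) lines, which report no overlap value.

```python
def tiles_along_axis(full, max_dim, overlap):
    """Return (origin, size) pairs along one image axis.

    Assumption (inferred from the log, not from darktable source):
    consecutive tiles are spaced max_dim - 2 * overlap pixels apart.
    """
    step = max_dim - 2 * overlap
    if step <= 0:
        raise ValueError("overlap too large for this tile size")
    tiles = []
    origin = 0
    while True:
        # The last tile is clipped to the image border.
        size = min(max_dim, full - origin)
        tiles.append((origin, size))
        if origin + size >= full:
            return tiles
        origin += step


# 1 pass run: (3 x 1) tiles, max dimensions 2184 x 3296, overlap 12
print(tiles_along_axis(4936, 2184, 12))
# 3 pass run: (2 x 2) tiles, max dimensions 2616 x 1746, overlap 12
print(tiles_along_axis(4936, 2616, 12))
print(tiles_along_axis(3296, 1746, 12))
```

With the values from the log this gives origins 0, 2160 and 4320 for the 1 pass run, and 0 / 2592 horizontally and 0 / 1722 vertically for the 3 pass run, matching the tile lines above.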

==================
Without OpenCL:

1 pass:

[dev] took 0.000 secs (0.000 CPU) to load the image.
[export] creating pixelpipe took 0.009 secs (0.013 CPU)
[pixelpipe_process] [export] using device -1
[dev_pixelpipe] took 0.000 secs (-0.000 CPU) initing base buffer [export]
[dev_pixelpipe] took 0.006 secs (0.037 CPU) processed `raw black/white
point' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.013 secs (0.083 CPU) processed `white balance' on
CPU, blended on CPU [export]
[dev_pixelpipe] took 0.006 secs (0.033 CPU) processed `highlight
reconstruction' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.425 secs (3.253 CPU) processed `demosaic' on CPU,
blended on CPU [export]
[dev_pixelpipe] took 0.035 secs (0.073 CPU) processed `lens correction'
on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.038 secs (0.217 CPU) processed `base curve' on
CPU, blended on CPU [export]
[dev_pixelpipe] took 0.037 secs (0.247 CPU) processed `input color
profile' on CPU, blended on CPU [export]
[dev_pixelpipe] took 1.185 secs (8.370 CPU) processed `output color
profile' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.018 secs (0.127 CPU) processed `gamma' on CPU,
blended on CPU [export]
[dev_process_export] pixel pipeline processing took 1.763 secs (12.440 CPU)
[export_job] exported to
`/home/marc/Downloads/darktable_exported/2015_12/2015-12-31_20_04_13_XE025304-1_02.jpg'

3 pass:

[export] creating pixelpipe took 0.009 secs (0.007 CPU)
[pixelpipe_process] [export] using device -1
[dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
[dev_pixelpipe] took 0.006 secs (0.023 CPU) processed `raw black/white
point' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.017 secs (0.080 CPU) processed `white balance' on
CPU, blended on CPU [export]
[dev_pixelpipe] took 0.006 secs (0.033 CPU) processed `highlight
reconstruction' on CPU, blended on CPU [export]
[dev_pixelpipe] took 2.028 secs (13.267 CPU) processed `demosaic' on CPU
with tiling, blended on CPU [export]
[dev_pixelpipe] took 0.031 secs (0.023 CPU) processed `lens correction'
on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.033 secs (0.217 CPU) processed `base curve' on
CPU, blended on CPU [export]
[dev_pixelpipe] took 0.033 secs (0.213 CPU) processed `input color
profile' on CPU, blended on CPU [export]
[dev_pixelpipe] took 1.102 secs (8.213 CPU) processed `output color
profile' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.022 secs (0.120 CPU) processed `gamma' on CPU,
blended on CPU [export]
[dev_process_export] pixel pipeline processing took 3.277 secs (22.190 CPU)
[export_job] exported to
`/home/marc/Downloads/darktable_exported/2015_12/2015-12-31_20_04_13_XE025304-1.jpg'

==================
For reference, with OpenCL, with darktable 2.0, 3 pass, 5 times color
smoothing:

[export] creating pixelpipe took 0.009 secs (0.017 CPU)
[pixelpipe_process] [export] using device 0
[dev_pixelpipe] took 0.000 secs (-0.000 CPU) initing base buffer [export]
[dev_pixelpipe] took 0.007 secs (0.010 CPU) processed `raw black/white
point' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0.028 secs (0.113 CPU) processed `white balance' on
CPU, blended on CPU [export]
[dev_pixelpipe] took 0.005 secs (0.030 CPU) processed `highlight
reconstruction' on CPU, blended on CPU [export]
[dev_pixelpipe] took 2.082 secs (13.793 CPU) processed `demosaic' on
CPU, blended on CPU [export]
[default_process_tiling_cl_roi] use tiling on module 'lens' for image
with full input size 4936 x 3296
[default_process_tiling_cl_roi] (2 x 1) tiles with max input dimensions
3085 x 3296
[default_process_tiling_cl_roi] tile (0, 0) with 2476 x 3296 at origin
[0, 0]
[default_process_tiling_cl_roi] tile (1, 0) with 2476 x 3296 at origin
[2460, 0]
[dev_pixelpipe] took 0.080 secs (0.090 CPU) processed `lens correction'
on GPU with tiling, blended on CPU [export]
[dev_pixelpipe] took 0.039 secs (0.050 CPU) processed `base curve' on
GPU, blended on GPU [export]
[dev_pixelpipe] took 0.010 secs (-0.000 CPU) processed `input color
profile' on GPU, blended on GPU [export]
[dev_pixelpipe] took 1.152 secs (8.230 CPU) processed `output color
profile' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0.023 secs (0.117 CPU) processed `gamma' on CPU,
blended on CPU [export]
[opencl_profiling] spent  0.0684 seconds in [Write Image (from host to
device)]
[opencl_profiling] spent  0.0017 seconds in rawprepare_1f
[opencl_profiling] spent  0.0786 seconds in [Read Image (from device to
host)]
[opencl_profiling] spent  0.0065 seconds in [Copy Image (on device)]
[opencl_profiling] spent  0.0074 seconds in basecurve
[opencl_profiling] spent  0.0079 seconds in colorin_unbound
[opencl_profiling] spent  0.1705 seconds totally in command queue (with
0 events missing)
[dev_process_export] pixel pipeline processing took 3.427 secs (22.433 CPU)
[export_job] exported to
`/home/marc/Downloads/darktable_exported/2015_12/2015-12-31_20_04_13_XE025304-1.jpg'
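
For a quick side-by-side of the 3 pass export runs above, something like the following Python sketch could compute the ratios. The timings are copied from the `demosaic` and `[dev_process_export]` lines in the logs; the dictionary labels and the helper name `speedup` are made up for illustration. Note that the CPU-only run does not state its color smoothing setting, so the comparison is only rough.

```python
# Wall-clock seconds copied from the logs above (3 pass demosaic, export).
# The dictionary labels are illustrative, not darktable terminology.
runs = {
    "HEAD, OpenCL":   {"demosaic": 0.954, "total": 2.256},
    "HEAD, CPU only": {"demosaic": 2.028, "total": 3.277},
    "2.0, OpenCL":    {"demosaic": 2.082, "total": 3.427},
}


def speedup(baseline, candidate):
    """Return how many times faster `candidate` is than `baseline`."""
    return baseline / candidate


reference = runs["HEAD, OpenCL"]
for label, timings in runs.items():
    for stage in ("demosaic", "total"):
        ratio = speedup(timings[stage], reference[stage])
        print(f"{label:15s} {stage:8s} {timings[stage]:.3f} s  "
              f"(HEAD with OpenCL is {ratio:.2f}x as fast)")
```

In the 2.0 log, demosaic was processed on the CPU even though OpenCL was in use, which is where most of the difference comes from. In every run the `output color profile' step stays on the CPU at a bit over one second, so it now accounts for the largest share of the export time.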


The GUI also feels more responsive in darkroom: around half a second to
render with a few more elements in the stack.

[pixelpipe_process] [full] using device 0
[dev_pixelpipe] took 0.001 secs (0.000 CPU) initing base buffer [full]
[dev_pixelpipe] took 0.002 secs (0.003 CPU) processed `raw black/white
point' on GPU, blended on GPU [full]
[dev_pixelpipe] took 0.005 secs (0.017 CPU) processed `white balance' on
CPU, blended on CPU [full]
[dev_pixelpipe] took 0.136 secs (0.103 CPU) processed `demosaic' on GPU,
blended on GPU [full]
[dev_pixelpipe] took 0.039 secs (0.030 CPU) processed `denoise
(profiled)' on GPU, blended on GPU [full]
[dev_pixelpipe] took 0.002 secs (0.000 CPU) processed `exposure' on GPU,
blended on GPU [full]
[dev_pixelpipe] took 0.002 secs (0.000 CPU) processed `lens correction'
on GPU, blended on GPU [full]
[dev_pixelpipe] took 0.002 secs (0.003 CPU) processed `base curve' on
GPU, blended on GPU [full]
[dev_pixelpipe] took 0.003 secs (-0.000 CPU) processed `input color
profile' on GPU, blended on GPU [full]
[dev_pixelpipe] took 0.187 secs (1.113 CPU) processed `defringe' on CPU,
blended on CPU [full]
[dev_pixelpipe] took 0.005 secs (0.003 CPU) processed `lowlight vision'
on GPU, blended on GPU [full]
[dev_pixelpipe] took 0.003 secs (0.003 CPU) processed `tone curve' on
GPU, blended on GPU [full]
[dev_pixelpipe] took 0.005 secs (0.003 CPU) processed `sharpen' on GPU,
blended on GPU [full]
[dev_pixelpipe] took 0.110 secs (0.680 CPU) processed `output color
profile' on CPU, blended on CPU [full]
[dev_pixelpipe] took 0.004 secs (0.013 CPU) processed `gamma' on CPU,
blended on CPU [full]
[opencl_profiling] spent  0.0058 seconds in [Write Image (from host to
device)]
[opencl_profiling] spent  0.0002 seconds in rawprepare_1f
[opencl_profiling] spent  0.0086 seconds in [Read Image (from device to
host)]
[opencl_profiling] spent  0.0008 seconds in markesteijn_initial_copy
[opencl_profiling] spent  0.0086 seconds in [Copy Buffer to Buffer (on
device)]
[opencl_profiling] spent  0.0017 seconds in markesteijn_green_minmax
[opencl_profiling] spent  0.0057 seconds in markesteijn_interpolate_green
[opencl_profiling] spent  0.0238 seconds in markesteijn_solitary_green
[opencl_profiling] spent  0.0118 seconds in markesteijn_red_and_blue
[opencl_profiling] spent  0.0128 seconds in markesteijn_interpolate_twoxtwo
[opencl_profiling] spent  0.0086 seconds in markesteijn_recalculate_green
[opencl_profiling] spent  0.0079 seconds in markesteijn_convert_yuv
[opencl_profiling] spent  0.0076 seconds in markesteijn_differentiate
[opencl_profiling] spent  0.0025 seconds in markesteijn_homo_threshold
[opencl_profiling] spent  0.0044 seconds in markesteijn_homo_set
[opencl_profiling] spent  0.0046 seconds in markesteijn_homo_sum
[opencl_profiling] spent  0.0009 seconds in markesteijn_homo_max
[opencl_profiling] spent  0.0001 seconds in markesteijn_homo_max_corr
[opencl_profiling] spent  0.0005 seconds in markesteijn_homo_quench
[opencl_profiling] spent  0.0004 seconds in markesteijn_zero
[opencl_profiling] spent  0.0114 seconds in markesteijn_accu
[opencl_profiling] spent  0.0009 seconds in markesteijn_final
[opencl_profiling] spent  0.0018 seconds in [Copy Image (on device)]
[opencl_profiling] spent  0.0003 seconds in vng_lin_interpolate
[opencl_profiling] spent  0.0010 seconds in vng_interpolate
[opencl_profiling] spent  0.0002 seconds in vng_border_interpolate
[opencl_profiling] spent  0.0049 seconds in color_smoothing
[opencl_profiling] spent  0.0069 seconds in interpolation_resample
[opencl_profiling] spent  0.0008 seconds in denoiseprofile_precondition
[opencl_profiling] spent  0.0241 seconds in denoiseprofile_decompose
[opencl_profiling] spent  0.0026 seconds in denoiseprofile_reduce_first
[opencl_profiling] spent  0.0000 seconds in denoiseprofile_reduce_second
[opencl_profiling] spent  0.0000 seconds in [Read Buffer (from device to
host)]
[opencl_profiling] spent  0.0070 seconds in denoiseprofile_synthesize
[opencl_profiling] spent  0.0008 seconds in denoiseprofile_backtransform
[opencl_profiling] spent  0.0001 seconds in blendop_set_mask
[opencl_profiling] spent  0.0014 seconds in blendop_rgb
[opencl_profiling] spent  0.0008 seconds in exposure
[opencl_profiling] spent  0.0008 seconds in basecurve
[opencl_profiling] spent  0.0009 seconds in colorin_unbound
[opencl_profiling] spent  0.0009 seconds in lowlight
[opencl_profiling] spent  0.0013 seconds in tonecurve
[opencl_profiling] spent  0.0010 seconds in sharpen_hblur
[opencl_profiling] spent  0.0011 seconds in sharpen_vblur
[opencl_profiling] spent  0.0013 seconds in sharpen_mix
[opencl_profiling] spent  0.1896 seconds totally in command queue (with
0 events missing)
[dev_process_image] pixel pipeline processing took 0.506 secs (1.973 CPU)
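
To see at a glance which modules take the most time in output like the above, the `[dev_pixelpipe]` lines can be pulled out with a short script. This is only a sketch: it assumes the log was hard-wrapped by the mail client as it is here, so it first joins continuation lines back onto the record they belong to. The names `unwrap` and `module_timings` are made up for illustration, and the sample is three records copied from the darkroom log above.

```python
import re

# A new log record starts with a tag such as "[dev_pixelpipe]".
# Wrapped continuations like "[0, 0]" or "device)]" do not match.
RECORD_START = re.compile(r"^\[[a-z_]+\]")
MODULE_LINE = re.compile(
    r"\[dev_pixelpipe\] took (\d+\.\d+) secs \((-?\d+\.\d+) CPU\) "
    r"processed `([^']+)' on (GPU|CPU)"
)


def unwrap(lines):
    """Join hard-wrapped continuation lines onto the preceding record."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            # Anything else (including stray prose) is appended to the
            # previous record; the module regex simply ignores it.
            records[-1] += " " + line
    return records


def module_timings(text):
    """Return (module, wall_secs, cpu_secs, device) for each module line."""
    out = []
    for record in unwrap(text.splitlines()):
        match = MODULE_LINE.search(record)
        if match:
            wall, cpu, name, device = match.groups()
            out.append((name, float(wall), float(cpu), device))
    return out


sample = """\
[dev_pixelpipe] took 0.136 secs (0.103 CPU) processed `demosaic' on GPU,
blended on GPU [full]
[dev_pixelpipe] took 0.187 secs (1.113 CPU) processed `defringe' on CPU,
blended on CPU [full]
[dev_pixelpipe] took 0.110 secs (0.680 CPU) processed `output color
profile' on CPU, blended on CPU [full]
"""

slowest = sorted(module_timings(sample), key=lambda t: t[1], reverse=True)
for name, wall, cpu, device in slowest:
    print(f"{name:22s} {wall:.3f} s on {device}")
```

Run over the whole darkroom log, this would put `defringe' (CPU), `demosaic' (GPU) and `output color profile' (CPU) at the top, which together make up most of the 0.506 secs reported.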


Regards