On 22.06.2017 at 20:23, Ulrich Pegelow wrote:
OK, that sounds good. There should not be any OpenCL code dealing with non-floating point values at that place. But let me double-check that.


Reply to myself. This is what I get for the preview of an xtrans image:

[pixelpipe_process] [preview] using device 0
[dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [preview]
[dev_pixelpipe] took 0.002 secs (0.002 CPU) processed `raw black/white point' on GPU, blended on GPU [preview]
[dev_pixelpipe] took 0.001 secs (0.001 CPU) processed `white balance' on GPU, blended on GPU [preview]
[dev_pixelpipe] took 0.000 secs (0.000 CPU) processed `highlight reconstruction' on GPU, blended on GPU [preview]
[dev_pixelpipe] took 0.022 secs (0.018 CPU) processed `demosaic' on GPU, blended on GPU [preview]
[dev_pixelpipe] took 0.002 secs (0.002 CPU) processed `base curve' on GPU, blended on GPU [preview]
[dev_pixelpipe] took 0.002 secs (0.002 CPU) processed `input color profile' on GPU, blended on GPU [preview]
[dev_pixelpipe] took 0.002 secs (0.000 CPU) processed `sharpen' on GPU, blended on GPU [preview]
[dev_pixelpipe] took 0.005 secs (0.000 CPU) processed `highpass' on GPU, blended on GPU [preview]
[dev_pixelpipe] took 0.002 secs (0.000 CPU) processed `output color profile' on GPU, blended on GPU [preview]
[dev_pixelpipe] took 0.008 secs (0.015 CPU) processed `gamma' on CPU, blended on CPU [preview]
[opencl_profiling] profiling device 0 ('GeForce GTX 1060 6GB'):
[opencl_profiling] spent 0.0004 seconds in [Write Image (from host to device)]
[opencl_profiling] spent  0.0001 seconds in rawprepare_1f
[opencl_profiling] spent  0.0001 seconds in whitebalance_1f_xtrans
[opencl_profiling] spent  0.0001 seconds in highlights_1f_clip
[opencl_profiling] spent  0.0002 seconds in markesteijn_initial_copy
[opencl_profiling] spent 0.0008 seconds in [Copy Buffer to Buffer (on device)]
[opencl_profiling] spent  0.0005 seconds in markesteijn_green_minmax
[opencl_profiling] spent  0.0012 seconds in markesteijn_interpolate_green
[opencl_profiling] spent  0.0024 seconds in markesteijn_solitary_green
[opencl_profiling] spent  0.0012 seconds in markesteijn_red_and_blue
[opencl_profiling] spent  0.0006 seconds in markesteijn_interpolate_twoxtwo
[opencl_profiling] spent  0.0011 seconds in markesteijn_convert_yuv
[opencl_profiling] spent  0.0011 seconds in markesteijn_differentiate
[opencl_profiling] spent  0.0003 seconds in markesteijn_homo_threshold
[opencl_profiling] spent  0.0007 seconds in markesteijn_homo_set
[opencl_profiling] spent  0.0007 seconds in markesteijn_homo_sum
[opencl_profiling] spent  0.0002 seconds in markesteijn_homo_max
[opencl_profiling] spent  0.0000 seconds in markesteijn_homo_max_corr
[opencl_profiling] spent  0.0001 seconds in markesteijn_zero
[opencl_profiling] spent  0.0015 seconds in markesteijn_accu
[opencl_profiling] spent  0.0006 seconds in [Copy Image (on device)]
[opencl_profiling] spent  0.0003 seconds in markesteijn_final
[opencl_profiling] spent  0.0002 seconds in vng_border_interpolate
[opencl_profiling] spent  0.0002 seconds in vng_lin_interpolate
[opencl_profiling] spent  0.0005 seconds in vng_interpolate
[opencl_profiling] spent  0.0003 seconds in basecurve_lut
[opencl_profiling] spent  0.0006 seconds in colorin_clipping
[opencl_profiling] spent  0.0003 seconds in sharpen_hblur
[opencl_profiling] spent  0.0003 seconds in sharpen_vblur
[opencl_profiling] spent  0.0004 seconds in sharpen_mix
[opencl_profiling] spent  0.0003 seconds in highpass_invert
[opencl_profiling] spent  0.0009 seconds in highpass_hblur
[opencl_profiling] spent  0.0009 seconds in highpass_vblur
[opencl_profiling] spent  0.0004 seconds in highpass_mix
[opencl_profiling] spent  0.0000 seconds in blendop_set_mask
[opencl_profiling] spent  0.0004 seconds in blendop_Lab
[opencl_profiling] spent  0.0006 seconds in colorout
[opencl_profiling] spent 0.0046 seconds in [Read Image (from device to host)]
[opencl_profiling] spent 0.0252 seconds totally in command queue (with 0 events missing)
[dev_process_preview] pixel pipeline processing took 0.078 secs (0.073 CPU)

This certainly indicates some room for improvement. Currently we run the full Markesteijn demosaic even for the thumbnail. That is not dramatic on this fast device, but we could optimize by falling back to VNG or even linear interpolation.
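As a rough cross-check, the per-kernel profiling lines above can be tallied by kernel-name prefix to see how much of the queue time goes to the Markesteijn kernels. The following is only a sketch: the helper name tally_by_prefix and the idea of grouping on the text before the first underscore are my own assumptions and not part of darktable, and the sample contains just three lines copied from the log.

```python
import re
from collections import defaultdict

# Matches per-kernel lines such as:
#   [opencl_profiling] spent  0.0012 seconds in markesteijn_interpolate_green
# Bracketed transfer entries and the "totally in command queue" line do not match.
LINE_RE = re.compile(r"\[opencl_profiling\] spent\s+([0-9.]+) seconds in (\S+)$")


def tally_by_prefix(log_text):
    """Sum kernel times, grouped by the kernel name up to the first underscore.

    Hypothetical helper for illustration; not a darktable function.
    """
    totals = defaultdict(float)
    for line in log_text.splitlines():
        match = LINE_RE.search(line.strip())
        if match is None:
            continue  # not a per-kernel profiling line
        seconds = float(match.group(1))
        kernel = match.group(2)
        totals[kernel.split("_", 1)[0]] += seconds
    return dict(totals)


# Three lines copied from the log above, as a small sample.
SAMPLE = """\
[opencl_profiling] spent  0.0012 seconds in markesteijn_interpolate_green
[opencl_profiling] spent  0.0024 seconds in markesteijn_solitary_green
[opencl_profiling] spent  0.0005 seconds in vng_interpolate
"""

if __name__ == "__main__":
    for prefix, seconds in sorted(tally_by_prefix(SAMPLE).items()):
        print(f"{prefix}: {seconds:.4f} s")
```

Adding up the markesteijn_* lines of the full log by hand gives roughly 0.012 of the 0.0252 seconds spent in the command queue, so about half of the GPU time for this preview goes into the demosaic step.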

That's an issue to be discussed further (but not here in this thread).

Ulrich
___________________________________________________________________________
darktable developer mailing list
to unsubscribe send a mail to darktable-dev+unsubscr...@lists.darktable.org