Hi Bjoern,

hm, clinfo reports a global memory size of 5.78 GiB and a max allocation size of 2.89 GiB. I had a look at darktable's OpenCL code: the opencl_memory_requirement value from darktablerc is compared against CL_DEVICE_GLOBAL_MEM_SIZE. So in your case the check should come out as 512 MB < 5.78 GiB, and the device should be used.
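To make that comparison concrete, here is a minimal Python sketch of the check as described above. It is not darktable's actual code: the function name, the MiB conversion factor and the exact form of the comparison are assumptions made for illustration. Only the two input values come from this thread (the "Global memory size" line in the clinfo output and the 512 MB requirement).

```python
# Assumption: opencl_memory_requirement is counted in units of 1024 * 1024 bytes.
MIB = 1024 * 1024


def device_passes_memory_check(global_mem_bytes: int, required_mb: int) -> bool:
    """Hypothetical stand-in for the comparison of opencl_memory_requirement
    against CL_DEVICE_GLOBAL_MEM_SIZE (which OpenCL reports in bytes)."""
    # Assumed rule: the device is discarded only if it reports less than required.
    return global_mem_bytes >= required_mb * MIB


GLOBAL_MEM_BYTES = 6206062592  # "Global memory size" from the clinfo output
REQUIRED_MB = 512              # opencl_memory_requirement discussed in this thread

print(device_passes_memory_check(GLOBAL_MEM_BYTES, REQUIRED_MB))
print(GLOBAL_MEM_BYTES // MIB, "MiB reported")
```

With the figures from clinfo the device clears the threshold by a wide margin, so under these assumptions a rejection would only be expected if the value darktable receives at run time differs from what clinfo prints.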
The relevant error message should be "[opencl_init] discarding device $DEVICE due to insufficient global memory ($GLOBAL_MEM MB)". Do you see that error message when running darktable with "-d opencl", and if so, what does it say?

I would understand it if the device/driver were blacklisted, or if the system didn't have enough memory, but I'm curious as to why it works when you set opencl_memory_requirement to slightly less than 512 MB. It feels like a rather arbitrary limit.

cheers,
Simon

On 08.03.19 at 16:22, Björn Sozumschein wrote:
> Hi Simon,
>
> thank you for the information!
>
> I'm running Arch Linux by the way. The output of clinfo is:
> """
> Number of platforms                               1
>   Platform Name                                   Intel(R) OpenCL HD Graphics
>   Platform Vendor                                 Intel(R) Corporation
>   Platform Version                                OpenCL 2.1
>   Platform Profile                                FULL_PROFILE
>   Platform Extensions                             cl_khr_3d_image_writes
>     cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_depth_images
>     cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics
>     cl_khr_icd cl_khr_image2d_from_buffer cl_khr_local_int32_base_atomics
>     cl_khr_local_int32_extended_atomics cl_intel_subgroups
>     cl_intel_required_subgroup_size cl_intel_subgroups_short cl_khr_spir
>     cl_intel_accelerator cl_intel_media_block_io cl_intel_driver_diagnostics
>     cl_intel_device_side_avc_motion_estimation cl_khr_priority_hints
>     cl_khr_throttle_hints cl_khr_create_command_queue cl_khr_fp64
>     cl_khr_subgroups cl_khr_il_program
>     cl_intel_spirv_device_side_avc_motion_estimation
>     cl_intel_spirv_media_block_io cl_intel_spirv_subgroups
>     cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_intel_planar_yuv
>     cl_intel_packed_yuv cl_intel_motion_estimation
>     cl_intel_advanced_motion_estimation cl_intel_va_api_media_sharing
>   Platform Host timer resolution                  1ns
>   Platform Extensions function suffix             INTEL
>
>   Platform Name                                   Intel(R) OpenCL HD Graphics
> Number of devices                                 1
>   Device Name                                     Intel(R) Gen9 HD Graphics NEO
>   Device Vendor                                   Intel(R) Corporation
>   Device Vendor ID                                0x8086
>   Device Version                                  OpenCL 2.1 NEO
>   Driver Version                                  19.02.12143
>   Device OpenCL C Version                         OpenCL C 2.0
>   Device Type                                     GPU
>   Device Profile                                  FULL_PROFILE
>   Device Available                                Yes
>   Compiler Available                              Yes
>   Linker Available                                Yes
>   Max compute units                               24
>   Max clock frequency                             1050MHz
>   Device Partition                                (core)
>     Max number of sub-devices                     0
>     Supported partition types                     None
>     Supported affinity domains                    (n/a)
>   Max work item dimensions                        3
>   Max work item sizes                             256x256x256
>   Max work group size                             256
>   Preferred work group size multiple              32
>   Max sub-groups per work group                   32
>   Sub-group sizes (Intel)                         8, 16, 32
>   Preferred / native vector sizes
>     char                                          16 / 16
>     short                                         8 / 8
>     int                                           4 / 4
>     long                                          1 / 1
>     half                                          8 / 8 (cl_khr_fp16)
>     float                                         1 / 1
>     double                                        1 / 1 (cl_khr_fp64)
>   Half-precision Floating-point support           (cl_khr_fp16)
>     Denormals                                     Yes
>     Infinity and NANs                             Yes
>     Round to nearest                              Yes
>     Round to zero                                 Yes
>     Round to infinity                             Yes
>     IEEE754-2008 fused multiply-add               Yes
>     Support is emulated in software               No
>   Single-precision Floating-point support         (core)
>     Denormals                                     Yes
>     Infinity and NANs                             Yes
>     Round to nearest                              Yes
>     Round to zero                                 Yes
>     Round to infinity                             Yes
>     IEEE754-2008 fused multiply-add               Yes
>     Support is emulated in software               No
>     Correctly-rounded divide and sqrt operations  Yes
>   Double-precision Floating-point support         (cl_khr_fp64)
>     Denormals                                     Yes
>     Infinity and NANs                             Yes
>     Round to nearest                              Yes
>     Round to zero                                 Yes
>     Round to infinity                             Yes
>     IEEE754-2008 fused multiply-add               Yes
>     Support is emulated in software               No
>   Address bits                                    64, Little-Endian
>   Global memory size                              6206062592 (5.78GiB)
>   Error Correction support                        No
>   Max memory allocation                           3103031296 (2.89GiB)
>   Unified memory for Host and Device              Yes
>   Shared Virtual Memory (SVM) capabilities        (core)
>     Coarse-grained buffer sharing                 Yes
>     Fine-grained buffer sharing                   No
>     Fine-grained system sharing                   No
>     Atomics                                       No
>   Minimum alignment for any data type             128 bytes
>   Alignment of base address                       1024 bits (128 bytes)
>   Preferred alignment for atomics
>     SVM                                           64 bytes
>     Global                                        64 bytes
>     Local                                         64 bytes
>   Max size for global variable                    65536 (64KiB)
>   Preferred total size of global vars             3103031296 (2.89GiB)
>   Global Memory cache type                        Read/Write
>   Global Memory cache size                        524288 (512KiB)
>   Global Memory cache line size                   64 bytes
>   Image support                                   Yes
>     Max number of samplers per kernel             16
>     Max size for 1D images from buffer            193939456 pixels
>     Max 1D or 2D image array size                 2048 images
>     Base address alignment for 2D image buffers   4 bytes
>     Pitch alignment for 2D image buffers          4 pixels
>     Max 2D image size                             16384x16384 pixels
>     Max planar YUV image size                     16384x16380 pixels
>     Max 3D image size                             16384x16384x2048 pixels
>     Max number of read image args                 128
>     Max number of write image args                128
>     Max number of read/write image args           128
>   Max number of pipe args                         16
>   Max active pipe reservations                    1
>   Max pipe packet size                            1024
>   Local memory type                               Local
>   Local memory size                               65536 (64KiB)
>   Max number of constant args                     8
>   Max constant buffer size                        3103031296 (2.89GiB)
>   Max size of kernel argument                     1024
>   Queue properties (on host)
>     Out-of-order execution                        Yes
>     Profiling                                     Yes
>   Queue properties (on device)
>     Out-of-order execution                        Yes
>     Profiling                                     Yes
>     Preferred size                                131072 (128KiB)
>     Max size                                      67108864 (64MiB)
>   Max queues on device                            1
>   Max events on device                            1024
>   Prefer user sync for interop                    Yes
>   Profiling timer resolution                      83ns
>   Execution capabilities
>     Run OpenCL kernels                            Yes
>     Run native kernels                            No
>     Sub-group independent forward progress        Yes
>     IL version                                    SPIR-V_1.0
>     SPIR versions                                 1.2
>   printf() buffer size                            4194304 (4MiB)
>   Built-in kernels
>     block_motion_estimate_intel;block_advanced_motion_estimate_check_intel;block_advanced_motion_estimate_bidirectional_check_intel;
>   Motion Estimation accelerator version (Intel)   2
>   Device-side AVC Motion Estimation version       1
>   Supports texture sampler use                    Yes
>   Supports preemption                             No
>   Device Extensions                               cl_khr_3d_image_writes
>     cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_depth_images
>     cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics
>     cl_khr_icd cl_khr_image2d_from_buffer cl_khr_local_int32_base_atomics
>     cl_khr_local_int32_extended_atomics cl_intel_subgroups
>     cl_intel_required_subgroup_size cl_intel_subgroups_short cl_khr_spir
>     cl_intel_accelerator cl_intel_media_block_io cl_intel_driver_diagnostics
>     cl_intel_device_side_avc_motion_estimation cl_khr_priority_hints
>     cl_khr_throttle_hints cl_khr_create_command_queue cl_khr_fp64
>     cl_khr_subgroups cl_khr_il_program
>     cl_intel_spirv_device_side_avc_motion_estimation
>     cl_intel_spirv_media_block_io cl_intel_spirv_subgroups
>     cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_intel_planar_yuv
>     cl_intel_packed_yuv cl_intel_motion_estimation
>     cl_intel_advanced_motion_estimation cl_intel_va_api_media_sharing
>
> NULL platform behavior
>   clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  Intel(R) OpenCL HD Graphics
>   clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [INTEL]
>   clCreateContext(NULL, ...) [default]            Success [INTEL]
>   clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
>     Platform Name                                 Intel(R) OpenCL HD Graphics
>     Device Name                                   Intel(R) Gen9 HD Graphics NEO
>   clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
>   clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
>     Platform Name                                 Intel(R) OpenCL HD Graphics
>     Device Name                                   Intel(R) Gen9 HD Graphics NEO
>   clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
>   clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
>   clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
>     Platform Name                                 Intel(R) OpenCL HD Graphics
>     Device Name                                   Intel(R) Gen9 HD Graphics NEO
>
> ICD loader properties
>   ICD loader Name                                 OpenCL ICD Loader
>   ICD loader Vendor                               OCL Icd free software
>   ICD loader Version                              2.2.12
>   ICD loader Profile                              OpenCL 2.2
> """
>
> Best,
> Bjoern
>
> On Fri, 8 March 2019 at 13:12, Sturm Flut
> <sturmf...@lieberbiber.de <mailto:sturmf...@lieberbiber.de>> wrote:
>
>     Hi,
>
>     On 08.03.19 at 09:47, Björn Sozumschein wrote:
>     > However, as far as I understand, although 512 MB VRAM is reported, the
>     > integrated graphics may allocate more shared memory
>
>     Yes and No. On one hand it depends on the chip generation, operating
>     system, driver and system configuration. Most of the current generation
>     Intel GPUs have a hard limit at either 32 GB or half the amount of RAM
>     installed in the system, whichever is lower (see [1]). On the other hand
>     the operating system must have free RAM left when the GPU driver wants
>     to allocate some more.
>
>     Could you maybe post the output of the clinfo command (Linux
>     distributions have it in their repositories, Windows version here [2] at
>     the bottom)? On my ASUS notebook with an Intel Graphics 620 and 16 GB of
>     RAM it reports ~6 GB for "Global memory size" and 2 GB for "Max memory
>     allocation" when the system is idle.
>
>     I see that most ThinkPad X1 Carbon with the Intel Graphics 520 seem to
>     have shipped with 8 or 16 GB of RAM. Since the Intel OpenCL driver
>     really seems to report slightly less than 512 MB in your case, there is
>     probably a reason for that.
>
>     kind regards,
>     Simon
>
>
>     [1] https://www.intel.com/content/www/us/en/support/articles/000020962/graphics-drivers.html
>
>     [2] https://github.com/Oblomov/clinfo
>
> ___________________________________________________________________________
> darktable developer mailing list to unsubscribe send a mail to
> darktable-dev+unsubscr...@lists.darktable.org