On Wed, Feb 12, 2025 at 05:08:30PM +0530, Tejas Upadhyay wrote:
Allow user to provide a low latency hint. When set, KMD sends a hint
to GuC which results in special handling for that process. SLPC will
ramp the GT frequency aggressively every time it switches to this
process.
We need to enable the use of SLPC Compute strategy during init, but
it will apply only to processes that set this bit during process
creation.
Improvement with this approach is shown below:
Before,
:~$ NEOReadDebugKeys=1 EnableDirectSubmission=0 clpeak --kernel-latency
Platform: Intel(R) OpenCL Graphics
Device: Intel(R) Graphics [0xe20b]
Driver version : 24.52.0 (Linux x64)
Compute units : 160
Clock frequency : 2850 MHz
Kernel launch latency : 283.16 us
After,
:~$ NEOReadDebugKeys=1 EnableDirectSubmission=0 clpeak --kernel-latency
Platform: Intel(R) OpenCL Graphics
Device: Intel(R) Graphics [0xe20b]
Driver version : 24.52.0 (Linux x64)
Compute units : 160
Clock frequency : 2850 MHz
Kernel launch latency : 63.38 us
UMD Compute PR : https://github.com/intel/compute-runtime/pull/794
UMD Mesa PR : https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33214
v9(Vinay):
- remove extra line, align commit message
v8(Vinay):
- Add separate example for using low latency hint
v7(Jose):
- Update UMD PR
- applicable to all gpus
V6:
- init flags, remove redundant flags check (MAuld)
V5:
- Move uapi doc to documentation and GuC ABI specific change (Rodrigo)
hmn... that doesn't look right.
...
diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
index b75cc9a70d1f..7337d1be45ef 100644
--- a/Documentation/gpu/drm-uapi.rst
+++ b/Documentation/gpu/drm-uapi.rst
@@ -583,3 +583,21 @@ dma-buf interoperability
Please see Documentation/userspace-api/dma-buf-alloc-exchange.rst for
information on how dma-buf is integrated and exposed within DRM.
+
+Low latency hint by user
+========================
+
+Allow users to provide a hint to kernel for cases demanding low latency
+profile. Please note it will have impact on power consumption. User can
+indicate low latency hint with flag while creating exec queue as
+mentioned below,
+
+ struct drm_xe_exec_queue_create exec_queue_create = {
+ .flags = DRM_XE_EXEC_QUEUE_LOW_LATENCY_HINT,
+ .extensions = 0,
+ .vm_id = vm,
+ .num_bb_per_exec = 1,
+ .num_eng_per_bb = 1,
+ .instances = to_user_pointer(&instance),
+ };
+ ioctl(fd, DRM_IOCTL_XE_EXEC_QUEUE_CREATE, &exec_queue_create);
how does this driver-specific doc make sense in this file?
Lucas De Marchi