[PATCH v2] drm/i915/guc: Flush ct receive tasklet during reset preparation

2024-11-04 Thread Zhanjun Dong
sanitize and set ct->enable to false. This will cause a warning on incorrect ct->enable state. (https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12439) Add the missing tasklet flush to flush all 3 parts. Signed-off-by: Zhanjun Dong Reviewed-by: Alan Previn --- drivers/gpu/drm/i915

[PATCH v1] drm/i915/guc: Flush ct receive tasklet during reset preparation

2024-10-30 Thread Zhanjun Dong
all 3 parts. Signed-off-by: Zhanjun Dong Cc: John Harrison Cc: Daniele Ceraolo Spurio --- drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submis

[PATCH v1] FOR-CI: drm/i915/guc: Disable ct receive tasklet during reset preparation

2024-10-28 Thread Zhanjun Dong
intel_uc_reset_prepare already finished guc sanitize and set ct->enable to false. This will cause a warning on incorrect ct->enable state. Fix by disabling the ct receive tasklet during reset preparation to avoid the above race condition. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915

[PATCH v2 1/1] drm/i915/guc: Move destroy context at end of reset prepare

2024-10-21 Thread Zhanjun Dong
the context already been destroyed. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index

[PATCH v2 0/1] FOR-CI: drm/i915/guc: Move destroy context at end of reset prepare

2024-10-21 Thread Zhanjun Dong
the context already been destroyed. Signed-off-by: Zhanjun Dong Zhanjun Dong (1): drm/i915/guc: Move destroy context at end of reset prepare drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) -- 2.34.1

[PATCH v1 1/1] drm/i915/guc: Disable ct during GuC reset

2024-10-18 Thread Zhanjun Dong
During GuC reset prepare, interrupts are disabled before the hardware reset. Also disable ct to prevent unnecessary message processing. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 3 +++ drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 2 ++ 2 files changed, 5

[PATCH v1 0/1] FOR-CI: drm/i915/guc: Disable ct during GuC reset

2024-10-18 Thread Zhanjun Dong
During GuC reset prepare, interrupts are disabled before the hardware reset. Also disable ct to prevent unnecessary message processing. Signed-off-by: Zhanjun Dong Zhanjun Dong (1): drm/i915/guc: Disable ct during GuC reset drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 3 +++ drivers/gpu/drm

[PATCH] drm/xe/guc: Extract GuC error capture lists on G2H notification

2024-01-16 Thread Zhanjun Dong
ned-off-by: Zhanjun Dong --- drivers/gpu/drm/xe/abi/guc_actions_abi.h | 7 + drivers/gpu/drm/xe/xe_guc_capture.c | 572 +++ drivers/gpu/drm/xe/xe_guc_ct.c | 2 + drivers/gpu/drm/xe/xe_guc_submit.c | 22 +- drivers/gpu/drm/xe/xe_guc_submit.h | 3

[PATCH] drm/xe/guc: Add XE_LP steered register lists

2024-01-16 Thread Zhanjun Dong
Add the ability for runtime allocation and freeing of steered register list extensions that depend on the detected HW config fuses. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/xe/xe_guc_capture.c | 187 +++- 1 file changed, 185 insertions(+), 2 deletions(-) diff

[PATCH] drm/xe/guc: Check sizing of guc_capture output

2024-01-16 Thread Zhanjun Dong
Add capture output size check function to provide a reasonable minimum size for error capture region before allocating the shared buffer. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/xe/xe_guc_capture.c | 76 + 1 file changed, 76 insertions(+) diff --git a

[PATCH] drm/xe/guc: Expose dss per group for GuC error capture

2024-01-16 Thread Zhanjun Dong
Expose a helper for dss per group of mcr; the GuC error capture feature needs this info to prepare the required buffer. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/xe/xe_gt_mcr.c | 4 ++-- drivers/gpu/drm/xe/xe_gt_mcr.h | 1 + drivers/gpu/drm/xe/xe_gt_topology.c | 3

[PATCH] drm/xe/guc: Update GuC ADS size for error capture

2024-01-16 Thread Zhanjun Dong
every engine-class type on the current hardware. Ensure we allocate a persistent store for the register lists that are populated into ADS so that we don't need to allocate memory during GT resets when GuC is reloaded and ADS population happens again. Signed-off-by: Zhanjun Dong --- driver

[PATCH] drm/xe/guc: Add capture size check in GuC log buffer

2024-01-16 Thread Zhanjun Dong
The capture nodes are included in the GuC log buffer; add a size check for the capture region within the whole GuC log buffer. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/xe/xe_gt_printk.h | 3 + drivers/gpu/drm/xe/xe_guc_fwif.h | 48 +++ drivers/gpu/drm/xe/xe_guc_log.c | 179

[PATCH] drm/xe/guc: Pre-allocate output nodes for extraction

2024-01-16 Thread Zhanjun Dong
Pre-allocate a fixed number of empty nodes up front (at the time of ADS registration) that we can consume from or return to an internal cached list of nodes. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/xe/xe_guc_capture.c | 83 + 1 file changed, 83 insertions

[PATCH] drm/xe/guc: Plumb GuC-capture into dev coredump

2024-01-16 Thread Zhanjun Dong
. This is reserved for future. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/xe/xe_guc_capture.c | 99 ++- drivers/gpu/drm/xe/xe_guc_capture.h | 10 +++ drivers/gpu/drm/xe/xe_hw_engine.c | 73 - drivers/gpu/drm/xe/xe_hw_engine_types.h | 103

[PATCH] drm/xe/guc: Add register defines for GuC based register capture

2024-01-16 Thread Zhanjun Dong
Add register defines and a list of registers for GuC based error state capture. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/xe/Kconfig | 11 +++ drivers/gpu/drm/xe/Makefile | 1 + drivers/gpu/drm/xe/regs/xe_engine_regs.h | 12 +++ drivers/gpu/drm/xe/regs

[PATCH v2] drm/xe/guc: Add GuC based register capture for error capture

2024-01-16 Thread Zhanjun Dong
. Signed-off-by: Zhanjun Dong Zhanjun Dong (9): drm/xe/guc: Add register defines for GuC based register capture drm/xe/guc: Expose dss per group for GuC error capture drm/xe/guc: Update GuC ADS size for error capture drm/xe/guc: Add XE_LP steered register lists drm/xe/guc: Add capture size

[PATCH v3] drm/i915: Skip pxp init if gt is wedged

2023-11-13 Thread Zhanjun Dong
A wedged gt can be triggered by a missing guc firmware file, HW not working, etc. Once triggered, it means all gt usage is dead, so we can't enable pxp under this fatal error condition. v2: Updated commit message. v3: Updated return code check. Signed-off-by: Zhanjun Dong --- dr

[PATCH] drm/i915: Skip pxp init if gt is wedged

2023-11-01 Thread Zhanjun Dong
A wedged gt can be triggered by a missing guc firmware file, HW not working, etc. Once triggered, it means all gt usage is dead, so we can't enable pxp under this fatal error condition. v2: Updated commit message. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915/pxp/intel_

[PATCH] drm/i915: Skip pxp init if gt is wedged

2023-10-26 Thread Zhanjun Dong
A wedged gt is a fatal error; skip the pxp init in this situation. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915/pxp/intel_pxp.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c index dc327cf40b5a

[PATCH v5] drm/i915: Avoid circular locking dependency when flush delayed work on gt reset

2023-08-11 Thread Zhanjun Dong
intel_gt_reset called, reset_in_progress flag will be set, add code to check the flag, call async version if reset is in progress. Signed-off-by: Zhanjun Dong Cc: John Harrison Cc: Andi Shyti Cc: Daniel Vetter --- drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 11 ++- 1 file

[PATCH v4] drm/i915: Avoid circular locking dependency when flush delayed work on gt reset

2023-07-27 Thread Zhanjun Dong
asynchronous cancel. v3: Add sync flag to intel_guc_submission_disable to ensure reset path calls asynchronous cancel. v4: Set to always sync from __uc_fini_hw path. Signed-off-by: Zhanjun Dong Cc: John Harrison Cc: Andi Shyti --- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 17 +

[PATCH] drm/i915/mtl: Update cache coherency setting for context structure

2023-07-06 Thread Zhanjun Dong
As context structure is shared memory for CPU/GPU, Wa_22016122933 is needed for this memory block as well. Signed-off-by: Zhanjun Dong CC: Fei Yang --- drivers/gpu/drm/i915/gt/intel_lrc.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt

[PATCH] drm/i915/gt: Remove incorrect hard coded cache coherrency setting

2023-06-22 Thread Zhanjun Dong
The preceding i915_gem_object_create_internal call already sets it to the proper value before returning. This hard-coded setting is incorrect for platforms like MTL and thus needs to be removed. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915/gt/intel_timeline.c | 2 -- 1 file changed, 2

[PATCH] drm/i915/gt: Remove incorrect hard coded cache coherrency setting

2023-06-16 Thread Zhanjun Dong
The preceding i915_gem_object_create_internal call already sets it to the proper value before returning. This hard-coded setting is incorrect for platforms like MTL and thus needs to be removed. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915/gt/intel_timeline.c | 2 -- 1 file changed, 2

[PATCH] Remove incorrect hard coded cache coherrency setting

2023-06-15 Thread Zhanjun Dong
The preceding i915_gem_object_create_internal call already sets it to the proper value before returning. This hard-coded setting is incorrect for platforms like MTL and thus needs to be removed. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915/gt/intel_timeline.c | 2 -- 1 file changed, 2

[PATCH v3] drm/i915: Avoid circular locking dependency when flush delayed work on gt reset

2023-06-15 Thread Zhanjun Dong
lls asynchronous cancel. v3: Add sync flag to intel_guc_submission_disable to ensure reset path calls asynchronous cancel. Signed-off-by: Zhanjun Dong --- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 17 ++--- .../gpu/drm/i915/gt/uc/intel_guc_submission.h | 2 +- drivers/gpu/drm/

[PATCH] drm/i915: Avoid circular locking dependency when flush delayed work on gt reset

2023-06-07 Thread Zhanjun Dong
e+0x64/0xe0 #1: 888136c7eab8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_write_xsigned.constprop.0+0x47/0x110 #2: 88813e6cce90 (&gt->reset.mutex){+.+.}-{3:3}, at: intel_gt_reset+0x19e/0x470 [i915] Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915/gt/uc/intel_guc_

[PATCH] drm/i915: Avoid circular locking dependency when flush delayed work on gt reset

2023-06-05 Thread Zhanjun Dong
30 (sb_writers#15){.+.+}-{0:0}, at: ksys_write+0x64/0xe0 #1: 888136c7eab8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_write_xsigned.constprop.0+0x47/0x110 #2: 88813e6cce90 (&gt->reset.mutex){+.+.}-{3:3}, at: intel_gt_reset+0x19e/0x470 [i915] Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915/

[PATCH] drm/i915/guc: Set wedged if enable guc communication failed

2023-04-26 Thread Zhanjun Dong
Add an error code check for enable_communication on the resume path. When resume fails, we can no longer use the GPU, so mark the GPU as wedged. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915/gt/intel_gt_pm.c | 7 ++- drivers/gpu/drm/i915/gt/intel_reset.c | 19 --- drivers

[PATCH] drm/i915: Set wedged if enable guc communication failed

2023-03-02 Thread Zhanjun Dong
Add an error code check for enable_communication on the resume path. When resume fails, we can no longer use the GPU, so mark the GPU as wedged. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915/gt/intel_gt_pm.c | 7 ++- drivers/gpu/drm/i915/gt/uc/intel_uc.c | 9 +++-- 2 files changed, 13

[PATCH] drm/i915: Set wedged if enable guc communication failed

2023-02-24 Thread Zhanjun Dong
Add an error code check for enable_communication on the resume path; set wedged if it fails. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915/gt/intel_gt_pm.c | 5 - drivers/gpu/drm/i915/gt/uc/intel_uc.c | 9 +++-- 2 files changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/gpu

[PATCH] drm/i915/guc: Check for ct enabled while waiting for response

2022-07-16 Thread Zhanjun Dong
ssage into debug message. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 27 +-- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c index f0

[PATCH] drm/i915/guc: Check for ct enabled while waiting for response

2022-06-16 Thread Zhanjun Dong
ssage into debug message. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 24 --- 1 file changed, 17 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c index f0

[PATCH] drm/i915/guc: Check ctx while waiting for response

2022-06-02 Thread Zhanjun Dong
ssage into debug message. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c index f01325cd1b62..a3