[PATCH] drm/amdgpu: Clean up atom header file inclusion

2025-02-04 Thread Lijo Lazar
atom bios header files are not required in these files. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 1 - drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 1 - drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 1 - drivers/gpu/drm/amd/amdgpu/gfx_v9_4.c| 1 - dr

[PATCH] drm/amdgpu: refine smu send msg debug log format

2025-02-04 Thread Yang Wang
Remove unnecessary line breaks. [ 51.280860] amdgpu :24:00.0: amdgpu: smu send message: GetEnabledSmuFeaturesHigh(13) param: 0x, resp: 0x0001, readval: 0x3763 Fixes: a364c014a2c1 ("drm/amd/pm: enable amdgpu smu send message log") Signed-off-by: Yang

Re: [PATCH 1/3] drm/amd/pm: Add APIs for device access checks

2025-02-04 Thread Xu, Feifei
Series is Reviewed-by: Feifei Xu On 2/4/2025 2:38 PM, Lijo Lazar wrote: Wrap the checks before device access in helper functions and use them for device access. The generic order of APIs now is to do input argument validation first and check if device access is allowed. Signed-off-by: Lijo Laz
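The ordering described in the patch above (validate input arguments first, then check whether device access is allowed) can be sketched roughly as follows. This is only an illustration: struct demo_device, demo_access_allowed() and demo_get_level() are hypothetical names invented for this sketch and are not the helpers added by the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state; not the amdgpu structures. */
struct demo_device {
	bool in_reset;
	bool suspended;
	int level;
};

/* Hypothetical helper wrapping the "is device access allowed" checks. */
static int demo_access_allowed(const struct demo_device *dev)
{
	if (dev->in_reset || dev->suspended)
		return -EPERM;
	return 0;
}

/* Hypothetical API: argument validation first, access check second. */
static int demo_get_level(const struct demo_device *dev, int *out)
{
	int ret;

	/* Step 1: reject bad arguments before touching any device state. */
	if (!dev || !out)
		return -EINVAL;

	/* Step 2: only then ask whether the device may be accessed. */
	ret = demo_access_allowed(dev);
	if (ret)
		return ret;

	*out = dev->level;
	return 0;
}
```

With this ordering, a caller passing a NULL output pointer gets -EINVAL even while the device is in reset, so argument errors are reported consistently regardless of device state.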

Re: [PATCH 1/3] drm/amd/pm: Add APIs for device access checks

2025-02-04 Thread Lazar, Lijo
On 2/4/2025 12:08 PM, Lijo Lazar wrote: > Wrap the checks before device access in helper functions and use them > for device access. The generic order of APIs now is to do input argument > validation first and check if device access is allowed. > > Signed-off-by: Lijo Lazar > --- > drivers/gpu

Re: [PATCH 2/2] drm/amd/pm: add support for IP version 11.5.2

2025-02-04 Thread Lazar, Lijo
On 2/5/2025 8:18 AM, Ying Li wrote: > This initializes drm/amd/pm version 11.5.2 > > Signed-off-by: YING LI > --- > drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 3 +++ > drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 5 - > 2 files changed, 7 insertions(+), 1 deletion(-) > > diff -

Re: [PATCH 2/2] drm/amd/pm: add support for IP version 11.5.2

2025-02-04 Thread Mario Limonciello
On 2/4/2025 20:48, Ying Li wrote: This initializes drm/amd/pm version 11.5.2 Signed-off-by: YING LI Reviewed-by: Mario Limonciello --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 3 +++ drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 5 - 2 files changed, 7 insertions(+), 1 del

Re: [PATCH 1/2] drm/amdgpu: add support for IP version 11.5.2

2025-02-04 Thread Mario Limonciello
On 2/4/2025 20:48, Ying Li wrote: This initializes drm/amdgpu IP version 11.5.2 Signed-off-by: YING LI Reviewed-by: Mario Limonciello --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 3 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c

[PATCH 10/12] drm/amd/display: Support DCN36 HDCP

2025-02-04 Thread Wayne Lin
Add case in hdcp_create_workqueue() to support HDCP on DCN36 as well. Acked-by: Harry Wentland Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c b/driver

[PATCH 12/12] drm/amd/display: Add DCN36 DM Support

2025-02-04 Thread Wayne Lin
Add DM handling for DCN36. Acked-by: Harry Wentland Signed-off-by: Wayne Lin --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdg

[PATCH 09/12] drm/amd/display: Support DCN36 DSC

2025-02-04 Thread Wayne Lin
Add case on clean_up_dsc_blocks() to support DCN36 as well. Acked-by: Harry Wentland Reviewed-by: Martin Leung Signed-off-by: Taimur Hassan Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/dr

[PATCH 06/12] drm/amd/display: Add DCN36 GPIO

2025-02-04 Thread Wayne Lin
Add DCN36 support in GPIO. Acked-by: Harry Wentland Reviewed-by: Martin Leung Signed-off-by: Taimur Hassan Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/dc/gpio/hw_factory.c | 1 + drivers/gpu/drm/amd/display/dc/gpio/hw_translate.c | 1 + 2 files changed, 2 insertions(+) diff --

Re: [PATCH 10/44] drm/amdgpu/vcn: move more instanced data to vcn_instance

2025-02-04 Thread Boyuan Zhang
On 2025-01-31 11:57, Alex Deucher wrote: Move more per instance data into the per instance structure. v2: index instances directly on vcn1.0 and 2.0 to make it clear that they only support a single instance (Lijo) Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c |

[PATCH 05/12] drm/amd/display: Add DCN36 Resource

2025-02-04 Thread Wayne Lin
Add resource handling for DCN36. V2: adjust copyright license text V3: remove unnecessary headers Acked-by: Harry Wentland Reviewed-by: Martin Leung Signed-off-by: Taimur Hassan Signed-off-by: Wayne Lin --- .../gpu/drm/amd/display/dc/resource/Makefile |8 + .../dc/resource/dcn36/dcn36_r

[PATCH 1/2] drm/amdgpu: add support for IP version 11.5.2

2025-02-04 Thread Ying Li
This initializes drm/amdgpu IP version 11.5.2 Signed-off-by: YING LI --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 3 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 1 + drivers/gpu/drm/amd/amdgpu/nv.c | 1 + drivers/g

[PATCH 07/12] drm/amd/display: Add DCN36 DML2 support

2025-02-04 Thread Wayne Lin
Enable DML2 for DCN36. Acked-by: Harry Wentland Reviewed-by: Martin Leung Signed-off-by: Taimur Hassan Signed-off-by: Wayne Lin --- .../gpu/drm/amd/display/dc/dml2/display_mode_core_structs.h | 1 + drivers/gpu/drm/amd/display/dc/dml2/dml2_policy.c | 1 + drivers/gpu/drm/amd/dis

[PATCH 02/12] drm/amd/display: Add DCN36 version identifiers

2025-02-04 Thread Wayne Lin
Add DCN3.6 asic identifiers. Acked-by: Harry Wentland Reviewed-by: Martin Leung Signed-off-by: Taimur Hassan Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/dc/dc_helper.c| 2 ++ drivers/gpu/drm/amd/display/dmub/dmub_srv.h | 1 + drivers/gpu/drm/amd/display/include/dal_

[PATCH 08/12] drm/amd/display: Add DCN36 DMCUB

2025-02-04 Thread Wayne Lin
DMCU-B (Display Micro-Controller Unit B) is a display microcontroller used for shared display functionality with BIOS and for advanced power saving display features. Add case to support DCN3.6 as well. V2: adjust copyright license text Acked-by: Harry Wentland Reviewed-by: Martin Leung Signed-

[PATCH 03/12] drm/amd/display: Add DCN36 BIOS command table support

2025-02-04 Thread Wayne Lin
Add case for DCN36 in command_table_helper2.c. Acked-by: Harry Wentland Reviewed-by: Martin Leung Signed-off-by: Taimur Hassan Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/dc/bios/command_table_helper2.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/displ

[PATCH 04/12] drm/amd/display: Add DCN36 IRQ

2025-02-04 Thread Wayne Lin
Add IRQ services for DCN36. This allows us to create/init and manage irqs for DCN3 V2: adjust copyright license text Acked-by: Harry Wentland Reviewed-by: Martin Leung Signed-off-by: Taimur Hassan Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/dc/irq/Makefile | 9 + .../display

[PATCH 00/12] Patch set to support dcn36

2025-02-04 Thread Wayne Lin
This patchset brings support for dcn36. --- Wayne Lin (12): drm/amd/display: Add dcn36 register header files drm/amd/display: Add DCN36 version identifiers drm/amd/display: Add DCN36 BIOS command table support drm/amd/display: Add DCN36 IRQ drm/amd/display: Add DCN36 Resource drm/amd/

[PATCH 11/12] drm/amd/display: Add DCN36 CORE

2025-02-04 Thread Wayne Lin
Add DCN36 support in dc_resource.c. Acked-by: Harry Wentland Reviewed-by: Martin Leung Signed-off-by: Taimur Hassan Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_r

[PATCH 2/2] drm/amd/pm: add support for IP version 11.5.2

2025-02-04 Thread Ying Li
This initializes drm/amd/pm version 11.5.2 Signed-off-by: YING LI --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 3 +++ drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 5 - 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drive

Re: [PATCH 09/44] drm/amdgpu/vcn: make powergating status per instance

2025-02-04 Thread Boyuan Zhang
On 2025-01-31 11:57, Alex Deucher wrote: Store it per instance so we can track it per instance. v2: index instances directly on vcn1.0 and 2.0 to make it clear that they only support a single instance (Lijo) Signed-off-by: Alex Deucher Reviewed-by: Boyuan Zhang

Re: [PATCH] drm/amd: Refactor find_system_memory()

2025-02-04 Thread Felix Kuehling
On 2025-02-04 17:21, Mario Limonciello wrote: > From: Mario Limonciello > > find_system_memory() pulls out two fields from an SMBIOS type 17 > device and sets them on KFD devices. This however is pulling from > the middle of the field in the SMBIOS device and leads to an unaligned > access. >

[PATCH] drm/amdgpu: Set snoop bit for SDMA for MI series

2025-02-04 Thread Harish Kasiviswanathan
SDMA writes have to probe invalidate RW lines. Set snoop bit in mmhub for this to happen. Signed-off-by: Harish Kasiviswanathan --- drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c | 25 ++ drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 27 +++ drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.

[PATCH] drm/amd: Refactor find_system_memory()

2025-02-04 Thread Mario Limonciello
From: Mario Limonciello find_system_memory() pulls out two fields from an SMBIOS type 17 device and sets them on KFD devices. This however is pulling from the middle of the field in the SMBIOS device and leads to an unaligned access. Instead use a struct representation to access the members and
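The general idea of reading a field through a struct representation rather than through a pointer into the middle of a raw record can be sketched as below. The layout is hypothetical (struct demo_record is not the SMBIOS type 17 layout and is not taken from the patch), and the packed attribute is a GCC/Clang extension; the sketch only shows why copying into a packed struct avoids an unaligned 16-bit load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout; NOT the real SMBIOS type 17 structure. */
struct demo_record {
	uint8_t  type;
	uint8_t  length;
	uint8_t  kind;
	uint16_t width;	/* sits at odd offset 3 inside the record */
} __attribute__((packed));

/* Document the odd offset that makes a direct uint16_t load unsafe. */
static_assert(offsetof(struct demo_record, width) == 3,
	      "width is expected at an odd offset");

/* Copy the raw bytes into a local struct, then read the member. */
static uint16_t demo_read_width(const uint8_t *raw)
{
	struct demo_record rec;

	memcpy(&rec, raw, sizeof(rec));
	return rec.width;
}

static uint8_t demo_read_kind(const uint8_t *raw)
{
	struct demo_record rec;

	memcpy(&rec, raw, sizeof(rec));
	return rec.kind;
}
```

The byte copy works for any alignment of raw, whereas casting raw + 3 to a uint16_t pointer and dereferencing it is undefined behavior in C and can trap on some architectures.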

Re: [PATCH 08/44] drm/amdgpu/vcn: switch work handler to be per instance

2025-02-04 Thread Boyuan Zhang
On 2025-01-31 11:57, Alex Deucher wrote: Have a separate work handler for each VCN instance. This paves the way for per instance VCN power gating at runtime. v2: index instances directly on vcn1.0 and 2.0 to make it clear that they only support a single instance (Lijo) Signed-off-by: Alex Deuc

Re: [PATCH v12 4/5] drm/i915: Use device wedged event

2025-02-04 Thread Tvrtko Ursulin
On 04/02/2025 17:24, Rodrigo Vivi wrote: On Tue, Feb 04, 2025 at 12:35:27PM +0530, Raag Jadav wrote: Now that we have device wedged event provided by DRM core, make use of it and support both driver rebind and bus-reset based recovery. With this in place, userspace will be notified of wedged d

Re: [PATCH v12 3/5] drm/xe: Use device wedged event

2025-02-04 Thread Rodrigo Vivi
On Tue, Feb 04, 2025 at 12:35:26PM +0530, Raag Jadav wrote: > This was previously attempted as xe specific reset uevent but dropped > in commit 77a0d4d1cea2 ("drm/xe/uapi: Remove reset uevent for now") > as part of refactoring. > > Now that we have device wedged event provided by DRM core, make us

Re: [PATCH v12 4/5] drm/i915: Use device wedged event

2025-02-04 Thread Rodrigo Vivi
On Tue, Feb 04, 2025 at 12:35:27PM +0530, Raag Jadav wrote: > Now that we have device wedged event provided by DRM core, make use > of it and support both driver rebind and bus-reset based recovery. > With this in place, userspace will be notified of wedged device on > gt reset failure. > > Signed

Re: Graphical lockups on 7900XTX while using 6.13.1 or 6.13

2025-02-04 Thread Alex Deucher
On Tue, Feb 4, 2025 at 4:17 AM Sean Behan wrote: > > Hi, > > I get a graphical lockup while playing, opening, or closing games on > both 6.13 and 6.13.1. The issue is not present if I switch to 6.12.12. > > A dmesg of the crash is attached. The GPU is hanging. Can you file a bug report: https://

Re: [PATCH 06/44] drm/amdgpu/vcn5.0.0: split code along instances

2025-02-04 Thread Boyuan Zhang
On 2025-01-31 11:57, Alex Deucher wrote: Split the code on a per instance basis. This will allow us to use the per instance functions in the future to handle more things per instance. Signed-off-by: Alex Deucher Reviewed-by: Boyuan Zhang --- drivers/gpu

Re: [PATCH 07/44] drm/amdgpu/vcn5.0.1: split code along instances

2025-02-04 Thread Boyuan Zhang
On 2025-01-31 11:57, Alex Deucher wrote: Split the code on a per instance basis. This will allow us to use the per instance functions in the future to handle more things per instance. Signed-off-by: Alex Deucher Reviewed-by: Boyuan Zhang --- drivers/gpu

Re: [PATCH 05/44] drm/amdgpu/vcn4.0.5: split code along instances

2025-02-04 Thread Boyuan Zhang
On 2025-01-31 11:57, Alex Deucher wrote: Split the code on a per instance basis. This will allow us to use the per instance functions in the future to handle more things per instance. Signed-off-by: Alex Deucher Reviewed-by: Boyuan Zhang --- drivers/gpu

Re: [PATCH 04/44] drm/amdgpu/vcn4.0.3: split code along instances

2025-02-04 Thread Boyuan Zhang
On 2025-01-31 11:57, Alex Deucher wrote: Split the code on a per instance basis. This will allow us to use the per instance functions in the future to handle more things per instance. Signed-off-by: Alex Deucher Reviewed-by: Boyuan Zhang --- drivers/gpu

Re: [PATCH v12 1/5] drm: Introduce device wedged event

2025-02-04 Thread Christian König
Am 04.02.25 um 08:05 schrieb Raag Jadav: Introduce device wedged event, which notifies userspace of 'wedged' (hanged/unusable) state of the DRM device through a uevent. This is useful especially in cases where the device is no longer operating as expected and has become unrecoverable from driver

Re: [PATCH 03/44] drm/amdgpu/vcn4.0: split code along instances

2025-02-04 Thread Boyuan Zhang
On 2025-01-31 11:56, Alex Deucher wrote: Split the code on a per instance basis. This will allow us to use the per instance functions in the future to handle more things per instance. Signed-off-by: Alex Deucher Reviewed-by: Boyuan Zhang --- drivers/gpu

Re: [PATCH 02/44] drm/amdgpu/vcn3.0: split code along instances

2025-02-04 Thread Boyuan Zhang
On 2025-01-31 11:56, Alex Deucher wrote: Split the code on a per instance basis. This will allow us to use the per instance functions in the future to handle more things per instance. Signed-off-by: Alex Deucher Reviewed-by: Boyuan Zhang --- drivers/gpu

Re: [PATCH 01/44] drm/amdgpu/vcn2.5: split code along instances

2025-02-04 Thread Boyuan Zhang
On 2025-01-31 11:56, Alex Deucher wrote: Split the code on a per instance basis. This will allow us to use the per instance functions in the future to handle more things per instance. Signed-off-by: Alex Deucher Reviewed-by: Boyuan Zhang --- drivers/gpu

Re: [PATCH 5/5] drm/amdgpu: rework gfx10 queue reset

2025-02-04 Thread Alex Deucher
On Tue, Feb 4, 2025 at 9:57 AM Christian König wrote: > > Apply the same changes to gfx10 as done to gfx9. > > The general idea to reset the whole kernel queue and then asking the kiq > to map it again didn't worked at all. Background is that we don't use per > application kernel queues for gfx10

Re: [PATCH 2/5] drm/amdgpu: rework gfx9 queue reset

2025-02-04 Thread Alex Deucher
On Tue, Feb 4, 2025 at 9:48 AM Christian König wrote: > > Testing this feature turned out that it was a bit unstable. The > CP_VMID_RESET register takes the VMID which all submissions from should > be canceled. > > Unlike Windows Linux uses per process VMIDs instead of per engine VMIDs > for the s

Re: [PATCH 5/5] drm/amdgpu: rework gfx10 queue reset

2025-02-04 Thread Alex Deucher
On Tue, Feb 4, 2025 at 9:57 AM Christian König wrote: > > Apply the same changes to gfx10 as done to gfx9. > > The general idea to reset the whole kernel queue and then asking the kiq > to map it again didn't worked at all. Background is that we don't use per > application kernel queues for gfx10

Re: [PATCH 4/5] drm/amdgpu: rework gfx8 queue reset

2025-02-04 Thread Alex Deucher
On Tue, Feb 4, 2025 at 9:48 AM Christian König wrote: > > Apply the same changes to gfx8 as done to gfx9. > > Untested and probably needs some more work. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 86 --- > 1 file changed, 38 insert

Re: [PATCH 3/5] drm/amdgpu: rework gfx7 queue reset

2025-02-04 Thread Alex Deucher
On Tue, Feb 4, 2025 at 9:48 AM Christian König wrote: > > Apply the same changes to gfx7 as done to gfx9. > > Untested and probably needs some more work. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 89 --- > 1 file changed, 39 insert

[PATCH 5/5] drm/amdgpu: rework gfx10 queue reset

2025-02-04 Thread Christian König
Apply the same changes to gfx10 as done to gfx9. The general idea of resetting the whole kernel queue and then asking the KIQ to map it again didn't work at all. The background is that we don't use per-application kernel queues for gfx10 on Linux for performance reasons. So instead use the gfx9 approac

[PATCH 3/5] drm/amdgpu: rework gfx7 queue reset

2025-02-04 Thread Christian König
Apply the same changes to gfx7 as done to gfx9. Untested and probably needs some more work. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 89 --- 1 file changed, 39 insertions(+), 50 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v

Rework and fix queue reset for gfx7-gfx10

2025-02-04 Thread Christian König
Hi guys, I finally found time to work on queue reset a bit more and also gave it some more testing. The per-VMID reset actually seems to work 100% reliably, at least on gfx9. What could still happen is that an application is using multiple VMIDs on the graphics ring or that we re-use the same VMID fo

[PATCH 4/5] drm/amdgpu: rework gfx8 queue reset

2025-02-04 Thread Christian König
Apply the same changes to gfx8 as done to gfx9. Untested and probably needs some more work. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 86 --- 1 file changed, 38 insertions(+), 48 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v

[PATCH 2/5] drm/amdgpu: rework gfx9 queue reset

2025-02-04 Thread Christian König
Testing this feature showed that it was a bit unstable. The CP_VMID_RESET register takes the VMID whose submissions should all be canceled. Unlike Windows, Linux uses per-process VMIDs instead of per-engine VMIDs for the simple reason that we don't have enough. So resetting one VMID only k

[PATCH 1/5] drm/amdgpu: rework queue reset scheduler interaction

2025-02-04 Thread Christian König
Stopping the scheduler for queue reset is generally a good idea because it prevents any worker from touching the ring buffer. But using amdgpu_fence_driver_force_completion() before restarting it was a really bad idea because it marked fences as failed while the work was potentially still running.

Re: [PATCH v3 0/3] drm/amdgpu: Explicit sync for GEM VA operations

2025-02-04 Thread Alex Deucher
On Tue, Feb 4, 2025 at 8:37 AM Christian König wrote: > > Hi Friedrich, > > adding Alex. > > Am 04.02.25 um 13:32 schrieb Friedrich Vock: > > Hi, > > > > On 19.08.24 13:21, Christian König wrote: > >> Am 19.08.24 um 09:21 schrieb Friedrich Vock: > >>> In Vulkan, it is the application's responsibil

Re: [PATCH v3 0/3] drm/amdgpu: Explicit sync for GEM VA operations

2025-02-04 Thread Christian König
Hi Friedrich, adding Alex. Am 04.02.25 um 13:32 schrieb Friedrich Vock: Hi, On 19.08.24 13:21, Christian König wrote: Am 19.08.24 um 09:21 schrieb Friedrich Vock: In Vulkan, it is the application's responsibility to perform adequate synchronization before a sparse unmap, replace or BO destro

Re: [PATCH v3 0/3] drm/amdgpu: Explicit sync for GEM VA operations

2025-02-04 Thread Friedrich Vock
Hi, On 19.08.24 13:21, Christian König wrote: Am 19.08.24 um 09:21 schrieb Friedrich Vock: In Vulkan, it is the application's responsibility to perform adequate synchronization before a sparse unmap, replace or BO destroy operation. This adds an option to AMDGPU_VA_OPs to disable redundant impl

Re: [PATCH 1/6] drm/amdgpu: grab an additional reference on the gang fence

2025-02-04 Thread Michel Dänzer
On 2025-02-03 12:58, Christian König wrote: > We keep the gang submission fence around in adev, make sure that it > stays alive. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/amd/a

[PATCH v12 4/5] drm/i915: Use device wedged event

2025-02-04 Thread Raag Jadav
Now that we have device wedged event provided by DRM core, make use of it and support both driver rebind and bus-reset based recovery. With this in place, userspace will be notified of wedged device on gt reset failure. Signed-off-by: Raag Jadav Reviewed-by: Aravind Iddamsetty --- drivers/gpu/d

[PATCH v12 5/5] drm/amdgpu: Use device wedged event

2025-02-04 Thread Raag Jadav
From: André Almeida Use DRM's device wedged event to notify userspace that a reset had happened. For now, only use `none` method meant for telemetry capture. In the future we might want to report a recovery method if the reset didn't succeed. Acked-by: Shashank Sharma Signed-off-by: André Alme

[PATCH v12 1/5] drm: Introduce device wedged event

2025-02-04 Thread Raag Jadav
Introduce device wedged event, which notifies userspace of 'wedged' (hanged/unusable) state of the DRM device through a uevent. This is useful especially in cases where the device is no longer operating as expected and has become unrecoverable from driver context. Purpose of this implementation is
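A userspace consumer of such a uevent might decode the advertised recovery methods roughly as sketched below. This assumes the event carries a property of the form "WEDGED=<method>[,<method>...]" with the method names none, rebind and bus-reset; the property format is an assumption not shown in the preview above, and demo_parse_wedged() and the DEMO_WEDGED_* flags are hypothetical names for this sketch, not part of any kernel or library API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bit flags for the recovery methods named in the series. */
enum {
	DEMO_WEDGED_NONE      = 1 << 0,
	DEMO_WEDGED_REBIND    = 1 << 1,
	DEMO_WEDGED_BUS_RESET = 1 << 2,
};

/*
 * Parse one uevent property string.
 * Returns a bitmask of DEMO_WEDGED_* flags, or -1 if the string is not
 * a WEDGED= property or names a method this sketch does not know.
 */
static int demo_parse_wedged(const char *prop)
{
	static const char prefix[] = "WEDGED=";
	const char *p;
	int mask = 0;

	assert(prop != NULL);

	if (strncmp(prop, prefix, sizeof(prefix) - 1) != 0)
		return -1;

	p = prop + sizeof(prefix) - 1;
	while (*p) {
		size_t len = strcspn(p, ",");	/* length of this method name */

		if (len == 4 && strncmp(p, "none", len) == 0)
			mask |= DEMO_WEDGED_NONE;
		else if (len == 6 && strncmp(p, "rebind", len) == 0)
			mask |= DEMO_WEDGED_REBIND;
		else if (len == 9 && strncmp(p, "bus-reset", len) == 0)
			mask |= DEMO_WEDGED_BUS_RESET;
		else
			return -1;

		p += len;
		if (*p == ',')
			p++;	/* skip the separator */
	}
	return mask;
}
```

A consumer would typically receive the property from a udev or netlink uevent listener and then choose the least disruptive method it supports; how the string is obtained is outside the scope of this sketch.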

[PATCH v12 2/5] drm/doc: Document device wedged event

2025-02-04 Thread Raag Jadav
Add documentation for device wedged event in a new "Device wedging" chapter. This describes basic definitions, prerequisites and consumer expectations along with an example. v8: Improve introduction (Christian, Rodrigo) v9: Add prerequisites section (Christian) v10: Clarify mmap cleanup and cons

[PATCH v12 0/5] Introduce DRM device wedged event

2025-02-04 Thread Raag Jadav
This series introduces device wedged event in DRM subsystem and uses it in xe, i915 and amdgpu drivers. Detailed description in commit message. This was earlier attempted as xe specific uevent in v1 and v2 on [1]. Similar work by André Almeida on [2]. Wedged event support for amdgpu by André Almei

[PATCH v12 3/5] drm/xe: Use device wedged event

2025-02-04 Thread Raag Jadav
This was previously attempted as xe specific reset uevent but dropped in commit 77a0d4d1cea2 ("drm/xe/uapi: Remove reset uevent for now") as part of refactoring. Now that we have device wedged event provided by DRM core, make use of it and support both driver rebind and bus-reset based recovery. W